Description
The CMS Submission Infrastructure is the main computing resource provisioning system for CMS workflows, including data processing, simulation and analysis. It currently aggregates nearly 400k CPU cores distributed worldwide from Grid, HPC and cloud providers. CMS Tier-0 tasks critical for data-taking operations, such as data repacking and prompt reconstruction, are executed on a collection of computing resources at CERN, also managed by the CMS Submission Infrastructure.
All this computing power is harnessed via a number of federated resource pools supervised by HTCondor and GlideinWMS services. Elements such as pilot factories, job schedulers and connection brokers are deployed in high-availability (HA) mode across several “availability zones”, providing stability to our services through hardware redundancy and numerous failover mechanisms.
With the start of LHC Run 3 approaching, the stability of the Submission Infrastructure was recently tested in a series of controlled exercises, performed without interrupting our services. These tests demonstrated the resilience of our systems and additionally provided useful input for further refining our monitoring and alarming system.
This contribution will describe the main elements of the CMS Submission Infrastructure design and deployment, along with the failover exercises performed, demonstrating that our systems are ready to serve their critical role in support of CMS activities.
Significance
The CMS Submission Infrastructure (SI) plays a critical role in the ability of the experiment's Tier-0 to record and process collision data. This presentation will cover how the SI has been designed and deployed to avoid single points of failure, along with the tests performed to verify its resilience and stability.
Experiment context, if any
The CMS experiment at the LHC at CERN