19–25 Oct 2024
Europe/Zurich timezone

Recent Experience with the CMS Data Management System

23 Oct 2024, 13:30
18m
Room 1.B (Medium Hall B)
Talk Track 1 - Data and Metadata Organization, Management and Access Parallel (Track 1)

Speaker

Hasan Ozturk (CERN)

Description

The CMS experiment manages a large-scale data infrastructure, currently handling over 200 PB of disk and 500 PB of tape storage and transferring more than 1 PB of data per day on average between various WLCG sites. Utilizing Rucio for high-level data management, FTS for data transfers, and a variety of storage and network technologies at the sites, CMS confronts inevitable challenges due to the system's growing scale and evolving nature. Key challenges include managing transfer and storage failures, optimizing data distribution across different storage systems based on production and analysis needs, implementing necessary technology upgrades and migrations, and efficiently handling user requests. The data management team has established comprehensive monitoring to supervise this system and has successfully addressed many of these challenges. The team's efforts aim to ensure data availability and protection, minimize failures and manual interventions, maximize transfer throughput and resource utilization, and provide reliable user support. This paper details the operational experience of CMS with its data management system in recent years, focusing on the challenges encountered, the effective strategies employed to overcome them, and the ongoing challenges as we prepare for future demands.
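For a sense of scale, the quoted average of more than 1 PB transferred per day implies a sustained rate of roughly 11.6 GB/s. A minimal back-of-envelope sketch (assuming decimal SI units, 1 PB = 10^15 bytes):

```python
# Back-of-envelope conversion: average daily transfer volume (PB/day)
# to a sustained throughput (GB/s). Assumes decimal SI units.
SECONDS_PER_DAY = 24 * 60 * 60  # 86400

def sustained_rate_gb_per_s(pb_per_day: float) -> float:
    """Convert an average daily volume in PB to a sustained rate in GB/s."""
    bytes_per_day = pb_per_day * 1e15
    return bytes_per_day / SECONDS_PER_DAY / 1e9

print(f"{sustained_rate_gb_per_s(1.0):.1f} GB/s")  # ~11.6 GB/s sustained
```

The actual instantaneous rates are bursty and site-dependent; this only illustrates the average load the transfer infrastructure must sustain.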

Primary authors

Andres Manrique Ardila (University of Wisconsin Madison (US)), Andrew Wightman (University of Nebraska Lincoln (US)), Christos Emmanouil (CERN), Dmytro Kovalskyi (Massachusetts Inst. of Technology (US)), Eric Vaandering (Fermi National Accelerator Lab. (US)), Hasan Ozturk (CERN), Panos Paparrigopoulos (CERN), Rahul Chauhan (CERN)

Presentation materials