25–29 May 2026
Chulalongkorn University
Asia/Bangkok timezone

Enhancing the use of the Calibration and Conditions Database in ALICE Grid jobs

28 May 2026, 17:09
18m
Chulalongkorn University

Oral Presentation
Track 1 - Data and metadata organization, management and access

Speaker

Martin Øines Eide (Western Norway University of Applied Sciences (NO))

Description

Authors:
- Martin Øines Eide, Western Norway University of Applied Sciences,
University of Bergen, Bergen, Norway and European Organization for
Nuclear Research (CERN), Geneva, Switzerland
- Costin Grigoras, European Organization for Nuclear Research (CERN), Geneva,
Switzerland
on behalf of the ALICE collaboration


The ALICE experiment at CERN relies on a central service known as the Calibration and Conditions Database (CCDB). This service acts as a single, uniform source of the data essential for online and offline reconstruction, analysis, and other crucial tasks within the experiment. The CCDB is fully operational and has successfully handled a heavy workload, serving thousands of requests per second across the online and the distributed offline Grid environments. Because the CCDB service is centralized while ALICE Grid jobs execute in a distributed fashion, connectivity is a significant concern: jobs occasionally encounter connectivity issues when attempting to access the CCDB. Furthermore, redundant lookups, where multiple jobs or even the same job repeatedly request identical pieces of calibration or conditions data, impose an unnecessary load on the central service. To mitigate these operational challenges, the ALICE team is actively investigating and implementing a caching solution.
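To illustrate why redundant lookups are a natural target for caching, the following is a minimal sketch of a read-through cache. It is not the ALICE caching solution; the lookup key (an object path plus a timestamp), the class name `ReadThroughCache`, and the `fetch` callable standing in for a request to the central service are all assumptions made for illustration.

```python
from typing import Callable, Dict, Tuple

# Hypothetical lookup key: (object path, timestamp). The real CCDB key may differ.
Key = Tuple[str, int]


class ReadThroughCache:
    """Answers repeated identical lookups locally instead of asking the central service."""

    def __init__(self, fetch: Callable[[str, int], bytes]) -> None:
        self._fetch = fetch  # stand-in for a request to the central service
        self._store: Dict[Key, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, path: str, timestamp: int) -> bytes:
        key = (path, timestamp)
        if key in self._store:
            # Identical lookup seen before: no request leaves the cache.
            self.hits += 1
            return self._store[key]
        # First time this key is requested: fetch once and remember the result.
        self.misses += 1
        value = self._fetch(path, timestamp)
        self._store[key] = value
        return value
```

In this sketch only the first request for a given key reaches the central service, and the `hits` and `misses` counters give a rough measure of how much redundant traffic a cache of this kind would absorb. Eviction, object validity, and sharing between jobs are deliberately left out.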

This work details the specific technical improvements made to the CCDB usage tracking and analysis mechanisms, which were necessary to properly characterize the service's workload and optimize the caching strategy. In particular, to maintain the reliability and responsiveness of the CCDB under heavy Grid job traffic, rigorous connection monitoring was implemented. It tracks key network and database metrics, such as the latency for establishing a connection, the duration of open connections, and the frequency of connection timeouts or failures experienced by distributed Grid jobs. By closely monitoring these parameters, the system can identify and flag specific regions or job types prone to connectivity issues, allowing for targeted network or service adjustments. This detailed monitoring directly informs the assessment of database performance and the choice of the caching solution itself, along with its architecture and its integration into the ALICE Grid middleware, JAliEn.
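The kind of analysis described above can be sketched as a per-site aggregation over connection records. This is an illustrative assumption about how such data might be summarized, not the monitoring implemented for the CCDB: the `ConnectionRecord` fields, the `summarize` and `flag_sites` functions, and the 5% failure-rate threshold are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import median
from typing import Dict, Iterable, List, Optional


@dataclass
class ConnectionRecord:
    site: str                  # hypothetical Grid site or region label
    connect_latency_ms: float  # time taken to establish the connection
    open_duration_ms: float    # how long the connection stayed open
    failed: bool               # True for a timeout or other connection failure


def summarize(records: Iterable[ConnectionRecord]) -> Dict[str, Dict[str, Optional[float]]]:
    """Group records by site and compute failure rate and median timings."""
    by_site: Dict[str, List[ConnectionRecord]] = defaultdict(list)
    for record in records:
        by_site[record.site].append(record)

    summary: Dict[str, Dict[str, Optional[float]]] = {}
    for site, recs in by_site.items():
        ok = [r for r in recs if not r.failed]
        summary[site] = {
            "requests": float(len(recs)),
            "failure_rate": sum(r.failed for r in recs) / len(recs),
            # Medians are taken over successful connections only.
            "median_connect_latency_ms": median(r.connect_latency_ms for r in ok) if ok else None,
            "median_open_duration_ms": median(r.open_duration_ms for r in ok) if ok else None,
        }
    return summary


def flag_sites(summary: Dict[str, Dict[str, Optional[float]]],
               max_failure_rate: float = 0.05) -> List[str]:
    """Return the sites whose failure rate exceeds the (assumed) threshold."""
    return sorted(site for site, stats in summary.items()
                  if stats["failure_rate"] > max_failure_rate)
```

Grouping by job type instead of site would follow the same pattern with a different grouping key. Medians are used here because a few very slow connections would otherwise dominate an average.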

Author

Martin Øines Eide (Western Norway University of Applied Sciences (NO))

Co-author
