The aim of this presentation is to give an overview of the foreseen database needs related to accelerator operation in the coming years.
The ATLAS Distributed Computing (ADC) project delivers production tools and services for ATLAS offline activities such as data placement and data processing on the Grid. The system has sustained the required computing activities with high efficiency during the first run of LHC data taking and in the ongoing second run.
Databases are a vital part of the whole ADC system. The Oracle...
In the future of ATLAS, the event should be the atomic unit of information for metadata. Events could come either from data or from Monte Carlo, and different representations of an event would be available to reflect the processing stage under consideration, e.g. RAW, Analysis Object Data (AOD), or derived AOD. The metadata should carry the provenance information of the event, as well as the logical...
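As an illustration of what such event-level metadata could look like, the sketch below models an event record carrying its available representations and provenance. All class and field names are assumptions made for illustration, not an existing ATLAS schema.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical sketch of an event-level metadata record; names and fields
# are illustrative assumptions, not an existing ATLAS interface.
@dataclass
class EventRecord:
    run_number: int            # run (or Monte Carlo dataset) the event belongs to
    event_number: int          # event identifier within the run
    is_simulation: bool        # data or Monte Carlo origin
    representations: List[str] = field(default_factory=list)  # e.g. "RAW", "AOD", "DAOD"
    provenance: List[str] = field(default_factory=list)       # processing steps that produced them

# Example: an event available as RAW and AOD, with its processing chain recorded.
evt = EventRecord(
    run_number=358031,
    event_number=1234567,
    is_simulation=False,
    representations=["RAW", "AOD"],
    provenance=["SFO merge", "Tier-0 reconstruction"],
)
print(evt.representations)
```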
Conditions data are, in general, non-event data that vary with time. A particular subset is critical for physics data processing (detector status and configuration, run information, detector calibration and alignment, …), while another part is used for monitoring of the detector.
ATLAS has been using the COOL framework as a generic abstraction layer for dealing with conditions data during Run 1...
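The central concept behind a COOL-style conditions store is the interval of validity (IOV): each payload is valid for a time (or run/lumi-block) range, and a lookup returns the payload whose interval contains the requested point. The following is a minimal sketch of that lookup logic; all names are illustrative assumptions, not the actual COOL API.

```python
import bisect
from typing import Any, List, Tuple

# Minimal sketch of interval-of-validity (IOV) lookup, the concept underlying
# COOL-style conditions storage. Names are illustrative, not COOL's API.
class ConditionsFolder:
    def __init__(self) -> None:
        # Each entry is (since, until, payload); intervals kept sorted and non-overlapping.
        self._iovs: List[Tuple[int, int, Any]] = []

    def store(self, since: int, until: int, payload: Any) -> None:
        """Register a payload valid for [since, until)."""
        self._iovs.append((since, until, payload))
        self._iovs.sort(key=lambda iov: iov[0])

    def retrieve(self, point: int) -> Any:
        """Return the payload whose interval of validity contains 'point'."""
        starts = [since for since, _, _ in self._iovs]
        idx = bisect.bisect_right(starts, point) - 1
        if idx >= 0 and self._iovs[idx][1] > point:
            return self._iovs[idx][2]
        raise KeyError(f"no conditions payload valid at {point}")

# Example: two calibration payloads covering consecutive run ranges.
folder = ConditionsFolder()
folder.store(300000, 310000, {"pedestal": 1.02})
folder.store(310000, 320000, {"pedestal": 1.05})
print(folder.retrieve(305123))  # -> {'pedestal': 1.02}
```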
"The CMS experiment relies on Relational Databases to store essential data for the most important production operations. Several subsystems critical for data taking, data processing and daily operation have been designed and optimised for a Relational storage. Their variety in terms of architecture and complexity, and the specific needs of the experiment organisation has required the...
The conditions data infrastructures of both ATLAS and CMS have to deal with the management of several terabytes of data. Distributed computing access to these data requires particular care and attention to manage request rates of up to several tens of kHz. Thanks to the large overlap in use cases and requirements, ATLAS and CMS have worked towards a common solution for conditions data management...
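A common pattern for scaling reads to tens of kHz is to place a read-only interface (typically REST over HTTP) in front of the database and let caches absorb the repeated requests. The sketch below shows such a client-side read with a local memoisation layer standing in for the caching tier; the base URL, endpoint layout, and payload structure are assumptions for illustration, not the interface of the actual joint ATLAS/CMS project.

```python
import functools
import json
import urllib.request

# Hypothetical base URL of a REST conditions service; the endpoint layout is an
# illustrative assumption, not the interface of the ATLAS/CMS common project.
BASE_URL = "https://conditions.example.org/api"

@functools.lru_cache(maxsize=4096)
def get_payload(tag: str, since: int) -> dict:
    """Fetch the conditions payload for a tag at a given IOV start.

    The lru_cache decorator plays the role of a local cache layer; in a real
    deployment the same effect is obtained with HTTP caches placed between
    the clients and the central service, keeping the database request rate low.
    """
    url = f"{BASE_URL}/payloads?tag={tag}&since={since}"
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))

# Repeated reads of the same (tag, since) pair are served from the cache.
```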
Relational databases are the critical backend storage for many systems in ATLAS (online, offline, and on the Grid), storing essential data for the processing of past and current data as well as supporting daily operations. These systems have been refined over time into robust applications optimized and provisioned for established use cases. Relational storage is well suited for many of these...
The CERN Accelerator Logging Service (CALS) was designed in 2001, and has been in production for 14 years. It is a mission-critical service for the operation of the LHC (Large Hadron Collider).
CALS uses an Oracle database for the storage of technical accelerator data and persists approximately 0.75 petabytes of data coming from more than 1.5 million pre-defined signals. These signals represent data...
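The data model behind such a service is essentially one time series per signal, and the typical read is an extraction of one signal over a time window. The sketch below illustrates that access pattern with an in-memory stand-in; the class, method, and signal names are hypothetical and do not reflect the actual CALS API.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

# Hypothetical in-memory stand-in for a logging-service signal: a name plus a
# time-ordered list of (timestamp, value) samples. Illustrative only, not CALS.
class LoggedSignal:
    def __init__(self, name: str, samples: List[Tuple[datetime, float]]) -> None:
        self.name = name
        self.samples = sorted(samples)

    def get_window(self, start: datetime, end: datetime) -> List[Tuple[datetime, float]]:
        """Return all samples with start <= timestamp < end."""
        return [(t, v) for t, v in self.samples if start <= t < end]

# Example: query one hour of a fictitious beam-intensity signal sampled every 10 minutes.
t0 = datetime(2016, 5, 1, 12, 0, 0)
signal = LoggedSignal(
    "EXAMPLE.B1:BEAM_INTENSITY",
    [(t0 + timedelta(minutes=m), 1.0e14 - m * 1.0e10) for m in range(0, 180, 10)],
)
print(len(signal.get_window(t0, t0 + timedelta(hours=1))))  # -> 6 samples
```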
Experimental Particle Physics has been at the forefront of analyzing the world's largest datasets for decades. The HEP community was among the first to develop suitable software and computing tools for this task. In recent times, new toolkits and systems for distributed data processing, collectively called "Big Data" technologies, have emerged from industry and open-source projects to support...
The Post Mortem system was designed almost a decade ago to enable the collection and analysis of high-resolution, transient data recordings of relevant events, such as beam dumps in the LHC accelerator. Since then, the storage has been constantly evolving, both to accommodate larger datasets and to satisfy new requirements and use cases, not only for the LHC but also for the first machines in the injector complex....
The ATLAS EventIndex was designed during LS1 to satisfy a small but important number of use cases, primarily event picking. Its contents and storage architecture were tailored to the primary use cases, favouring performance and robustness over the possibility of expanding its scope. The EventIndex has been in operation since the start of Run 2 and shows satisfactory performance for event picking and...
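Event picking reduces, at its core, to a key-value lookup: given a run and event number, return the identifier of the file containing that event so it can be retrieved. The minimal sketch below shows that lookup; the names are illustrative assumptions, not the actual EventIndex interface.

```python
from typing import Dict, Optional, Tuple

# Minimal sketch of the event-picking use case: map (run, event) to the GUID of
# the file holding that event. Names are illustrative, not the EventIndex API.
class EventCatalogue:
    def __init__(self) -> None:
        self._index: Dict[Tuple[int, int], str] = {}

    def add(self, run: int, event: int, file_guid: str) -> None:
        self._index[(run, event)] = file_guid

    def pick(self, run: int, event: int) -> Optional[str]:
        """Return the GUID of the file containing the requested event, if indexed."""
        return self._index.get((run, event))

catalogue = EventCatalogue()
catalogue.add(358031, 1234567, "A1B2C3D4-0000-0000-0000-000000000000")
print(catalogue.pick(358031, 1234567))
```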
The ATLAS Analytics effort is focused on creating systems which provide ATLAS Distributed Computing (ADC) with new capabilities for understanding distributed systems and overall operational performance. These capabilities include correlating information from multiple systems (PanDA, Rucio, FTS, Dashboards, Tier0, PilotFactory, ...), predictive analytics to execute arbitrary data mining or...
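As a toy illustration of what correlating information from multiple systems means in practice, the sketch below joins job records from two hypothetical sources on a shared task identifier. The field names and record layouts are assumptions for illustration, not the actual PanDA or Rucio schemas.

```python
# Toy illustration of correlating records from two monitoring sources on a
# shared key. Field names are illustrative assumptions, not PanDA/Rucio schemas.
panda_jobs = [
    {"taskid": 101, "status": "finished", "cpu_hours": 12.5},
    {"taskid": 102, "status": "failed", "cpu_hours": 0.7},
]
rucio_transfers = [
    {"taskid": 101, "bytes_transferred": 4.2e9},
    {"taskid": 102, "bytes_transferred": 1.1e8},
]

# Index one source by the join key, then merge each job with its transfer record.
transfers_by_task = {t["taskid"]: t for t in rucio_transfers}
correlated = [{**job, **transfers_by_task.get(job["taskid"], {})} for job in panda_jobs]
for row in correlated:
    print(row)
```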
We introduce a first working implementation of a distributed object store, along with a network cache, for the distribution of information in the ATLAS Trigger and Data Acquisition (TDAQ) system, primarily during online system configuration. The TDAQ system of the ATLAS detector at the Large Hadron Collider at CERN is a large distributed system comprising a few tens of thousands of processes and servers...
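The essential idea is a two-level read path: each client first asks a nearby cache and only on a miss falls back to the backing object store, so that tens of thousands of processes configuring simultaneously do not all hit the central store. Below is a hedged sketch of that read path; all class and method names are hypothetical, not the TDAQ implementation.

```python
from typing import Dict, Optional

# Hedged sketch of a cache-in-front-of-object-store read path, as used when many
# processes fetch the same configuration objects at startup. Names are illustrative.
class ObjectStore:
    def __init__(self, objects: Dict[str, bytes]) -> None:
        self._objects = objects

    def get(self, key: str) -> Optional[bytes]:
        return self._objects.get(key)

class CachingClient:
    def __init__(self, store: ObjectStore) -> None:
        self._store = store
        self._cache: Dict[str, bytes] = {}

    def get(self, key: str) -> Optional[bytes]:
        """Serve from the cache when possible, otherwise fetch from the store."""
        if key not in self._cache:
            value = self._store.get(key)
            if value is not None:
                self._cache[key] = value
            return value
        return self._cache[key]

store = ObjectStore({"trigger/menu/v42": b"...serialized configuration..."})
client = CachingClient(store)
client.get("trigger/menu/v42")  # first read: cache miss, fetched from the store
client.get("trigger/menu/v42")  # second read: served from the local cache
```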
"The CMS experiment relies on Relational Databases to store essential data for the most important production operations. Several subsystems critical for data taking, data processing and daily operation have been designed and optimised for a Relational storage. Their variety in terms of architecture and complexity, and the specific needs of the experiment organisation has required the...