Conditions data (for example, alignment, calibration, and data quality) are used extensively in the processing of real and simulated data in ATLAS. The volume and variety of the conditions data needed by different types of processing vary widely, so optimizing access to them requires a careful understanding of conditions usage patterns. These patterns can be quantified by mining representative log files from each type of processing and gathering detailed conditions usage information for each type into a central repository.
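As a rough illustration of the log-mining idea described above, the following minimal sketch aggregates conditions usage per job type from a set of job logs. The log line format (`COND_ACCESS folder=... tag=...`), the folder and tag names, and the function names `mine_log` and `collect_usage` are all hypothetical and chosen only for this example; they are not the actual ATLAS log format or tooling.

```python
import re
from collections import defaultdict

# Hypothetical log line format, assumed for illustration only.
LINE_RE = re.compile(r"COND_ACCESS folder=(?P<folder>\S+) tag=(?P<tag>\S+)")


def mine_log(lines):
    """Return the set of (folder, tag) pairs seen in one job log."""
    used = set()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            used.add((match.group("folder"), match.group("tag")))
    return used


def collect_usage(logs_by_job_type):
    """Aggregate usage as {job_type: {(folder, tag): number_of_jobs}}."""
    repo = defaultdict(lambda: defaultdict(int))
    for job_type, logs in logs_by_job_type.items():
        for lines in logs:
            for key in mine_log(lines):
                repo[job_type][key] += 1
    return {job_type: dict(counts) for job_type, counts in repo.items()}


# Made-up sample logs: two reconstruction jobs and one simulation job.
sample_logs = {
    "reco": [
        [
            "INFO COND_ACCESS folder=/PIXEL/Align tag=PixAlign-01",
            "INFO COND_ACCESS folder=/LAR/Calib tag=LArCalib-02",
            "INFO unrelated message",
        ],
        ["INFO COND_ACCESS folder=/PIXEL/Align tag=PixAlign-01"],
    ],
    "simul": [
        ["INFO COND_ACCESS folder=/PIXEL/Align tag=PixAlign-SIM"],
    ],
}

usage = collect_usage(sample_logs)
```

Counting the number of jobs that touched each folder and tag, rather than the number of log lines, is one possible design choice: it makes the per-job-type summary insensitive to how often a single job repeats the same access.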
In this presentation, we describe the systems developed to collect this conditions usage metadata per job type and present a few specific (but very different) ways in which it has been used. For example, it can be used to cull the specific conditions data needed into a much more compact package for jobs doing similar types of processing; these customized collections can then be shipped with jobs to be executed on isolated worker nodes (such as HPC farms) that have no network access to conditions. Another use is in the design of future ATLAS software: the metadata gives Run 3 software developers essential information about the nature of the conditions currently accessed by software. This helps to optimize internal handling of conditions data, minimizing their memory footprint while facilitating access to the data by the sub-processes that need them.
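The culling use case can be sketched in a few lines, assuming the full conditions store and the usage metadata are both available as simple in-memory mappings. The names `cull_conditions`, `full_store`, and `needed_keys`, as well as the folder, tag, and payload values, are hypothetical; a real package would be built from the actual conditions database rather than a Python dictionary.

```python
def cull_conditions(full_store, needed_keys):
    """Select only the conditions a job type needs.

    full_store:  {(folder, tag): payload} for all available conditions.
    needed_keys: iterable of (folder, tag) pairs taken from usage metadata.
    Returns (subset, missing), where missing lists requested keys that
    are absent from full_store.
    """
    needed = set(needed_keys)
    subset = {key: full_store[key] for key in needed if key in full_store}
    missing = sorted(key for key in needed if key not in full_store)
    return subset, missing


# Made-up store with three entries; the job type only needs one that
# exists, plus one that the store does not contain.
full_store = {
    ("/PIXEL/Align", "PixAlign-01"): b"alignment-payload",
    ("/LAR/Calib", "LArCalib-02"): b"calibration-payload",
    ("/TILE/DQ", "TileDQ-03"): b"data-quality-payload",
}
needed_keys = [("/PIXEL/Align", "PixAlign-01"), ("/MUON/Align", "MuAlign-09")]

package, missing = cull_conditions(full_store, needed_keys)
```

Reporting the missing keys separately, instead of silently skipping them, matters for jobs on nodes without network access, since a gap in the package cannot be filled at run time.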
|Primary Keyword (Mandatory)||Databases|
|Secondary Keyword (Optional)||Distributed data handling|
|Tertiary Keyword (Optional)||Software development process and tools|