More than one thousand physicists analyse data collected by the ATLAS experiment at the Large Hadron Collider (LHC) at CERN, using 150 computing facilities around the world. Efficient distributed analysis requires optimal resource usage and the interplay of several factors: robust grid and software infrastructures, and a system able to adapt to different workloads. The continuous automatic validation of grid sites and the user support provided by a dedicated team of expert shifters have proven to provide a solid distributed analysis system for ATLAS users.
Based on the experience from the first run of the LHC, substantial improvements to the ATLAS computing system have been made to optimise both production and analysis workflows. These include the re-design of the production and data management systems, a new analysis data format and event model, and the development of common reduction and analysis frameworks. The impact of these changes on the distributed analysis system is evaluated. More than 100 million user jobs from the period 2015-2016 are analysed for the first time with analytics tools such as Elasticsearch. Typical user workflows and their associated metrics are studied, and the improvement in the usage of distributed resources due to the common analysis data format and the reduction framework is assessed. Measurements of user job performance and typical requirements are also shown.
|Primary Keyword (Mandatory)|Computing models|
|Secondary Keyword (Optional)|Data processing workflows and frameworks/pipelines|
|Tertiary Keyword (Optional)|Distributed data handling|