Development of a Next Generation Concurrent Framework for the ATLAS Experiment

14 Apr 2015, 14:45
15m
Auditorium (Auditorium)

Oral presentation, Track 2: Offline software (Track 2 Session)

Speaker

Charles Leggett (Lawrence Berkeley National Lab. (US))

Description

The ATLAS experiment successfully used its Gaudi/Athena software framework for data taking and analysis during the first LHC run, processing billions of events. However, the design of Gaudi/Athena dates from early 2000, and both the framework software and the physics code were written using a single-threaded, serial design. This programming model has increasing difficulty in exploiting the potential of current CPUs, which offer their best performance only when multiple cores and wide vector registers are fully used. Future CPU evolution will intensify this trend, with core counts increasing and memory per core falling. With current memory consumption for 64-bit ATLAS reconstruction in a high-luminosity environment approaching 4 GB, it will become impossible to occupy all cores in a machine with independent serial jobs without exhausting available memory. Since performance per watt will be a key metric, a mechanism must be found to use all cores as efficiently as possible. In this paper we report on our progress with a practical demonstration of multi-threading in the ATLAS reconstruction software, using the GaudiHive framework. We have expanded support to Calorimeter, Inner Detector, and Tracking code, and we discuss the changes that were needed, both to the framework and to the tools and algorithms used, to allow the serially designed ATLAS code to run. We report on the performance gains and on the general lessons learned about the code patterns employed in the software, including the patterns identified as particularly problematic for multi-threading. We also present our findings on implementing a hybrid multi-threaded / multi-process framework, which takes advantage of the strengths of each type of concurrency while avoiding some of their corresponding limitations.
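The memory argument above can be illustrated with a small back-of-the-envelope sketch. It is not taken from the ATLAS software: the function names, the 32-core / 64 GB machine, and the 0.5 GB of extra memory per additional thread are all hypothetical values chosen for illustration. Only the roughly 4 GB per-process footprint comes from the abstract.

```python
def max_serial_jobs(total_mem_gb: float, per_job_gb: float) -> int:
    """Number of independent single-threaded jobs that fit in memory."""
    return int(total_mem_gb // per_job_gb)


def hybrid_layout(cores: int, total_mem_gb: float,
                  process_gb: float, extra_per_thread_gb: float):
    """Return (processes, threads_per_process) occupying the most cores.

    Assumed (hypothetical) memory model: each process costs `process_gb`
    for its first thread plus `extra_per_thread_gb` for every further
    thread, because threads share most of the process memory.
    Ties are resolved in favour of fewer threads per process.
    """
    best = (0, 0)
    for threads in range(1, cores + 1):
        footprint = process_gb + (threads - 1) * extra_per_thread_gb
        # Limited both by the core count and by the memory budget.
        procs = min(cores // threads, int(total_mem_gb // footprint))
        if procs * threads > best[0] * best[1]:
            best = (procs, threads)
    return best


# Hypothetical machine: 32 cores, 64 GB RAM, 4 GB per serial job.
serial_jobs = max_serial_jobs(64, 4)        # 16 jobs, so only 16 of 32 cores busy
layout = hybrid_layout(32, 64, 4, 0.5)      # (8, 4): 8 processes x 4 threads
```

Under these assumed numbers, serial jobs leave half of the cores idle, while 8 processes with 4 threads each occupy all 32 cores within the same memory budget. The real per-thread overhead depends on the framework and the workload.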

Primary author

Charles Leggett (Lawrence Berkeley National Lab. (US))

Co-authors

Benjamin Michael Wynne (University of Edinburgh (GB))
Dr David Malon (Argonne National Laboratory (US))
Dr Graeme Stewart (University of Glasgow (GB))
Paolo Calafiura (Lawrence Berkeley National Lab. (US))
Walter Lampl (University of Arizona (US))
