11–15 Mar 2024
Charles B. Wang Center, Stony Brook University
US/Eastern timezone

Deployment of ATLAS Calorimeter Fast Simulation Training Through Container Technology

11 Mar 2024, 16:15
30m
Charles B. Wang Center, Stony Brook University

100 Circle Rd, Stony Brook, NY 11794
Poster
Track 1: Computing Technology for Physics Research
Poster session with coffee break

Speakers

Joshua Falco Beirer (Georg August Universitaet Goettingen (DE), CERN)

Description

Simulating the detector response is a major computational challenge in modern High Energy Physics experiments: in ATLAS, for example, it accounts for about two fifths of the total computing resources. Among simulation tasks, calorimeter simulation is the most demanding, taking up about 80% of the resources used for simulation, and this demand is expected to increase in the future. Solutions have been developed to cope with it, notably fast simulation tools based on Machine Learning (ML) techniques, which simulate the calorimeter response faster than Geant4 while maintaining a high level of accuracy. However, training these ML-based models is itself computationally expensive.
Computational resources can be saved by running this training on resources other than the CERN HTCondor batch system or the Worldwide LHC Computing Grid, which also offers the opportunity for an additional boost in computing performance.
In this work we introduce FastCaloGANtainer, a containerized version of FastCaloGAN, a fast simulation tool developed by the ATLAS Collaboration. FastCaloGANtainer allows this tool to be trained on more powerful devices, such as High Performance Computing clusters, and reduces software dependencies on local or distributed file systems (such as CVMFS). We describe the testing methodology and the results obtained on different resources, with different operating systems and installed software, and with or without GPUs.

Significance

Fast simulation is one way to reduce the large share of computing resources consumed by calorimeter simulation, but training it is still resource-intensive. Containerizing fast simulation training can help reduce this burden by enabling the use of more powerful devices for these tasks. This can free up the commonly used computing resources and leave more capacity for additional processing requests.

Experiment context, if any

ATLAS

Primary authors

Federico Andrea Corchia (Universita e INFN, Bologna (IT))
Joshua Falco Beirer (Georg August Universitaet Goettingen (DE), CERN)
Lorenzo Rinaldi (Universita e INFN, Bologna (IT))
Michele Faucci Giannelli (INFN e Universita Roma Tor Vergata (IT))
Rui Zhang (University of Wisconsin Madison (US))
