Speakers
Description
Simulation of the detector response is a major computational challenge in modern High Energy Physics experiments, as for example it accounts for about two fifths of the total ATLAS computing resources. Among simulation tasks, calorimeter simulation is the most demanding, taking up about 80% of resource use for simulation and expected to increase in the future. Solutions have been developed to cope with this demand, notably fast simulation tools based on Machine Learning (ML) techniques, which are faster than Geant4 when simulating calorimeter response and maintain a high level of accuracy. However, these ML-based models require a lot of computing resources to train.
Moreover, computational resources can also be saved by deploying their training on other resources than the CERN HTCondor batch system or the Worldwide LHC Computing Grid, with the opportunity to have an additional boost in computing performance.
In this work we introduce FastCaloGANtainer, a containerized version of FastCaloGAN, a fast simulation tool developed by the ATLAS Collaboration. FastCaloGANtainer allows the training of this tool on more powerful devices such as High Performance Computing clusters and reduces software dependencies on local or distributed file systems (such as CVMFS). We describe the testing methodology and the results obtained on different resources with different operating systems and installed software, with or without GPUs.
Significance
Fast simulation is one way to address the issue of calorimeter simulation activities being a major source of computing resource consumption and demand, but its training is still resource demanding. Containerization of fast simulation training can help reduce this burden by enabling the use of more powerful devices for these tasks. This can free up the commonly used computing resources and create more space for additional processing requests.
Experiment context, if any | ATLAS |
---|