Description
Calorimeter simulation is among the most resource-hungry components of modern collider experiments such as ATLAS and CMS. It currently accounts for half of the total CPU budget at the LHC, and its cost will only increase in the future High-Luminosity phase. This exploding computing demand, together with the arrival of sizeable open datasets such as CaloChallenge, has spurred the development of numerous alternatives to GEANT4 that use state-of-the-art deep learning architectures to accelerate the simulation process. Nevertheless, despite their impressive performance, neural network generators have found limited use in production, mainly because of their inaccuracy in simulating rare events.
In this contribution, we present a study to improve the performance of generative networks in modelling extreme electromagnetic calorimeter showers. We first define a number of metrics sensitive to out-of-distribution generated showers and use them to evaluate the best models from the CaloChallenge competition. Guided by these metrics and a data-centric approach to training and fine-tuning, we retrain the models to focus on the distribution tail without degrading their performance in the core. We also propose a post-processing method based on binary classifiers to "sculpt" the generated distribution into that of the GEANT4 truth. Our study is the first to address extreme shower modelling, establishing a viable procedure to improve precision in targeted regions of the phase space and easing the path to adopting neural network simulators in HEP experiments.
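The idea of classifier-based "sculpting" can be illustrated with a minimal sketch, which is not the implementation used in this study. It assumes a toy one-dimensional shower feature, with Gaussian stand-ins for the GEANT4 truth and for a generator that underpopulates the tails. A hand-written logistic regression on the features (1, x^2) plays the role of the binary classifier, and a simple tail fraction stands in for a tail-sensitive metric. All names, the cut value, and the toy distributions are hypothetical. For balanced samples, the classifier output p gives the density-ratio estimate p / (1 - p), which can be used as a per-shower weight or in an accept-reject step.

```python
import math
import random


def sigmoid(t):
    # Numerically safe logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def fit_classifier(truth, generated, steps=300, lr=0.3):
    """Logistic regression on features (1, x^2); truth = 1, generated = 0."""
    data = [(x * x, 1.0) for x in truth] + [(x * x, 0.0) for x in generated]
    a = b = 0.0
    n = len(data)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for z, y in data:
            err = sigmoid(a + b * z) - y  # gradient of the log loss w.r.t. the logit
            grad_a += err
            grad_b += err * z
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def density_ratio(x, a, b):
    # For balanced samples, p / (1 - p) = exp(logit) estimates p_truth(x) / p_gen(x).
    return math.exp(a + b * x * x)


def tail_fraction(samples, cut, weights=None):
    # (Weighted) fraction of samples with |x| above the cut: a toy tail metric.
    if weights is None:
        weights = [1.0] * len(samples)
    total = sum(weights)
    tail = sum(w for x, w in zip(samples, weights) if abs(x) > cut)
    return tail / total


rng = random.Random(0)
N = 2000
truth = [rng.gauss(0.0, 1.5) for _ in range(N)]      # toy stand-in for GEANT4
generated = [rng.gauss(0.0, 1.0) for _ in range(N)]  # toy generator, tails too light

a, b = fit_classifier(truth, generated)
weights = [density_ratio(x, a, b) for x in generated]

CUT = 2.5  # hypothetical definition of the "extreme" region
tail_truth = tail_fraction(truth, CUT)
tail_gen = tail_fraction(generated, CUT)
tail_sculpted = tail_fraction(generated, CUT, weights)

# Accept-reject step: yields an unweighted "sculpted" sample.
w_max = max(weights)
sculpted = [x for x, w in zip(generated, weights) if rng.random() < w / w_max]

print(f"tail fraction: truth={tail_truth:.4f} "
      f"generated={tail_gen:.4f} sculpted={tail_sculpted:.4f}")
print(f"kept {len(sculpted)} of {len(generated)} generated samples")
```

Note that reweighting or resampling can only redistribute showers the generator already produces. Where the generator yields no tail events at all, the weights have nothing to act on, which is one reason to combine such a step with tail-focused retraining.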