Description
With the recent release of the OpenDataDetector High-Luminosity Physics Benchmark Dataset (aka ColliderML), it is now feasible to study the behaviour of machine learning (ML) algorithms under a variety of full-simulation conditions. We present a suite of pilot studies that establish the utility of such a multi-scale, full-detector dataset for ML development. First, we examine the ability of models to distinguish between different physics channels using low-level objects, high-level objects, or both, and to harness multiple detector regions in reconstruction (for example, combining tracker hits and calorimeter clusters to perform track reconstruction and particle flow on low-level readout). Second, we study generalizability to unseen Standard Model (SM) and beyond-the-Standard-Model (BSM) physics conditions. Third, we test whether symmetry-preserving architectures (e.g. Lorentz-equivariant networks) continue to work well when the symmetry may be broken by the detector effects of digitized full simulation. Finally, we broadly study the potential of this dataset as a platform for building true particle physics foundation models: systems that can directly consume low-level data and be fine-tuned to a variety of downstream reconstruction and analysis tasks.