CMS ML-based Particle Flow reconstruction
by Farouk Mokhtar
The particle-flow (PF) algorithm reconstructs a global description of each collision by producing a comprehensive list of final-state particles. It is central to event reconstruction in the CMS experiment at the CERN LHC, and has been a focus of developments in light of planned high-luminosity running conditions with increased pileup and detector granularity. The existing PF implementation relies on several physics-motivated heuristics and assumptions that can be replaced by machine learning (ML) models trained directly on simulated data. To address the increasing complexity of collision events and exploit modern graphics processing units (GPUs), ML-based alternatives to the traditional modular reconstruction chain are investigated. A state-of-the-art, ML-based PF (MLPF) reconstruction algorithm implemented within the CMS software framework is presented. The MLPF algorithm performs a learnable, differentiable full-event reconstruction on GPUs, generalizes across detector conditions and collision energies, and replaces multiple traditional reconstruction steps with a single unified model. Physics performance comparable to standard PF reconstruction is achieved in both simulation and data, with improved jet energy resolution and inference time. In simulated top quark-antiquark events under LHC Run 3 (2023–2024) conditions, the jet energy resolution is improved by 10–20% for jets with transverse momentum between 30 and 100 GeV. Runtime performance is evaluated using simulated QCD multijet events with pileup corresponding to 55–75 interactions. For these events, a median runtime of 20 ms per event is achieved on an Nvidia L4 GPU, with better scaling with event size than the standard CMS particle-flow reconstruction, which processes the same events in approximately 110 ms per event.
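As a back-of-the-envelope reading of the runtime figures quoted above, the short sketch below turns the two median per-event runtimes into a ratio and a per-process event rate. The function and constant names are illustrative only and are not part of any CMS software. The calculation treats the quoted medians as representative of every event, which is a simplification, and the 110 ms figure is stated only approximately in the abstract.

```python
# Median per-event runtimes quoted in the abstract, in milliseconds.
MLPF_MS_PER_EVENT = 20.0          # MLPF on an Nvidia L4 GPU
STANDARD_PF_MS_PER_EVENT = 110.0  # standard CMS PF (approximate figure)


def speedup(baseline_ms: float, new_ms: float) -> float:
    """Ratio of baseline runtime to new runtime (values > 1 mean faster)."""
    if baseline_ms <= 0 or new_ms <= 0:
        raise ValueError("runtimes must be positive")
    return baseline_ms / new_ms


def events_per_second(ms_per_event: float) -> float:
    """Events processed per second at a fixed per-event runtime."""
    if ms_per_event <= 0:
        raise ValueError("runtime must be positive")
    return 1000.0 / ms_per_event


if __name__ == "__main__":
    ratio = speedup(STANDARD_PF_MS_PER_EVENT, MLPF_MS_PER_EVENT)
    print(f"median runtime ratio: {ratio:.1f}x")
    print(f"MLPF rate: {events_per_second(MLPF_MS_PER_EVENT):.0f} events/s")
    print(f"standard PF rate: {events_per_second(STANDARD_PF_MS_PER_EVENT):.1f} events/s")
```

Under these assumptions the quoted medians correspond to a ratio of about 5.5 in favour of MLPF. A ratio of medians is not a per-event speedup, and the abstract notes that the scaling with event size also differs between the two algorithms.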
Bio:
Farouk Mokhtar is a CMS physicist working at the intersection of experimental high-energy physics and machine learning. His research focuses on modernizing event reconstruction at the LHC using advanced deep-learning architectures. Within the CMS Collaboration, he is a leading contributor to the development of the ML-based particle-flow (MLPF) algorithm, which reformulates full event reconstruction as a global set-to-set learning problem using transformer models. He currently serves as the ML contact person for the CMS Particle-Flow group and is the primary contact and lead author of the upcoming CMS paper on MLPF, expected to be released in February 2026. In addition to employing AI methods for reconstruction, he has also contributed to the CMS boosted Higgs program, leading the Lorentz-boosted HWW analysis, where he has applied novel ML-based jet taggers to improve the signal sensitivity in the challenging high-pT regime. His broader research interests include ML-driven detector design optimization and the application of modern AI architectures to future collider experiments.
M. Girone, M. Elsing, L. Moneta, M. Pierini