EP-IT Data Science Seminars

CMS ML-based Particle Flow reconstruction

by Farouk Mokhtar (Univ. of California San Diego (US))

Europe/Zurich
40/S2-B01 - Salle Bohr (CERN)

Description

The particle-flow (PF) algorithm reconstructs a global description of each collision by producing a comprehensive list of final-state particles. It is central to event reconstruction in the CMS experiment at the CERN LHC, and has been a focus of developments in light of planned high-luminosity running conditions with increased pileup and detector granularity. The existing PF implementation relies on several physics-motivated heuristics and assumptions that can be replaced by machine learning (ML) models trained directly on simulated data. To address the increasing complexity of collision events and exploit modern graphics processing units (GPUs), ML-based alternatives to the traditional modular reconstruction chain are investigated. A state-of-the-art, ML-based PF (MLPF) reconstruction algorithm implemented within the CMS software framework is presented. The MLPF algorithm performs a learnable, differentiable full-event reconstruction on GPUs, generalizes across detector conditions and collision energies, and replaces multiple traditional reconstruction steps with a single unified model. Physics performance comparable to standard PF reconstruction is achieved in both simulation and data, with improved jet energy resolution and inference time. In simulated top quark-antiquark events under LHC Run 3 (2023–2024) conditions, the jet energy resolution is improved by 10–20% for jets with transverse momentum between 30 and 100 GeV. Runtime performance is evaluated using simulated QCD multijet events with pileup corresponding to 55–75 interactions. For these events, a median runtime of 20 ms per event is achieved on an Nvidia L4 GPU, with better scaling with event size than the standard CMS particle-flow reconstruction, which processes the same events in approximately 110 ms per event.



Bio:

Farouk Mokhtar is a CMS physicist working at the intersection of experimental high-energy physics and machine learning. His research focuses on modernizing event reconstruction at the LHC using advanced deep-learning architectures. Within the CMS Collaboration, he is a leading contributor to the development of the ML-based particle-flow (MLPF) algorithm, which reformulates full event reconstruction as a global set-to-set learning problem using transformer models. He currently serves as the ML contact person for the CMS Particle-Flow group and is the primary contact and lead author of the upcoming CMS paper on MLPF, expected to be released in February 2026. In addition to employing AI methods for reconstruction, he has also contributed to the CMS boosted Higgs program, leading the Lorentz-boosted HWW analysis, where he has applied novel ML-based jet taggers to improve the signal sensitivity in the challenging high-pT regime. His broader research interests include ML-driven detector design optimization and the application of modern AI architectures to future collider experiments.

 

Organised by

M. Girone, M. Elsing, L. Moneta, M. Pierini

Zoom Meeting ID
98545267593
Description
EP/IT Data Science seminar
Host
Lorenzo Moneta
Alternative hosts
Pascal Pignereau, Maria Girone, Thomas Nik Bazl Fard, Caroline Cazenoves, EP Seminars and Colloquia, Markus Elsing, Maurizio Pierini
Passcode
97200142