15–19 Sept 2025
CERN
Europe/Zurich timezone

Integration and validation of ML-based particle flow (MLPF) in Phase-2 TICL reconstruction

15 Sept 2025, 10:05
5m
500/1-001 - Main Auditorium (CERN)

1. Cutting Edge AI for Offline Data Processing

Speakers

Farouk Mokhtar (Univ. of California San Diego (US)), Joosep Pata (National Institute of Chemical Physics and Biophysics (EE)), Marco Rovere (CERN)

Description

Given the promising results from integrating ML-based particle flow (MLPF) into CMS offline reconstruction in a Run 3 setup (CMS-PFT-25-001), we now aim to extend MLPF to Phase-2 and integrate it with TICL reconstruction as a plug-in. In Run 3, we demonstrated that a small transformer model can reconstruct events containing on the order of 5’000 tracks and clusters into final-state particles within a few tens of milliseconds on a modern inference GPU (e.g. NVIDIA L4), while also improving jet performance compared to the baseline PF algorithm. For Phase-2, we will evaluate to what extent this approach can be applied in an online reconstruction setting, where the initial local reconstruction is provided by the existing clustering algorithms in TICL. In analogy to the Run 3 setup, the inputs will be local clusters in detector layers (e.g. reconstructed tracksters), while the targets will be simulation-based particles (e.g. simulated tracksters). Performance will be assessed using the standard TICL and PF metrics: particle-level efficiencies, fake rates and resolutions measured against simulation, as well as jet resolution, matching efficiency and fake rate, and MET resolution. Finally, since the Run 3 model already supports pileup mitigation through an additional PU-probability output node, we will also explore extending this capability to Phase-2.
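
As an illustration of the particle-level metrics named above, the following is a minimal sketch of how efficiency, fake rate and a response-based resolution could be computed once reconstructed and simulated particles are matched. The greedy ΔR matching, the `max_dr` threshold and the function names `delta_r` and `match_metrics` are assumptions made for this example only; they are not the TICL or PF validation code, which may use a different association (for instance one based on shared energy).

```python
import math
import statistics


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance in (eta, phi), with phi wrapped into [-pi, pi]."""
    dphi = math.remainder(phi1 - phi2, 2.0 * math.pi)
    return math.hypot(eta1 - eta2, dphi)


def match_metrics(sim, reco, max_dr=0.1):
    """Greedy one-to-one matching of sim to reco particles.

    sim, reco: lists of (pt, eta, phi) tuples.
    max_dr: hypothetical matching threshold, chosen for illustration.
    Returns (efficiency, fake_rate, resolution), where resolution is the
    spread of pt_reco / pt_sim over matched pairs (None if no matches).
    """
    used = set()        # indices of reco particles already matched
    responses = []      # pt_reco / pt_sim for each matched pair
    for s_pt, s_eta, s_phi in sim:
        best, best_dr = None, max_dr
        for j, (_, r_eta, r_phi) in enumerate(reco):
            if j in used:
                continue
            dr = delta_r(s_eta, s_phi, r_eta, r_phi)
            if dr < best_dr:
                best, best_dr = j, dr
        if best is not None:
            used.add(best)
            responses.append(reco[best][0] / s_pt)

    # Efficiency: fraction of sim particles with a matched reco particle.
    efficiency = len(responses) / len(sim) if sim else float("nan")
    # Fake rate: fraction of reco particles not matched to any sim particle.
    fake_rate = (len(reco) - len(used)) / len(reco) if reco else float("nan")
    resolution = statistics.pstdev(responses) if responses else None
    return efficiency, fake_rate, resolution
```

In practice these quantities would be accumulated over many events and binned in variables such as pT or η; the sketch handles a single event only.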

CERN group/Experiment

EP-CMG

Working area Area 1: Cutting Edge AI for Offline Data Processing
If Other, please specify Area 2
Project goals
- Integrate MLPF into the TICL chain for CMS Phase-2 online reconstruction as a plug-in;
- Determine potential improvements to Phase-2 reconstruction physics performance (particle-level, jet-level and event-level) from ML-based particle flow on top of TICL tracksters;
- Determine potential improvements to Phase-2 reconstruction computational throughput from the ML-based, GPU-native approach.
Timeline
- Y1Q1 - Q2: Develop familiarity with the TICL framework, including data exploration and event visualization, and set up a first pass of the target definition for training;
- Y1Q3 - Q4: Perform the first integration pass by postprocessing events into an ML-ready format (e.g. parquet files), testing on small samples or particle-gun samples;
- Y2Q1 - Q2: Carry out large-scale training and testing on physics samples (e.g. ttbar, QCD), leveraging the same approach and code established for the Run 3 setup;
- Y2Q3 - Q4: Conduct additional validation and optimization, including hyperparameter tuning and evaluation against key performance metrics (e.g. jet resolution), following the infrastructure built after iterating with JME in the Run 3 setup.
Available person power 0.8 FTE
Additional person power request 1 Fellow
Is this an already ongoing activity? No
Indicative hardware resources needs Access to a GPU cluster with an LCG-like software stack, CVMFS access and fast storage, for the full duration of the project.

Authors

Dylan Ponman (Brown University (US)), Eric Wulff (CERN), Farouk Mokhtar (Univ. of California San Diego (US)), Javier Mauricio Duarte (Univ. of California San Diego (US)), Jennifer Roloff (Brown University (US)), Joosep Pata (National Institute of Chemical Physics and Biophysics (EE)), Ka Wa Ho (Brown University (US)), Marco Rovere (CERN), Maurizio Pierini (CERN)
