Conveners
Track 2: Data Analysis - Algorithms and Tools (sessions)
- chair: Frank Gaede, co-chair: Daniel Murnane
- chair: Davide Valsecchi, co-chair: Daniel Murnane
- chair: Luisa Lucie-Smith, co-chair: Louis Moureaux
- chair: Frank Gaede, co-chair: Luisa Lucie-Smith
- chair: Tilman Plehn, co-chair: Karim El Morabit
- chair: Thea Aarrestad, co-chair: David Rousseau
- chair: Thea Aarrestad, co-chair: Tilman Plehn
Jay Chan (Lawrence Berkeley National Lab. (US)) | 08/09/2025, 14:30 | Track 2: Data Analysis - Algorithms and Tools | Oral
Track reconstruction is a cornerstone of modern collider experiments, and the HL-LHC ITk upgrade for ATLAS poses new challenges with its increased silicon hit clusters and strict throughput requirements. Deep learning approaches compare favorably with traditional combinatorial ones, as shown by the GNN4ITk project, a geometric learning tracking pipeline that achieves competitive physics...
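The final stage of an edge-classifying GNN tracking pipeline like the one described above typically turns surviving edges into track candidates via connected components. A minimal stdlib sketch of that step, not the GNN4ITk code itself; hit IDs, edge scores, and the 0.5 threshold are illustrative assumptions:

```python
def build_tracks(num_hits, scored_edges, threshold=0.5):
    """Group hits into track candidates: keep edges whose classifier
    score exceeds `threshold`, then take connected components with
    a union-find structure."""
    parent = list(range(num_hits))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    for a, b, score in scored_edges:
        if score > threshold:
            union(a, b)

    tracks = {}
    for hit in range(num_hits):
        tracks.setdefault(find(hit), []).append(hit)
    return [sorted(t) for t in tracks.values()]

# two candidate tracks: hits 0-1-2 linked by high-score edges,
# hits 3-4 linked; the 0-3 edge is rejected by the score cut
edges = [(0, 1, 0.9), (1, 2, 0.8), (3, 4, 0.95), (0, 3, 0.1)]
print(build_tracks(5, edges))  # → [[0, 1, 2], [3, 4]]
```

Production pipelines add further filtering (e.g. on track length and shared hits), but the union-find pass is the structural core.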
Yang Zhang (Institute of High Energy Physics, Chinese Academy of Science) | 08/09/2025, 14:50 | Track 2: Data Analysis - Algorithms and Tools | Oral
Precision measurements of Higgs, W, and Z bosons at future lepton colliders demand jet energy reconstruction with unprecedented accuracy. The particle flow approach has proven to be an effective method for achieving the required jet energy resolution. We present CyberPFA, a particle flow algorithm specifically optimized for the particle-flow-oriented crystal bar electromagnetic calorimeter...
Cilicia Uzziel Perez (La Salle, Ramon Llull University (ES)) | 08/09/2025, 15:10 | Track 2: Data Analysis - Algorithms and Tools | Oral
We present lightweight, attention-enhanced Graph Neural Networks (GNNs) tailored for real-time particle reconstruction and identification in LHCb's next-generation calorimeter. Our architecture builds on node-centric GarNet layers, which eliminate costly edge message passing and are optimized for FPGA deployment, achieving sub-microsecond inference latency. By integrating attention mechanisms...
Katharina Sophia Schaeuble, Ulrich Einhaus (KIT - Karlsruhe Institute of Technology (DE)) | 08/09/2025, 15:30 | Track 2: Data Analysis - Algorithms and Tools | Oral
We present a versatile GNN-based end-to-end reconstruction algorithm for highly granular calorimeters that can include track and timing information to aid the reconstruction of particles. The algorithm starts directly from calorimeter hits and possibly reconstructed tracks, and outputs a coordinate transformation in which all shower objects are well separated from each other and assigned...
Enrico Lupi (CERN, INFN Padova (IT)) | 08/09/2025, 15:50 | Track 2: Data Analysis - Algorithms and Tools | Oral
With the upcoming High-Luminosity upgrades at the LHC, data generation rates are expected to increase significantly. This calls for highly efficient architectures for machine learning inference in experimental workflows like event reconstruction, simulation, and data analysis.
At the ML4EP team at CERN, we have developed SOFIE, a tool within the ROOT/TMVA package that translates externally...
Aishik Ghosh (University of California Irvine (US)) | 08/09/2025, 16:40 | Track 2: Data Analysis - Algorithms and Tools | Oral
Particle physics experiments rely on the (generalised) likelihood ratio test (LRT) for searches and measurements. This is not guaranteed to be optimal for composite hypothesis tests, as the Neyman-Pearson lemma pertains only to simple hypothesis tests. An improvement in the core statistical testing methodology would have widespread ramifications across experiments. We discuss an alternate test...
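For simple-vs-simple hypotheses the Neyman-Pearson lemma guarantees optimality of the likelihood ratio; the abstract's point is that this guarantee does not extend to composite tests. A stdlib sketch of the simple-hypothesis case, with unit-variance Gaussians of means 0 and 1 as an illustrative assumption:

```python
import math

def log_likelihood(data, mu):
    """Log-likelihood of i.i.d. unit-variance Gaussian data."""
    return sum(-0.5 * (x - mu) ** 2 - 0.5 * math.log(2 * math.pi) for x in data)

def lr_statistic(data, mu0=0.0, mu1=1.0):
    """Neyman-Pearson test statistic q = -2 ln[L(mu0) / L(mu1)];
    large values favour the alternative mu1."""
    return -2.0 * (log_likelihood(data, mu0) - log_likelihood(data, mu1))

data = [0.9, 1.2, 0.7, 1.1]
q = lr_statistic(data)
# for unit-variance Gaussians this reduces to
# 2*(mu1 - mu0)*sum(x) - n*(mu1**2 - mu0**2)
assert abs(q - (2 * sum(data) - len(data))) < 1e-9
```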
R D Schaffer (Université Paris-Saclay (FR)) | 08/09/2025, 17:00 | Track 2: Data Analysis - Algorithms and Tools | Oral
Neural Simulation-Based Inference (NSBI) is a powerful class of machine learning (ML)-based methods for statistical inference that naturally handle high dimensional parameter estimation without the need to bin data into low-dimensional summary histograms. Such methods are promising for a range of measurements at the Large Hadron Collider, where no single observable may be optimal to scan over...
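A core ingredient of NSBI methods is the "likelihood-ratio trick": a classifier trained to separate samples drawn from p(x|θ0) and p(x|θ1) recovers the density ratio via r(x) = s(x)/(1 - s(x)). A sketch using the analytically known Bayes-optimal classifier for two Gaussians as a stand-in for a trained network (the example densities are illustrative assumptions):

```python
import math

def gauss(x, mu, sigma=1.0):
    """Gaussian probability density."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def bayes_classifier(x, mu0=0.0, mu1=1.0):
    """Optimal P(class 0 | x) for equal priors; in NSBI a neural
    network approximates this function from simulated samples."""
    p0, p1 = gauss(x, mu0), gauss(x, mu1)
    return p0 / (p0 + p1)

def ratio_from_classifier(x):
    """Likelihood-ratio trick: r(x) = s(x) / (1 - s(x))."""
    s = bayes_classifier(x)
    return s / (1.0 - s)

# the recovered ratio matches the true density ratio p0(x)/p1(x)
x = 0.3
assert abs(ratio_from_classifier(x) - gauss(x, 0.0) / gauss(x, 1.0)) < 1e-9
```

The identity is exact for the Bayes-optimal classifier; with a real network the quality of the ratio estimate is limited by training and calibration.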
Marian I Ivanov (GSI - Helmholtzzentrum fur Schwerionenforschung GmbH (DE)) | 08/09/2025, 17:20 | Track 2: Data Analysis - Algorithms and Tools | Oral
We present a modular, data-driven framework for calibration and performance correction in the ALICE experiment. The method addresses time- and parameter-dependent effects in high-occupancy heavy-ion environments, where evolving detector conditions (e.g., occupancy and cluster overlaps, gain drift, space charge, dynamic distortions, and reconstruction or calibration deficiencies) require...
Hongyue Duyang (Shandong University) | 08/09/2025, 17:40 | Track 2: Data Analysis - Algorithms and Tools | Oral
The Jiangmen Underground Neutrino Observatory (JUNO) is a next-generation 20-kton liquid scintillator detector under construction in southern China. It is designed to determine the neutrino mass ordering via the measurement of reactor neutrino oscillation, and also to study other physics topics including atmospheric neutrinos, supernova neutrinos and more. The detector's large mass and high...
Laurits Tani (National Institute of Chemical Physics and Biophysics (EE)) | 09/09/2025, 14:30 | Track 2: Data Analysis - Algorithms and Tools | Oral
The application of foundation models in high-energy physics has recently been proposed as a way to use large unlabeled datasets to efficiently train powerful task-specific models. The aim is to train a task-agnostic model on an existing large dataset such that the learned representation can later be utilized for subsequent downstream physics tasks.
The pretrained model can reduce the training...
Anna Hallin (University of Hamburg) | 09/09/2025, 14:50 | Track 2: Data Analysis - Algorithms and Tools | Oral
OmniJet-alpha, the first cross-task foundation model for particle physics, was first presented at ACAT 2024. In its base configuration, OmniJet-alpha is capable of transfer learning between an unsupervised problem (jet generation) and a classic supervised task (jet tagging). Since its release, we have also shown that it can successfully transfer from CMS Open Data to simulation, and even...
Lea Reuter (Karlsruhe Institute of Technology) | 09/09/2025, 15:10 | Track 2: Data Analysis - Algorithms and Tools | Oral
Large backgrounds and detector aging impact track finding in the Belle II central drift chamber, reducing both purity and efficiency in events. This necessitates the development of new tracking algorithms to mitigate detector performance degradation. Building on our previous success with an end-to-end multi-track reconstruction algorithm for the Belle II experiment at the SuperKEKB collider...
CMS Collaboration, Filippo Cattafesta (Scuola Normale Superiore & INFN Pisa (IT)) | 09/09/2025, 15:30 | Track 2: Data Analysis - Algorithms and Tools | Oral
Detailed event simulation at the LHC takes a large fraction of the computing budget. CMS has developed an end-to-end ML-based simulation that can speed up the production of analysis samples by several orders of magnitude with a limited loss of accuracy. As the CMS experiment is adopting a common analysis-level format, NANOAOD, for a larger number of analyses, such an event...
CMS Collaboration, Dr Florian Bury (University of Bristol) | 09/09/2025, 15:50 | Track 2: Data Analysis - Algorithms and Tools | Oral
The Matrix Element Method (MEM) offers optimal statistical power for hypothesis testing in particle physics, but its application is hindered by the computationally intensive multi-dimensional integrals required to model detector effects. We present a novel approach that addresses this challenge by employing Transformers and generative machine learning (ML) models. Specifically, we utilize ML...
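The bottleneck the abstract targets is the MEM convolution of a parton-level density with a detector transfer function, the integral of W(x_obs|y) p(y) over y, conventionally estimated by Monte Carlo. A stdlib sketch of that baseline (the Gaussian transfer function and toy densities are illustrative assumptions, not the talk's models):

```python
import math
import random

def transfer(x_obs, y, sigma=0.1):
    """Toy detector transfer function W(x_obs | y): Gaussian smearing."""
    return math.exp(-0.5 * ((x_obs - y) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def mem_weight(x_obs, sample_parton_level, n_samples=200_000, seed=42):
    """Monte Carlo estimate of the integral of W(x_obs|y) p(y) dy,
    computed by averaging the transfer function over samples y ~ p."""
    rng = random.Random(seed)
    total = sum(transfer(x_obs, sample_parton_level(rng)) for _ in range(n_samples))
    return total / n_samples

# toy parton-level density p(y): standard normal
weight = mem_weight(0.0, lambda rng: rng.gauss(0.0, 1.0))
# analytic cross-check: N(0,1) convolved with N(0, 0.1^2), evaluated at 0
exact = 1.0 / math.sqrt(2 * math.pi * 1.01)
assert abs(weight - exact) < 0.02
```

The cost of drawing many such samples per event is exactly what the generative-ML replacement described above aims to remove.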
Quentin Führing (Technische Universitaet Dortmund (DE), University of Cambridge (UK)) | 09/09/2025, 16:40 | Track 2: Data Analysis - Algorithms and Tools | Oral
Measurements of neutral, oscillating mesons are a gateway to quantum mechanics and give access to the fundamental interactions of elementary particles. For example, precise measurements of $CP$ violation in neutral $B$ mesons can be used to test the Standard Model of particle physics. These measurements require knowledge of the $B$-meson flavour at the time of its production, which...
Sebastian Pitz (ITP, Heidelberg University) | 09/09/2025, 17:00 | Track 2: Data Analysis - Algorithms and Tools | Oral
We construct Lorentz-equivariant transformer and graph networks using the concept of local canonicalization. While many Lorentz-equivariant architectures use specialized layers, this approach allows any existing non-equivariant architecture to be made Lorentz-equivariant using transformations with equivariantly predicted local frames. In addition, data augmentation emerges as a...
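Whether the symmetry is built in (local canonicalization) or learned (data augmentation), the group in question is the Lorentz group. A stdlib sketch of boost augmentation along the z axis, verifiable through the invariance of m² = E² - |p|²; the particular four-momentum and boost velocity are arbitrary illustrative choices:

```python
import math

def boost_z(p4, beta):
    """Apply a Lorentz boost with velocity beta along z to (E, px, py, pz)."""
    e, px, py, pz = p4
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return (gamma * (e - beta * pz), px, py, gamma * (pz - beta * e))

def mass2(p4):
    """Invariant mass squared, E^2 - |p|^2 (units with c = 1)."""
    e, px, py, pz = p4
    return e ** 2 - px ** 2 - py ** 2 - pz ** 2

p4 = (5.0, 1.0, 2.0, 3.0)   # toy four-momentum
boosted = boost_z(p4, beta=0.6)
assert abs(mass2(boosted) - mass2(p4)) < 1e-9
```

In an augmentation scheme one would draw beta (and a rotation) at random per training example; a canonicalization scheme instead predicts the frame from the inputs and undoes it before the backbone network.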
Gage DeZoort (Princeton University (US)) | 09/09/2025, 17:20 | Track 2: Data Analysis - Algorithms and Tools | Oral
Modern machine learning (ML) algorithms are sensitive to the specification of non-trainable parameters called hyperparameters (e.g., learning rate or weight decay). Without guiding principles, hyperparameter optimization is the computationally expensive process of sweeping over various model sizes and, at each, re-training the model over a grid of hyperparameter settings. However, recent...
Samuele Grossi (Università degli studi di Genova & INFN sezione di Genova) | 09/09/2025, 17:40 | Track 2: Data Analysis - Algorithms and Tools | Oral
Deep generative models have become powerful tools for alleviating the computational burden of traditional Monte Carlo generators in producing high-dimensional synthetic data. However, validating these models remains challenging, especially in scientific domains requiring high precision, such as particle physics. Two-sample hypothesis testing offers a principled framework to address this task...
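A minimal instance of such a two-sample test is a permutation test on a scalar statistic: under the null hypothesis that the generator matches the reference, labels are exchangeable, so the observed statistic should not be extreme among label shufflings. A stdlib sketch; the difference-of-means statistic is an illustrative simplification (practical validation uses richer statistics over high-dimensional features):

```python
import random

def permutation_pvalue(sample_a, sample_b, n_perm=2000, seed=0):
    """Two-sample permutation test with |mean(a) - mean(b)| as statistic."""
    rng = random.Random(seed)

    def stat(a, b):
        return abs(sum(a) / len(a) - sum(b) / len(b))

    observed = stat(sample_a, sample_b)
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # exchange labels under the null
        if stat(pooled[:n_a], pooled[n_a:]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one rule avoids p = 0

rng = random.Random(1)
reference = [rng.gauss(0.0, 1.0) for _ in range(300)]  # "true" Monte Carlo
shifted   = [rng.gauss(0.8, 1.0) for _ in range(300)]  # mismodelled generator
assert permutation_pvalue(reference, shifted) < 0.01   # mismatch detected
```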
Gagik Gavalian (Jefferson National Lab) | 10/09/2025, 11:30 | Track 2: Data Analysis - Algorithms and Tools | Oral
Charged track reconstruction is a critical task in nuclear physics experiments, enabling the identification and analysis of particles produced in high-energy collisions. Machine learning (ML) has emerged as a powerful tool for this purpose, addressing the challenges posed by complex detector geometries, high event multiplicities, and noisy data. Traditional methods rely on pattern recognition...
Francesco Fenu (Agenzia Spaziale Italiana) | 10/09/2025, 11:50 | Track 2: Data Analysis - Algorithms and Tools | Oral
The Compton Spectrometer and Imager (COSI) is a NASA Small Explorer (SMEX) satellite mission planned to fly in 2027. It involves institutions in the US, Europe and Asia and aims at the construction of a gamma-ray telescope for observations in the 0.2-5 MeV energy range. COSI consists of an array of germanium strip detectors cooled to cryogenic temperatures with millimeter...
Dr Marcel Völschow (Hamburg University of Applied Sciences) | 10/09/2025, 12:10 | Track 2: Data Analysis - Algorithms and Tools | Oral
Beyond the planet Neptune, only the largest solar system objects can be observed directly. However, there are tens of thousands of smaller objects whose frequency and distribution could provide valuable insights into the formation of our solar system - if we could see them.
Project SOWA (Solar-system Occultation Watch and Analysis) aims to systematically search for such invisible objects...
Julian Simon Schliwinski (Humboldt University of Berlin (DE)) | 10/09/2025, 12:30 | Track 2: Data Analysis - Algorithms and Tools | Oral
The next generation of ground-based gamma-ray astronomy instruments will involve arrays of dozens of telescopes, leading to an increase in operational and analytical complexity. This scale-up poses challenges for both system operations and offline data processing, especially when conventional approaches struggle to scale effectively. To address these challenges, we are developing AI agents...
Stephen Jiggins (Deutsches Elektronen-Synchrotron (DE)) | 10/09/2025, 12:50 | Track 2: Data Analysis - Algorithms and Tools | Oral
In many domains of science the likelihood function is a fundamental ingredient for statistically inferring model parameters from data, since the likelihood ratio (LR) is an optimal test statistic. Neural-network-based LR estimation using probabilistic classification has therefore had a significant impact in these domains, providing a scalable method for determining an intractable LR from simulated...
Simon Hiesl | 11/09/2025, 14:30 | Track 2: Data Analysis - Algorithms and Tools | Oral
In anticipation of higher luminosities at the Belle II experiment, high levels of beam background from outside of the interaction region are expected. To prevent track trigger rates from surpassing the limitations of the data acquisition system, an upgrade of the first-level neural track trigger becomes indispensable. This upgrade contains a novel track finding algorithm based on...
Lennart Uecker (Heidelberg University (DE)) | 11/09/2025, 14:50 | Track 2: Data Analysis - Algorithms and Tools | Oral
The LHCb experiment at the Large Hadron Collider (LHC) operates a fully software-based trigger system that processes proton-proton collisions at a rate of 30 MHz, reconstructing both charged and neutral particles in real time. The first stage of this trigger system, running on approximately 500 GPU cards, performs a track pattern recognition to reconstruct particle trajectories with low...
Giulio Cordova (Universita & INFN Pisa (IT)) | 11/09/2025, 15:10 | Track 2: Data Analysis - Algorithms and Tools | Oral
The upgraded LHCb experiment is pioneering the landscape of real-time data-processing techniques using a heterogeneous computing infrastructure, composed of both GPUs and FPGAs, aimed at boosting the performance of the HLT1 reconstruction. Amongst the novelties in the reconstruction infrastructure made for Run 3, the introduction of a real-time VELO hit-finding FPGA-based architecture...
CMS Collaboration, Mario Masciovecchio (Univ. of California San Diego (US)) | 11/09/2025, 15:30 | Track 2: Data Analysis - Algorithms and Tools | Oral
Charged particle track reconstruction is one of the heaviest computational tasks in the event reconstruction chain at Large Hadron Collider (LHC) experiments. Furthermore, projections for the High Luminosity LHC (HL-LHC) show that the required computing resources for single-threaded CPU algorithms will exceed those that are expected to be available. It follows that experiments at the HL-LHC will...
Qi Bin Lei (Stanford University (US)), Rocky Bala Garg (Stanford University (US)) | 11/09/2025, 15:50 | Track 2: Data Analysis - Algorithms and Tools | Oral
The exponential time scaling of traditional primary vertex reconstruction algorithms raises significant performance concerns for future high-pileup environments, particularly with the upcoming High Luminosity upgrade to the Large Hadron Collider. In this talk, we introduce PV-Finder, a deep learning-based approach that leverages reconstructed track parameters to directly predict primary vertex...
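PV-Finder replaces hand-crafted vertexing with a deep network, but the underlying idea of mapping track parameters to a one-dimensional density along the beamline can be sketched directly: a kernel density estimate over track z positions peaks at a vertex. A stdlib sketch, not the PV-Finder algorithm; the grid, bandwidth, and toy track list are illustrative assumptions:

```python
import math

def z_density(track_zs, z, bandwidth=0.05):
    """Gaussian KDE along the beam axis built from track z positions."""
    norm = bandwidth * math.sqrt(2 * math.pi) * len(track_zs)
    return sum(math.exp(-0.5 * ((z - z0) / bandwidth) ** 2) for z0 in track_zs) / norm

def find_vertex(track_zs, z_min=-1.0, z_max=1.0, n_grid=2001):
    """Scan a grid and return the z position of maximal track density."""
    step = (z_max - z_min) / (n_grid - 1)
    grid = [z_min + i * step for i in range(n_grid)]
    return max(grid, key=lambda z: z_density(track_zs, z))

# ten tracks from a vertex near z = 0.30, plus one outlier track
tracks = [0.29, 0.31, 0.30, 0.28, 0.32, 0.30, 0.29, 0.31, 0.30, -0.70]
assert abs(find_vertex(tracks) - 0.30) < 0.02
```

The learned approach effectively replaces both the kernel and the peak-finding with a network, which is what allows it to scale to high pileup where many nearby peaks overlap.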
Antoine Petitjean (ITP, Universität Heidelberg) | 11/09/2025, 16:40 | Track 2: Data Analysis - Algorithms and Tools | Oral
Unfolding detector-level data into meaningful particle-level distributions remains a key challenge in collider physics, especially as the dimensionality of the relevant observables increases. Traditional unfolding techniques often struggle with such high-dimensional problems, motivating the development of machine learning-based approaches. We introduce a new method for generative unfolding that...
Sofia Palacios Schweitzer (ITP, University Heidelberg) | 11/09/2025, 17:00 | Track 2: Data Analysis - Algorithms and Tools | Oral
Two shortcomings of classical unfolding algorithms, namely that they operate on binned, one-dimensional observables, can be overcome using generative machine learning. Many studies on generative unfolding reduce the problem to correcting for detector smearing; however, a full unfolding pipeline must also account for background, acceptance and efficiency effects. To fully integrate...
Dr Mirko Bunse (Lamarr Institute for Machine Learning and Artificial Intelligence, Dortmund, Germany) | 11/09/2025, 17:20 | Track 2: Data Analysis - Algorithms and Tools | Oral
Measured distributions are usually distorted by the finite resolution of the detector. Within physics research, the necessary correction of these distortions is known as Unfolding. Machine learning research uses a different term for this very task: Quantification Learning. For the past two decades, this difference in terminology, together with several differences in notation, has prevented...
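The overlap the abstract points to can be made concrete: the simplest quantification method, adjusted classify-and-count, is the one-bin analogue of matrix-inversion unfolding. Given a classifier's true-positive and false-positive rates, the corrected class prevalence is p = (q - fpr)/(tpr - fpr), where q is the raw positive-prediction rate. A stdlib sketch with illustrative rates:

```python
def adjusted_classify_and_count(q, tpr, fpr):
    """Correct a raw positive-prediction rate q for classifier response;
    the one-dimensional analogue of inverting a detector response matrix."""
    p = (q - fpr) / (tpr - fpr)
    return min(1.0, max(0.0, p))  # clip to a valid prevalence

# classifier with tpr = 0.9, fpr = 0.2; a true prevalence of 0.4 yields
# an expected raw rate q = 0.4 * 0.9 + 0.6 * 0.2 = 0.48
assert abs(adjusted_classify_and_count(0.48, 0.9, 0.2) - 0.4) < 1e-9
```

With many bins the same correction becomes a linear system with the detector response matrix, which is precisely the unfolding formulation used in physics.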
David Walter (Massachusetts Inst. of Technology (US)) | 11/09/2025, 17:40 | Track 2: Data Analysis - Algorithms and Tools | Oral
The High-Luminosity LHC era will deliver unprecedented data volumes, enabling measurements on fine-grained multidimensional histograms containing millions of bins with thousands of events each. Achieving ultimate precision requires modeling thousands of systematic uncertainty sources, creating computational challenges for likelihood minimization and parameter extraction. Fast minimization is...
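The fits described above minimize a binned Poisson negative log-likelihood over a parameter of interest and many nuisance parameters. A stdlib sketch of the simplest case, a one-parameter scan over the signal strength mu with no nuisances; the bin contents are illustrative assumptions:

```python
import math

def nll(mu, data, signal, background):
    """Binned Poisson negative log-likelihood (constant terms dropped):
    sum over bins of  nu_i - n_i * ln(nu_i),  with  nu_i = mu*s_i + b_i."""
    total = 0.0
    for n, s, b in zip(data, signal, background):
        nu = mu * s + b
        total += nu - n * math.log(nu)
    return total

def scan_mu(data, signal, background, lo=0.0, hi=3.0, n_points=3001):
    """Grid scan returning the signal strength that minimizes the NLL."""
    step = (hi - lo) / (n_points - 1)
    grid = [lo + i * step for i in range(n_points)]
    return min(grid, key=lambda mu: nll(mu, data, signal, background))

signal     = [5.0, 10.0, 5.0]
background = [50.0, 40.0, 30.0]
data       = [55.0, 50.0, 35.0]   # exactly background + 1.0 * signal
assert abs(scan_mu(data, signal, background) - 1.0) < 1e-3
```

The computational challenge in the abstract comes from scaling this evaluation to millions of bins and thousands of nuisance parameters, where gradients and vectorization, rather than grid scans, become essential.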