7th Inter-Experimental LHC Machine Learning Workshop
This is the seventh annual workshop of the LPCC inter-experimental machine learning working group.
The workshop will be held on 19–23 May 2025 at CERN in a hybrid format, with remote participation possible.
Confirmed invited speakers
- Aishik Ghosh (UC Irvine)
- Patrick Kidger (Cradle.bio)
- Claudius Krause (Vienna, ÖAW)
- Ilaria Luise (ECMWF)
- Antonin Raffin (German Aerospace Center)
- Enrique Rico Ortega (CERN)
- Ricardo Rocha (CERN)
- Eric Wulff (CERN)
Warning
If you receive an email from a company called "Global Travel Experts" or any similar company requesting your itinerary or other personal information, or promising accommodation, please be aware that it is a scam, and report it to the CERN IT department (https://information-technology.web.cern.ch/).
Workshop format
Following the success of the last edition's format, the workshop structure focuses contributions on poster presentations. This approach allows a large number of contributions to be accommodated while promoting strong interaction between presenters and participants. For this reason, we require poster presenters to attend in person. A small number of contributed submissions will be selected for oral presentations.
You will have to arrange for your own accommodation, either in the CERN Hostel (https://edh.cern.ch/Hostel/, subject to room availability) or in nearby hotels.
Please make sure you are registered to the lhc-machinelearning-wg@cern.ch CERN egroup, so that you can be informed of any unforeseen circumstances.
The preliminary structure of the workshop includes:
- Tutorials
- Plenary invited talks from academia
- Plenary invited talks from industry
- Poster sessions
- Plenary contributed talks
A satellite workshop, the Fair Universe HiggsML Uncertainty CERN workshop (https://indico.cern.ch/event/1523250/), is organised on Monday morning.
Workshop tracks
For the contributed posters and potential talks, the following Tracks have been defined:
- ML for object identification and reconstruction
- ML for analysis: Event classification, statistical analysis and inference, anomaly detection
- ML for simulation and surrogate models: applications of ML for simulation, or cases of replacing an existing complex model
- LLMs and foundation models
- Fast ML: Application of ML to DAQ/Trigger/Real Time Analysis/Edge Computing
- ML infrastructure: Hardware and software for ML/MLOps
- ML training, courses, tutorials, open datasets and challenges
- ML in astroparticle physics
- ML in phenomenology and theory
- ML for particle accelerators
- Other
IML group
This workshop is organized by the CERN IML coordinators. To keep up to date with ML at the LHC, please register to the lhc-machinelearning-wg@cern.ch CERN egroup.
Sponsor
The SMARTHEP European Training Network is sponsoring the apero on Thursday night, coinciding with the visit of Confindustria Piedmont (https://indico.cern.ch/event/1518429/). SMARTHEP acknowledges funding from the European Union Horizon 2020 research and innovation programme, call H2020-MSCA-ITN-2020, under Grant Agreement No. 956086.
Fair Universe HiggsML Uncertainty CERN workshop (https://indico.cern.ch/event/1523250/), Room 222/R-001
1. Speaker: Matthias Komm (Deutsches Elektronen-Synchrotron (DE))
2. Weather forecasting with foundational models
Ilaria Luise works as a machine learning scientist at the European Centre for Medium-Range Weather Forecasts (ECMWF). Before moving to machine learning for weather and climate, she worked in ATLAS for many years on VH→bb measurements and b-tagging. She changed career path in 2023 while staying at CERN, thanks to the CERN Innovation Programme on Environmental Applications (CIPEA) and the CERN Knowledge Transfer fund. She has been a CERN fellow and co-PI of the EMP2 project at CERN openlab, building a foundation model for weather and climate as a collaboration between CERN, the University of Magdeburg and the Jülich Supercomputing Centre.
Speaker: Ilaria Luise (ECMWF)
3. Neural differential equations
Patrick Kidger works across three distinct disciplines of scientific machine learning: open-source software, neural differential equations, and ML for protein engineering. He is the author of much of the open-source scientific JAX ecosystem, holds a visiting lectureship at Imperial College London, and leads much of the ML-for-protein-design work at Cradle.bio. He was previously an ML researcher at Google X, and received his PhD from Oxford on neural differential equations.
Speaker: Patrick Kidger (Cradle.bio)
4. Neural Simulation-Based Inference for HEP
Aishik Ghosh is an incoming professor of AI and physics at the Georgia Institute of Technology, currently a postdoctoral scholar at UC Irvine and an affiliate at Berkeley Lab. His focus is on designing high-dimensional statistical methods, including uncertainty quantification tools, for reliable applications across particle physics and astrophysics. He also applies AI tools to theoretical physics model building. Beyond research, Dr Ghosh has worked with policy organisations such as the Organisation for Economic Co-operation and Development on AI and science policy. He obtained his PhD in particle physics from the University of Paris-Saclay.
Speaker: Aishik Ghosh (University of California Irvine (US))
5. AI & HPC
Eric Wulff is a data scientist and machine learning engineer in the Frontier Technologies and Initiatives group of the CERN IT department. He specializes in developing and applying AI models to address complex scientific and technical challenges, leveraging large-scale high-performance computing (HPC) systems for distributed training and hyperparameter optimization. As part of CERN openlab's management team, he provides strategic guidance on the convergence of AI and HPC in openlab R&D projects.
Eric holds an MSc in Engineering Physics from Lund University. Before joining CERN, he worked as a machine learning engineer, focusing on real-time object detection and video analytics using deep learning on edge processors.
Speaker: Eric Wulff (CERN)
6. Cross-Geometry Fast Electromagnetic Shower Simulation
The accurate simulation of particle showers in collider detectors remains a critical bottleneck for high-energy physics research. Current approaches face fundamental limitations in scalability when modeling the complete shower development process. Deep generative models offer a promising alternative, potentially reducing simulation costs by orders of magnitude. This capability becomes increasingly vital as upcoming particle physics experiments are expected to produce unprecedented volumes of data.
We present a novel domain adaptation framework employing state-of-the-art deep generative models to generate high-fidelity point-cloud representations of electromagnetic particle showers. Using transfer learning techniques, our approach adapts simulations across diverse electromagnetic calorimeter geometries with exceptional data efficiency, reducing training requirements and eliminating the need for a fixed-grid structure. The results demonstrate that our method achieves high accuracy while significantly reducing data and computational demands, offering a scalable solution for next-generation particle physics simulations.
Speaker: Lorenzo Valente (University of Hamburg)
7. CMS FlashSim: how an end-to-end ML approach speeds up simulation in CMS
Detailed event simulation at the LHC consumes a large fraction of the computing budget. CMS has developed an end-to-end ML-based simulation framework, called FlashSim, that can speed up the production of analysis samples by several orders of magnitude with a limited loss of accuracy. We show how this approach achieves a high degree of accuracy, not just on basic kinematics but on the complex and highly correlated physical and tagging variables included in the CMS common analysis-level format, NANOAOD. We prove that this approach can generalize to processes not seen during training. Furthermore, we discuss and propose solutions to address the simulation of objects coming from multiple physical sources or originating from pileup. Finally, we present a comparison with full simulation samples for some simplified analysis benchmarks, as well as how we can use the CMS Remote Analysis Builder (CRAB) to submit simulation of large samples to the LHC Computing Grid. The simulation takes as input relevant generator-level information, e.g. from PYTHIA, while outputs are produced directly in the NANOAOD format. The underlying models are state-of-the-art continuous flows, trained through flow matching.
With this work, we aim to demonstrate that this end-to-end approach to simulation is capable of meeting experimental demands, both in the short term and in view of the HL-LHC, and to update the LHC community on recent developments.
Speaker: Francesco Vaselli (Scuola Normale Superiore & INFN Pisa (IT))
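For readers unfamiliar with flow matching, the sketch below shows the core of a conditional flow-matching training loop on toy data (linear interpolation paths with a constant-velocity target). The network, dimensions, and conditioning variables are illustrative placeholders, not the actual FlashSim implementation.

```python
import torch
import torch.nn as nn

# Toy stand-ins: "gen" = generator-level conditioning, "reco" = target variables.
GEN_DIM, RECO_DIM = 8, 4

class VelocityNet(nn.Module):
    """Predicts the flow velocity v(x_t, t, cond)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(RECO_DIM + GEN_DIM + 1, 128), nn.SiLU(),
            nn.Linear(128, 128), nn.SiLU(),
            nn.Linear(128, RECO_DIM),
        )

    def forward(self, x_t, t, cond):
        return self.net(torch.cat([x_t, t, cond], dim=-1))

model = VelocityNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(1000):
    cond = torch.randn(256, GEN_DIM)   # placeholder generator-level inputs
    x1 = torch.randn(256, RECO_DIM)    # placeholder "reco" targets
    x0 = torch.randn_like(x1)          # base noise sample
    t = torch.rand(256, 1)             # random time in [0, 1]
    x_t = (1 - t) * x0 + t * x1        # linear interpolation path
    target_v = x1 - x0                 # its constant velocity
    loss = ((model(x_t, t, cond) - target_v) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```

After training, samples are drawn by integrating the learned velocity field from noise at t=0 to t=1, conditioned on the generator-level inputs.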
8. End-to-End Optimal Detector Design with Mutual Information Surrogates
We introduce a novel approach for end-to-end black-box optimization of high energy physics (HEP) detectors using local deep learning (DL) surrogates. These surrogates approximate a scalar objective function that encapsulates the complex interplay of particle-matter interactions and physics analysis goals. In addition to a standard reconstruction-based metric commonly used in the field, we investigate the information-theoretic metric of mutual information. Unlike traditional methods, mutual information is inherently task-agnostic, offering a broader optimization paradigm that is less constrained by predefined targets.
We demonstrate the effectiveness of our method in a realistic physics analysis scenario: optimizing the thicknesses of calorimeter detector layers based on simulated particle interactions. The surrogate model learns to approximate objective gradients, enabling efficient optimization with respect to energy resolution.
Our findings reveal three key insights: (1) end-to-end black-box optimization using local surrogates is a practical and compelling approach for detector design, providing direct optimization of detector parameters in alignment with physics analysis goals; (2) mutual information-based optimization yields design choices that closely match those from state-of-the-art physics-informed methods, indicating that these approaches operate near optimality and reinforcing their reliability in HEP detector design; and (3) information-theoretic methods provide a powerful, generalizable framework for optimizing scientific instruments. By reframing the optimization process through an information-theoretic lens rather than domain-specific heuristics, mutual information enables the exploration of new avenues for discovery beyond conventional approaches.
The preprint is available on arXiv: 2503.14342.
Speaker: Stephen Mulligan (Universite de Geneve (CH))
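The abstract does not detail how mutual information is estimated; one common neural estimator is the Donsker-Varadhan (MINE-style) lower bound, sketched here on toy data as a plausible ingredient of such a pipeline, not the authors' implementation.

```python
import math
import torch
import torch.nn as nn

class Critic(nn.Module):
    """Scores (x, y) pairs; high on the joint, low on the product of marginals."""
    def __init__(self, dx, dy):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dx + dy, 128), nn.ReLU(),
                                 nn.Linear(128, 1))

    def forward(self, x, y):
        return self.net(torch.cat([x, y], dim=-1)).squeeze(-1)

def mine_lower_bound(critic, x, y):
    """Donsker-Varadhan bound: E_joint[T] - log E_marginals[exp(T)]."""
    joint = critic(x, y).mean()
    y_shuffled = y[torch.randperm(y.size(0))]  # break pairing -> marginals
    marginal = torch.logsumexp(critic(x, y_shuffled), dim=0) - math.log(x.size(0))
    return joint - marginal

# Toy correlated data standing in for (detector observables, quantity of interest).
x = torch.randn(4096, 2)
y = x @ torch.tensor([[1.0], [0.5]]) + 0.1 * torch.randn(4096, 1)

critic = Critic(2, 1)
opt = torch.optim.Adam(critic.parameters(), lr=1e-3)
for step in range(500):
    mi_estimate = mine_lower_bound(critic, x, y)
    opt.zero_grad()
    (-mi_estimate).backward()  # ascend the bound
    opt.step()
```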
9. DDFastShowerML: A Library for ML-based Fast Calorimeter Shower Simulation at Future Collider Experiments and Beyond
Given the intense computational demands of full simulation approaches based on traditional Monte Carlo methods, recent fast simulation approaches for calorimeter showers based on deep generative models have received significant attention.
However, for these models to be used in production it is essential for them to be integrated within the existing software ecosystems of experiments. This additionally allows the full reconstruction chain of an experiment to be used as a benchmark of the performance of a model. Such a development therefore provides access to a new suite of physics-based metrics, which ultimately determine a model’s suitability as a fast simulation tool.
In this contribution we describe DDFastShowerML, a library now available in Key4hep. This generic library provides a means of combining inference of generative models trained to simulate calorimeter showers with the DD4hep toolkit, by making use of the fast simulation hooks that exist in Geant4. This makes it possible to simulate showers in realistically detailed detector geometries, such as those proposed for use at future colliders and for community challenges, while seamlessly combining full and fast simulation. Examples will be given of numerous models that have been integrated, as well as the various detector geometries that have been studied, highlighting the flexibility of the library. A summary of future development directions will also be given.
Speaker: Peter McKeown (CERN)
10. Why so Negative? Neural Quasiprobabilistic Likelihood Ratio Estimation with Negatively Weighted Data
In many domains of science, the likelihood ratio (LR) function is a fundamental ingredient for a variety of statistical methods such as inference, importance sampling, and classification. Neural LR estimation using probabilistic classification has therefore had a significant impact in these domains, providing a scalable method for determining an intractable LR from simulated datasets via the so-called ratio trick. Traditional machine learning approaches rely on the assumption that the underlying probability distribution is nonnegative, but in quantum mechanical systems it is possible to encounter events with negative probabilities. In high energy physics this is a significant problem when simulating proton-proton (pp) collisions using quantum field theory, because Monte Carlo simulation codes can introduce negatively weighted data.
Two problems present themselves when training a neural likelihood ratio estimator with negatively weighted data. First, the variance of the mini-batch losses used during neural network parameter updates is systematically increased, hindering the convergence of stochastic gradient descent (SGD) algorithms. Second, most classification and density (ratio) estimation loss functions constrain the neural LR estimates to lie in the range $[0,\infty)$. Therefore, should negative densities prevail anywhere within the measurable space, the neural network would be incapable of expressing such behavior.
This work will demonstrate two important advancements for LR estimation with negatively weighted data. First, a new loss function for binary classification is introduced to extend the neural based LR trick to be compatible with quasiprobabilistic distributions. Second, signed probability spaces are used to decompose the likelihoods into signed mixture models. This decomposition reduces the overall LR estimation task into four nonnegative LR estimation sub-tasks, each with reduced loss variance during optimization relative to the overall task. Each nonnegative LR is estimated using a calibrated neural discriminative classifier, which are then combined via coefficients that are optionally optimized using the new loss function. The technique is demonstrated using di-Higgs production via gluon-gluon fusion in pp collisions at the Large Hadron Collider.
Speaker: Matthew Drnevich (New York University (US))
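For context, the standard nonnegative-weight ratio trick that this work extends can be written in a few lines; the toy data and classifier below are illustrative, and the signed-weight extension itself is the contribution's novelty and is not reproduced here.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Toy samples from two hypotheses p0, p1 (stand-ins for simulated datasets).
rng = np.random.default_rng(0)
x0 = rng.normal(0.0, 1.0, size=(20000, 1))  # events ~ p0
x1 = rng.normal(0.5, 1.2, size=(20000, 1))  # events ~ p1

X = np.vstack([x0, x1])
y = np.concatenate([np.zeros(len(x0)), np.ones(len(x1))])

clf = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=200).fit(X, y)

# Ratio trick: with equal class priors, a calibrated classifier gives
# s(x) = p1 / (p0 + p1), so the likelihood ratio is r(x) = p1/p0 = s / (1 - s).
s = clf.predict_proba(X[:5])[:, 1]
r = s / (1.0 - s)
print(r)
```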
11. Optimisation of the Accelerator Control by Reinforcement Learning: A Simulation-Based Approach
Optimizing control systems in particle accelerators presents significant challenges, often requiring extensive manual effort and expert knowledge. Traditional tuning methods are time-consuming and may struggle to navigate the complexity of modern beamline architectures. To address these challenges, we introduce a simulation-based framework that leverages Reinforcement Learning (RL) [1] to enhance the control and optimization of beam transport systems. Built on top of the Elegant simulation engine [2], our Python-based platform automates the generation of simulations and transforms accelerator tuning tasks into RL environments with minimal user intervention. The framework features a modified Soft Actor-Critic (SAC) agent [3] enhanced with curriculum learning techniques [4], enabling robust performance across a variety of beamline configurations. Designed with accessibility and flexibility in mind, the system can be deployed by non-experts and adapted to optimize virtually any beamline. Early results demonstrate successful application across multiple simulated beamlines, validating the approach and offering promising potential for broader adoption. We continue to refine the framework toward a general-purpose solution: one that can serve both as an intelligent co-pilot for physicists and as a testbed for RL researchers developing new algorithms. This work highlights the growing synergy between AI and accelerator physics [1, 3], and the critical role of computational innovation [2] in advancing experimental capabilities.
References
[1] Sutton, R. S., & Barto, A. G. (2018). Reinforcement learning: An introduction (2nd ed.). MIT Press. http://incompleteideas.net/book/the-book-2nd.html
[2] Borland, M. (2000). elegant: A flexible SDDS-compliant code for accelerator simulation. 6th International Computational Accelerator Physics Conference (ICAP 2000). https://doi.org/10.2172/761286
[3] Haarnoja, T., Zhou, A., Abbeel, P., & Levine, S. (2018). Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. Proceedings of the 35th International Conference on Machine Learning (pp. 1861–1870). PMLR.
[4] Narvekar, S., Peng, B., Leonetti, M., Sinapov, J., Taylor, M. E., & Stone, P. (2020). Curriculum learning for reinforcement learning domains: A framework and survey. Journal of Machine Learning Research, 21(181), 1–50. https://www.jmlr.org/papers/volume21/20-212/20-212.pdf
Speaker: Anwar Ibrahim
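A minimal sketch of the general pattern described above: wrap a beamline-tuning task as a gymnasium environment and train a Soft Actor-Critic agent with stable-baselines3. The environment dynamics and reward here are toy placeholders for the Elegant-backed simulations, invented for illustration.

```python
import gymnasium as gym
import numpy as np
from gymnasium import spaces
from stable_baselines3 import SAC

class ToyBeamlineEnv(gym.Env):
    """Placeholder beamline-tuning task: actions adjust magnet strengths,
    reward is the negative beam-position error at a target point."""
    def __init__(self):
        self.observation_space = spaces.Box(-1.0, 1.0, shape=(4,), dtype=np.float32)
        self.action_space = spaces.Box(-1.0, 1.0, shape=(2,), dtype=np.float32)

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self.state = self.np_random.uniform(-1, 1, size=4).astype(np.float32)
        return self.state, {}

    def step(self, action):
        # Stand-in "physics": calls to the real simulator (e.g. Elegant) go here.
        self.state[:2] = np.clip(self.state[:2] + 0.1 * action, -1, 1)
        reward = -float(np.abs(self.state[:2]).sum())
        terminated = bool(reward > -0.05)  # beam close enough to target
        return self.state, reward, terminated, False, {}

env = ToyBeamlineEnv()
model = SAC("MlpPolicy", env, verbose=0)
model.learn(total_timesteps=10_000)
```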
12. Physics Instrument Design with Reinforcement Learning
We present a case for the use of Reinforcement Learning (RL) in the design of physics instruments, as an alternative to gradient-based instrument-optimization methods (arXiv:2412.10237). Its applicability is demonstrated using two empirical studies: one on the longitudinal segmentation of calorimeters, and one on both the transverse segmentation and the longitudinal placement of trackers in a spectrometer. Based on these experiments, we propose an alternative approach that offers unique advantages over differentiable programming and surrogate-based differentiable design optimization methods. First, RL algorithms possess inherent exploratory capabilities, which help mitigate the risk of convergence to local optima. Second, this approach eliminates the necessity of constraining the design to a predefined detector model with fixed parameters. Instead, it allows for the flexible placement of a variable number of detector components and facilitates discrete decision-making. We then discuss a road map for extending this idea to the design of very complex instruments. The presented study sets the stage for a novel framework in physics instrument design, offering a scalable and efficient approach that can be pivotal for future projects such as the Future Circular Collider (FCC), where highly optimized detectors are essential for exploring physics at unprecedented energy scales.
Speaker: Shah Rukh Qasim (University of Zurich (CH))
13. Convolutional pile-up suppression in the ATLAS Global Trigger
We describe a pile-up (PU) suppression algorithm for the ATLAS Global Trigger based on convolutional neural networks. The network operates on cell towers, exploiting both cluster topology and $E_T$ to correct for the contribution of PU. The algorithm is optimised for firmware deployment, demonstrating high throughput and low resource usage. The small size of the input and the lightweight implementation enable a high degree of scalability and parallelisation. We benchmark the physics performance of our algorithm by reconstructing and calibrating small-$R$ central jets and comparing to a range of existing algorithms. Trigger rates and thresholds are estimated, with the CNN producing the lowest thresholds for central multi-jet, jet $H_T$ and $E_T^\text{miss}$ triggers. We apply these thresholds to an SM VBF $HH\rightarrow b\bar{b}b\bar{b}$ sample and find that the highest acceptance is obtained using our algorithm.
Speaker: Noah Clarke Hall (University College London)
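As an illustration of the kind of lightweight, fully convolutional regression the abstract describes, here is a toy network operating on an eta-phi grid of tower $E_T$ values; the architecture, grid size, and training targets are invented placeholders, not the ATLAS implementation.

```python
import torch
import torch.nn as nn

# Toy eta-phi grid of tower E_T values (1 input channel per "event").
towers = torch.rand(32, 1, 16, 16)
target = towers * 0.8  # placeholder PU-corrected towers

# Small fully convolutional net: same-size output, one correction per tower.
model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 8, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 1, kernel_size=1),
)

opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(200):
    loss = nn.functional.mse_loss(model(towers), target)
    opt.zero_grad(); loss.backward(); opt.step()
```

The fully convolutional structure is what makes firmware deployment tractable: each output tower depends only on a small local neighbourhood, so the computation parallelises naturally.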
15:30: Poster Session, 61/1-201 (Pas perdus)
17:30: Welcome Drink, 61/1-201 (Pas perdus)
EuCAIF joint session: FAIR and sustainable machine learning, Room 222/R-001
14. Session program: see the EuCAIF Indico page at https://indico.cern.ch/e/EuCAIF-FAIR-Sustainable-kickoff.
This session concerns FAIR and sustainable AI, co-organised with the EuCAIF Working Group WG3 (https://eucaif.org/activities/), with contributions from CERN and from experts in FAIR AI from the life sciences, followed by an interactive Q&A session.
15. Robotics & Reinforcement Learning
Speaker: Antonin Raffin (German Aerospace Center (DLR))
16. Anomaly Detection techniques applied to the Quality Control of new detector components
In High Energy Physics (HEP), new discoveries can be enabled by the development of new experiments and the construction of new detectors. Nowadays, many experimental projects rely on the deployment of new detection technologies to build large-scale detectors. The validation of these new technologies and their large-scale production require an extensive Quality Control effort.
In order to improve the reliability and efficiency of the Quality Control (QC) of new detector components, we propose a new framework based on advanced machine learning techniques. Our efforts focus on the visual inspection of such components to help prevent future failures and improve fabrication processes. Our tool combines two complementary algorithms, based on anomaly detection techniques and computer vision, to facilitate the identification of a wide range of defects. This framework has been tested in the context of the production of pixel modules for the new Inner Tracker (ITk) to be deployed in the ATLAS detector for the High-Luminosity upgrade. We will show the current development status and the successful integration into the QC procedure of the pixel modules produced in Japan.
Speaker: Louis Vaslin (KEK High Energy Accelerator Research Organization (JP))
17. Latest improvements to CATHODE
The search for physics beyond the Standard Model remains one of the primary focuses of high-energy physics. Traditional LHC analyses, though comprehensive, have yet to yield signs of new physics. Anomaly detection has emerged as a powerful tool to widen the discovery horizon, offering a model-agnostic way to enhance the sensitivity of generic searches that do not target any specific signal model. One of the leading methods, CATHODE (Classifying Anomalies THrough Outer Density Estimation, arXiv:2109.00546), is a two-step anomaly detection framework that constructs an in-situ background estimate using a generative model, followed by a classifier to isolate potential signal events.
We present the latest developments of the CATHODE method, aimed at increasing its robustness and broadening its applicability. These improvements expand its reach to new topologies, with new input variables covering all particles in the event.
Speaker: Chitrakshee Yede (Hamburg University (DE))
18. Weakly supervised signal detection for RPV SUSY multijet inclusive search
R-parity violating (RPV) SUSY introduces a wide variety of couplings, making it essential to search without limiting target channels and to cover signatures as broadly as possible. Among such signatures, multijet final states offer high inclusivity and are especially well suited for model-independent searches targeting RPV SUSY scenarios.
In this study, we develop a signal discrimination method based on Classification Without Labels (CWoLa), a weakly supervised learning framework. While CWoLa has been successfully applied to dijet resonance searches, extending it to multijet events presents unique challenges. These include the variable number of jets in each event and the broader, less distinct mass peaks caused by reduced mass reconstruction resolution, which prevent direct application of existing techniques.
In this presentation, we propose a model architecture and a loss function that utilize attention mechanisms to achieve permutation invariance, enabling the model to handle events with varying jet multiplicities naturally. Additionally, we present a tailored training-sample construction strategy designed to mitigate the specific difficulties of multijet events.
Speaker: Takane Sano (Kyoto University (JP))
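For background, the core CWoLa idea fits in a few lines: train a classifier to distinguish two mixed samples with different (unknown) signal fractions, using only mixture-level labels; the optimal such classifier is monotonically related to the signal-vs-background classifier. The one-dimensional toy below is illustrative; the contribution's attention-based multijet architecture is not reproduced here.

```python
import numpy as np
from sklearn.ensemble import HistGradientBoostingClassifier

rng = np.random.default_rng(1)

def mixed_sample(n, signal_fraction):
    """Background ~ N(0,1), signal ~ N(2,0.5); truth labels are NOT kept."""
    n_sig = rng.binomial(n, signal_fraction)
    bkg = rng.normal(0.0, 1.0, size=(n - n_sig, 1))
    sig = rng.normal(2.0, 0.5, size=(n_sig, 1))
    return np.vstack([bkg, sig])

# Two mixtures with different signal fractions (e.g. signal region vs sideband).
m1 = mixed_sample(20000, 0.10)
m2 = mixed_sample(20000, 0.01)

X = np.vstack([m1, m2])
y = np.concatenate([np.ones(len(m1)), np.zeros(len(m2))])  # mixture labels only
clf = HistGradientBoostingClassifier().fit(X, y)
# clf now ranks events by signal-likeness without ever seeing true labels.
```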
19. Supervised contrastive learning for streamlined analysis & anomaly detection
Contrastive learning (CL) has emerged as a powerful technique for constructing low-dimensional yet highly expressive representations of complex datasets, most notably images. Augmentation-based CL — a fully self-supervised strategy — has been the dominant paradigm in particle physics applications, encouraging a model to learn useful features from input data by promoting insensitivity to irrelevant features (e.g. rotations). In this talk, we present recent work applying the supervised contrastive learning (SCL) paradigm to learn low-dimensional embeddings of jets. SCL explicitly uses class labels in its training objective, making it a natural choice for particle physics where ML algorithms are typically trained on Monte Carlo simulations with unambiguous labels. We show that SCL learns well-structured embeddings for jets that can be used very effectively for downstream tasks such as anomaly detection or traditional supervised analysis. We also discuss preliminary work towards promoting domain adaptation capabilities in the embedding models, wherein the effects of known discrepancies between simulated training data and real LHC data are mitigated in the learned space.
Speaker: Samuel Kai Bright-Thonney (Massachusetts Inst. of Technology (US))
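A minimal sketch of the supervised contrastive (SupCon) objective referenced above, following Khosla et al. (arXiv:2004.11362): embeddings of the same class are pulled together, all others pushed apart. The toy embeddings and class labels stand in for jet representations and simulation truth labels.

```python
import torch
import torch.nn.functional as F

def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss on (N, D) embeddings with (N,) labels."""
    z = F.normalize(embeddings, dim=1)      # L2-normalize to the unit sphere
    sim = z @ z.T / temperature             # pairwise scaled similarities
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, -1e9)  # exclude self-comparisons
    # Positives: same label, excluding self.
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos_counts = pos_mask.sum(dim=1).clamp(min=1)  # avoid division by zero
    loss = -(log_prob * pos_mask).sum(dim=1) / pos_counts
    return loss.mean()

# Toy "jets": 128 embeddings of dimension 16 with 3 classes (e.g. q / g / top).
emb = torch.randn(128, 16, requires_grad=True)
labels = torch.randint(0, 3, (128,))
loss = supcon_loss(emb, labels)
loss.backward()
```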
20. Contrastive Normalizing Flows for Uncertainty-Aware Parameter Estimation
The fields of high-energy physics (HEP) and machine learning (ML) converge on the challenge of uncertainty-aware parameter estimation in the presence of data distribution distortions, described in their respective languages as systematic uncertainties and domain shifts. We present a novel approach based on Contrastive Normalizing Flows (CNFs), which achieved top performance on the HiggsML Uncertainty Challenge. Building on the insight that a binary classifier can approximate the model parameter likelihood ratio $\frac{P(x_i|\theta_1)}{P(x_i|\theta_2)}$, we address the practical limitations of expressivity and the high cost of simulating high-dimensional parameter grids by embedding data and parameters in a learned CNF mapping. This mapping models a unique and tunable contrastive distribution that enables robust classification under shifted data distributions. Through a combination of theoretical analysis and empirical evaluations, we show that CNFs, when coupled with a classifier and proper statistics, provide principled parameter estimation and uncertainty quantification through robust classification.
Context: this is the method paper for a top-performing solution to the HiggsML Uncertainty Challenge (https://arxiv.org/abs/2410.02867). It will also be presented at the Fair Universe HiggsML Uncertainty CERN workshop.
Speaker: Ibrahim Elsharkawy (University of Illinois Urbana-Champaign)
21. Anomaly preserving contrastive neural embeddings for end-to-end model-independent searches at the LHC
Anomaly detection — identifying deviations from Standard Model predictions — is a key challenge at the Large Hadron Collider due to the size and complexity of its datasets. This is typically addressed by transforming high-dimensional detector data into lower-dimensional, physically meaningful features. We tackle feature extraction for anomaly detection by learning powerful low-dimensional representations via contrastive neural embeddings. This approach preserves potential anomalies indicative of new physics and enables rare signal extraction using novel machine learning-based statistical methods for signal-independent hypothesis testing. We compare supervised and self-supervised contrastive learning methods, for both MLP- and Transformer-based neural embeddings, trained on the kinematic observables of physics objects in LHC collision events. The learned embeddings serve as input representations for signal-agnostic statistical detection methods in inclusive final states, achieving a more than tenfold improvement in detection performance over the original feature representation, and up to a fourfold improvement over physics-informed selections of the same dimensionality. We achieve significant improvements in discovery power for both rare new physics signals and rare Standard Model processes across diverse final states, demonstrating the method's applicability for efficiently searching for diverse signals simultaneously. We study the impact of architectural choices, contrastive loss formulations, supervision levels, and embedding dimensionality on anomaly detection performance. We show that the optimal representation for background classification does not always maximize sensitivity to new physics signals, revealing an inherent trade-off between background structure preservation and anomaly enhancement. Our findings demonstrate that foundation models for particle physics data hold significant potential for improving neural feature extraction, enabling scientific discovery in inclusive final states at collider experiments.
Speaker: Kyle Sidney Metzger
22. Integrating Energy Flow Networks with Jet Substructure Observables for Enhanced Jet Quenching Studies
The phenomenon of jet quenching, a key signature of the Quark-Gluon Plasma (QGP) formed in heavy-ion (HI) collisions, provides a window of insight into the properties of the primordial liquid. In this study, we evaluate the discriminating power of Energy Flow Networks (EFNs), enhanced with substructure observables, in distinguishing between jets stemming from proton-proton (pp) collisions and jets stemming from HI collisions. This work is a crucial step towards separating HI jets that were quenched from those with little or no modification by the interaction with the QGP on a jet-by-jet basis. We trained simple EFNs and further enhanced them by incorporating jet observables such as N-subjettiness and Energy Flow Polynomials (EFPs). Our primary objective is to assess the effectiveness of these approaches in the context of jet quenching, exploring new phenomenological avenues by combining these models with various encodings of jet information. Initial evaluations using Linear Discriminant Analysis (LDA) set a performance baseline, which is significantly improved upon by simple Deep Neural Networks (DNNs), capable of capturing the non-linear relations expected in the data. Integrating both EFPs and N-subjettiness observables into EFNs results in the most performant model for this task, achieving state-of-the-art ROC AUC values of approximately 0.84. This performance is noteworthy given that both medium response and underlying-event contamination effects on the jet are taken into account. These results underscore the potential of combining EFNs with jet substructure observables to advance jet quenching studies and adjacent areas, paving the way for deeper insights into the properties of the QGP. Results on a variation of EFNs, Moment EFNs (MEFNs), which can achieve comparable performance with a more manageable, and in turn more interpretable, latent space, will also be presented.
Speaker: João A. Gonçalves (LIP - IST)
23. Automatizing the search for mass resonances using BumpNet
The search for resonant mass bumps in invariant-mass histograms is a fundamental approach for uncovering Beyond the Standard Model (BSM) physics at the LHC. Traditional, model-dependent analyses that utilize this technique, such as those conducted using data from the ATLAS detector, often require substantial resources, which prevents many final states from being explored. Modern machine learning techniques, such as normalizing flows and autoencoders, have facilitated such analyses by providing various model-agnostic approaches; however, many methods still depend on background and signal assumptions, which decreases their generalizability.
We present BumpNet, a convolutional neural network (CNN) that predicts log-likelihood significance values in each bin of smoothly falling invariant-mass histograms, enhancing the search for resonant mass bumps. This technique enables a model-independent search of many final states without the need for traditional background estimation, making BumpNet a powerful tool for exploring the many unsearched areas of phase space while saving analysis time. Trained on a dataset consisting of realistic smoothly falling data and analytical functions, the network has produced encouraging results, such as predicting the correct significance of the Higgs boson discovery, agreeing with a previous ATLAS dilepton resonance search, and succeeding in realistic BSM scenarios. We are now training and optimizing BumpNet using ATLAS Run 2 Monte Carlo data, with the ultimate goal of performing general searches on real ATLAS data. These encouraging results highlight BumpNet's potential to accelerate the discovery of new physics.
Related work: JHEP 02 (2025) 122.
Speaker: Ethan James Meszaros (Université de Montréal (CA))
24. Reinforcement Learning for background determination in particle physics
Experimental studies of b-hadron decays face significant challenges due to a wide range of backgrounds arising from the numerous possible decay channels with similar final states. For a particular signal decay, ascertaining the most relevant background processes necessitates a detailed analysis of final-state particles, potential misidentifications, and kinematic overlaps, which, due to computational limitations, is restricted to the simulation of only the most relevant backgrounds. Moreover, this process typically relies on the physicist's intuition and expertise, as no systematic method exists. This work presents a novel approach that utilises Reinforcement Learning to overcome these challenges by systematically determining the critical backgrounds affecting b-hadron decay measurements. Our method further incorporates advanced artificial intelligence models and techniques to enhance background identification accuracy: a transformer model is employed to handle token sequences representing decays, a Graph Neural Network is used for predicting branching ratios (BRs), and genetic algorithms are utilised as an auxiliary tool to efficiently explore the action space, among others.
Speaker: Guillermo Hijano Mendizabal (University of Zurich (CH))
25. Evaluating Two-Sample Tests for Validating Generators in Precision Sciences
Deep generative models have become powerful tools for alleviating the computational burden of traditional Monte Carlo generators in producing high-dimensional synthetic data. However, validating these models remains challenging, especially in scientific domains requiring high precision, such as particle physics. Two-sample hypothesis testing offers a principled framework to address this task. We propose a robust methodology to assess the performance and computational efficiency of various metrics for two-sample testing, with a focus on high-dimensional datasets. Our study examines tests based on univariate integral probability measures, namely the sliced Wasserstein distance, the mean of the Kolmogorov-Smirnov statistics, and the sliced Kolmogorov-Smirnov statistic. Additionally, we consider the unbiased Fréchet Gaussian Distance and the Maximum Mean Discrepancy. Finally, we include the New Physics Learning Machine, an efficient classifier-based test leveraging kernel methods. Experiments on both synthetic and realistic data show that one-dimensional projection-based tests demonstrate good sensitivity with a low computational cost. In contrast, the classifier-based test offers higher sensitivity at the expense of greater computational demands.
This analysis provides valuable guidance for selecting the appropriate approach, whether prioritizing efficiency or accuracy. More broadly, our methodology provides a standardized and efficient framework for model comparison and serves as a benchmark for evaluating other two-sample tests.
Speaker: Samuele Grossi (Università degli studi di Genova & INFN sezione di Genova)
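Two of the projection-based statistics studied here are easy to sketch with standard tools. The toy comparison below uses random one-dimensional projections to compute a mean Kolmogorov-Smirnov statistic and a sliced Wasserstein distance; sample sizes, dimensionality, and data are illustrative, not the paper's benchmarks.

```python
import numpy as np
from scipy.stats import ks_2samp, wasserstein_distance

rng = np.random.default_rng(0)
# Reference sample vs "generated" sample (toy 10-dimensional data).
ref = rng.normal(0.0, 1.0, size=(5000, 10))
gen = rng.normal(0.02, 1.0, size=(5000, 10))  # slightly shifted generator

def sliced_stats(a, b, n_projections=100):
    """Project both samples on random unit vectors and average 1D statistics."""
    d = a.shape[1]
    ks_vals, w_vals = [], []
    for _ in range(n_projections):
        v = rng.normal(size=d)
        v /= np.linalg.norm(v)                 # random direction on the sphere
        pa, pb = a @ v, b @ v
        ks_vals.append(ks_2samp(pa, pb).statistic)
        w_vals.append(wasserstein_distance(pa, pb))
    return np.mean(ks_vals), np.mean(w_vals)

mean_ks, sliced_w = sliced_stats(ref, gen)
print(f"mean sliced KS = {mean_ks:.4f}, sliced Wasserstein = {sliced_w:.4f}")
```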
26. MLOps infrastructure & best practices
Speaker: Ricardo Rocha (CERN)
28. DINAMO: Dynamic and INterpretable Anomaly MOnitoring for Large-Scale Particle Physics Experiments
Ensuring reliable data collection in large-scale particle physics experiments demands Data Quality Monitoring (DQM) procedures to detect possible detector malfunctions and preserve data integrity. Traditionally, this resource-intensive task has been handled by human shifters who may struggle with frequent changes in operational conditions. Instead, to simplify and automate the shifters' work, we present DINAMO: a dynamic and interpretable anomaly detection framework for large-scale particle physics experiments in time-varying settings [1]. Our approach constructs evolving histogram templates with built-in uncertainties, featuring both a statistical variant - extending the classical Exponentially Weighted Moving Average (EWMA) - and a machine learning (ML)-enhanced version that leverages a transformer encoder for improved adaptability and accuracy.
Both approaches are studied using comprehensive synthetic datasets that emulate key features of real particle physics detectors. Validations on a large number of such datasets demonstrate the high accuracy, adaptability, and interpretability of these methods, with the statistical variant being commissioned in the LHCb experiment at the Large Hadron Collider, underscoring its real-world impact.
[1] A. Gavrikov, J. García Pardiñas, and A. Garfagnini, DINAMO: Dynamic and INterpretable Anomaly MOnitoring for Large-Scale Particle Physics Experiments (2025). https://arxiv.org/abs/2501.19237
Speaker: Arsenii Gavrikov
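A simplified sketch of the statistical idea underlying DINAMO's classical variant: an exponentially weighted moving-average (EWMA) template with per-bin uncertainties, against which each new histogram is scored. The update rule, score, and data below are a toy reconstruction for illustration, not the published algorithm.

```python
import numpy as np

class EWMATemplate:
    """Running histogram template with per-bin uncertainty."""
    def __init__(self, n_bins, alpha=0.1):
        self.alpha = alpha            # adaptation speed
        self.mean = np.zeros(n_bins)  # EWMA of bin contents
        self.var = np.ones(n_bins)    # EWMA of squared residuals

    def update(self, hist):
        resid = hist - self.mean
        self.mean += self.alpha * resid
        self.var = (1 - self.alpha) * (self.var + self.alpha * resid**2)

    def anomaly_score(self, hist):
        """Reduced chi^2 of a new histogram against the running template."""
        pulls = (hist - self.mean) / np.sqrt(self.var + 1e-12)
        return float(np.mean(pulls**2))

# Usage: feed per-run normalized histograms in time order
# (early scores are dominated by template burn-in).
template = EWMATemplate(n_bins=50)
for run_hist in np.random.default_rng(0).poisson(100, size=(200, 50)) / 100.0:
    score = template.anomaly_score(run_hist)  # flag before updating
    template.update(run_hist)
```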
29. Autoencoder-based time series anomaly detection for ATLAS Liquid Argon calorimeter data quality monitoring
The ATLAS detector at the LHC has comprehensive data quality monitoring procedures for ensuring high quality physics analysis data. This contribution introduces a long short-term memory (LSTM) autoencoder-based algorithm designed to identify detector anomalies in ATLAS liquid argon calorimeter data. The data is represented as a multidimensional time series, corresponding to statistical moments of energy cluster properties. The model is trained in an unsupervised fashion on good-quality data and is evaluated to detect anomalous intervals of data-taking. The liquid argon noise burst phenomenon is used to validate the approach. The potential of applying such an algorithm to detect arbitrary transient calorimeter detector issues is discussed. The work described here is publicly available at https://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/PUBNOTES/ATL-DAPR-PUB-2024-002/.
Speaker: Vilius Čepaitis (Université de Genève (CH))
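A schematic of the approach: an LSTM autoencoder trained to reconstruct good-quality multivariate time series, with reconstruction error serving as the anomaly score. Shapes, layer sizes, and data are toy placeholders for the cluster-moment series described above.

```python
import torch
import torch.nn as nn

class LSTMAutoencoder(nn.Module):
    """Encode a multivariate time series to one vector and reconstruct it."""
    def __init__(self, n_features, latent_dim=16):
        super().__init__()
        self.encoder = nn.LSTM(n_features, latent_dim, batch_first=True)
        self.decoder = nn.LSTM(latent_dim, latent_dim, batch_first=True)
        self.head = nn.Linear(latent_dim, n_features)

    def forward(self, x):            # x: (batch, time, features)
        _, (h, _) = self.encoder(x)  # h: (1, batch, latent)
        seq = h[-1].unsqueeze(1).repeat(1, x.size(1), 1)  # latent per time step
        out, _ = self.decoder(seq)
        return self.head(out)

# Toy "good data": smooth multivariate series standing in for monitoring moments.
x = torch.cumsum(0.01 * torch.randn(64, 50, 8), dim=1)

model = LSTMAutoencoder(n_features=8)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(200):
    loss = nn.functional.mse_loss(model(x), x)
    opt.zero_grad(); loss.backward(); opt.step()
# At evaluation time, intervals with large reconstruction error are flagged.
```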
30. Challenges Deploying a Hybrid PVFinder Algorithm for Primary Vertex Reconstruction in LHCb's GPU-Resident HLT1
The PVFinder algorithm employs a hybrid deep neural network (DNN) approach to reconstruct primary vertices (PVs) in proton-proton collisions at the LHC, addressing the complexities of high pile-up environments in LHCb and ATLAS experiments. By integrating fully connected layers with a UNet architecture, PVFinder’s end-to-end tracks-to-hist DNN processes charged track parameters to predict PV positions, achieving efficiencies above 97% and false positive rates as low as 0.03 per event in LHCb, surpassing conventional heuristic methods. We present the current status of embedding PVFinder into LHCb’s Allen framework, a fully software-based, GPU-optimized first-level trigger system for Run 3, handling 30 MHz of beam crossing data. Key challenges include optimizing computational efficiency and model integration within Allen’s real-time constraints. For ATLAS, PVFinder matches the Adaptive Multi-Vertex Finder’s efficiency while improving vertex-vertex resolution (0.23–0.37 mm vs. 0.76 mm). Future efforts target ATLAS ACTS integration and graph neural network enhancements.
Speaker: Mohamed Elashri (University of Cincinnati)
15:30: Poster Session, 61/1-201 (Pas perdus)
31. SMARTHEP Meets Industry @CERN (https://indico.cern.ch/event/1518429/contributions/6431628/), 500/1-001 (Main Auditorium)
19:30: Apero with the SMARTHEP network and Confindustria Piedmont, 61/1-201 (Pas perdus)
32. Summary of Calo Challenge
Claudius Krause studied physics in Cottbus, Munich, and Lausanne. He received his doctorate in 2016 from Ludwig-Maximilians-Universität Munich, working on “Higgs Effective Field Theories - Systematics and Applications”. He was a postdoctoral researcher at IFIC Valencia in Spain, at Fermilab and Rutgers University in the USA, and at the University of Heidelberg in Germany. About six years ago, he started working on the application of machine learning techniques to particle physics. He joined HEPHY as junior group leader for “Machine Learning for Particle Physics” in October 2023.
Speaker: Dr Claudius Krause (HEPHY Vienna (ÖAW))
33. Robust Model Selection for Deep Learning
Machine learning (ML) models are increasingly used in high-energy physics. However, the selection and training of these models frequently involve human intervention, extensive hyperparameter tuning, and consideration of data changes. These challenges become particularly pronounced when developing models for automated pipelines or fault-tolerant systems. We introduce a novel, automated approach for assessing and selecting robust ML models, specifically deep neural networks, where robustness is defined as the degree of variability in performance loss across different training samples and initialisations.
Our method evaluates the variability in losses for multiple model instances and incorporates a meta-algorithm for selecting high-performing yet robust models.
We apply the model selection algorithm to neural networks with a few convolutional and fully connected layers (with at most 30,000 parameters) for two regression problems. Overall, we systematically analysed 6,912 model architectures, training over 40,000 model instances, to find out how training sample size and weight initialisation affect robustness. We demonstrate that the selected models outperform neural architecture search (NAS) utilising Bayesian optimisation in both robustness and performance, offering an effective strategy for reliable model deployment in challenging environments. The proposed approach is model-agnostic and suitable for integration into AutoML pipelines, an important step toward automated, scalable, and trustworthy ML in HEP and beyond.
Speaker: Dr Alexey Boldyrev
34. Automated Model Building with Reinforcement Learning: An Application to Neutrino Model Building
To explain Beyond the Standard Model phenomena, a physicist has many choices to make regarding new fields, internal symmetries, and charge assignments, which collectively create an enormous space of possible models. We describe the development and findings of an Autonomous Model Builder (AMBer), which uses Reinforcement Learning (RL) to efficiently find models satisfying specified discrete flavor symmetries and particle content. Aside from valiant efforts by theorists following their intuition, these theory spaces are not deeply explored, owing to the vast number of possibilities and the time-consuming nature of building and fitting a model for a given symmetry group and particle assignment. The lack of any guarantee of continuity or differentiability prevents the application of typical machine learning approaches. We describe an RL software pipeline that interfaces with newly optimized versions of physics software, and apply it to the task of neutrino model building. Our agent learns to find fruitful regions of theory space, uncovering new models in commonly analyzed symmetry groups and exploring previously unexamined symmetries for the first time.
Speaker: Jake Rudolph
35. Scalable Multi-Task Learning for Event Reconstruction with Heterogeneous Graph Neural Networks
The growing luminosity frontier at the Large Hadron Collider is complicating the reconstruction of heavy-hadron collision events at both the data acquisition and offline levels, with rising particle multiplicities challenging stringent latency and storage requirements. This talk presents significant architectural advancements in Graph Neural Networks (GNNs) aimed at enhancing event reconstruction in high-energy physics. These advancements are implemented and evaluated within the context of expanding the deep full event interpretation (DFEI) framework [García Pardiñas, J., et al., Comput. Softw. Big Sci. 7 (2023) 1, 12], which targets the hierarchical reconstruction of B-hadron decays within the hadronic collision environment of the LHCb experiment.
Specifically, we introduce a novel end-to-end Heterogeneous Graph Neural Network (HGNN) architecture, which allows for unique representations for several particle collision relations and features integrated edge and node pruning layers. The HGNN is trained using a multi-task paradigm, which not only significantly enhances the B-hadron reconstruction performance but also simultaneously enables primary vertex association and graph pruning tasks within a single, unified model. We will discuss the performance improvements achieved, quantifying both the reconstruction accuracy and the effectiveness of the pruning. Furthermore, we propose a weighted message passing scheme designed to improve the model's inference time scalability with minimal performance loss, a key consideration for deployment in high-throughput environments.
Speaker: William Sutcliffe (University of Zurich (CH))
36. PQuant: A Tool for End-to-End Hardware-Aware Model Compression
Machine learning model compression methods such as pruning and quantization are critical for enabling efficient inference on resource-constrained hardware. Compression methods are typically developed independently, and while some libraries attempt to unify them under a common interface, they lack integration with hardware deployment frameworks like hls4ml. To bridge this gap, we present PQuant, a Python library that streamlines the training and compression of machine learning models. PQuant offers an interface for applying diverse pruning and quantization methods, making it accessible to users without deep expertise in compression while still supporting advanced configuration. Integration with hls4ml is ongoing, which will enable deployment of compressed models to FPGA-based accelerators. This will make PQuant a practical tool both for researchers exploring compression strategies and for engineers aiming for efficient inference on edge devices and custom hardware platforms.
Speaker: Roope Oskari Niemi
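As a flavour of the kind of compression such tooling wraps, here is plain magnitude pruning with PyTorch's built-in pruning utilities; this sketch only illustrates the underlying technique and is not PQuant's own API.

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

# A small model standing in for one that compression tooling would process.
model = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 10))

# Unstructured L1 (magnitude) pruning: zero out 50% of each layer's weights.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.5)

# Make pruning permanent (folds the mask into the weight tensor).
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.remove(module, "weight")

zeros = sum((m.weight == 0).sum().item() for m in model.modules()
            if isinstance(m, nn.Linear))
total = sum(m.weight.numel() for m in model.modules() if isinstance(m, nn.Linear))
print(f"global sparsity: {zeros / total:.0%}")
```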
38. Speaker: Michał Mazurek (National Centre for Nuclear Research (PL))