AI RCS Strategy Workshop
The AI RCS Strategy Workshop is a community-driven initiative within CERN’s Research and Computing Sector (RCS). Its goal is to collect and discuss project proposals from the community, following the RCS AI strategy presented at a general meeting in July.
Project descriptions can be submitted as abstracts. Additional information (proponents, corresponding activity area, goals & timeline, resources, etc.) is required, as indicated in the submission form.
Cutting Edge AI for Offline Data Processing 500/1-001 - Main Auditorium
2
Neutrino Vision: AI for Neutrino Event Reconstruction
We propose to leverage artificial intelligence to advance event reconstruction in neutrino detectors. The first focus of the project is on atmospheric neutrino interactions in liquid argon detectors such as DUNE. These events often involve invisible particles like neutrons, yet kinematic correlations between visible and invisible final states enable robust reconstruction of the energy and direction of the primary neutrino. Building on our recent theoretical study (https://arxiv.org/abs/arXiv:2405.15867), which demonstrated promising results with a simple multi-layer perceptron, and on experimental benchmarks from MicroBooNE (https://arxiv.org/abs/2504.17758), we will develop AI-based methods to enhance energy and angular resolution in liquid argon time projection chambers.
In the second part, we turn to neutrino telescopes such as IceCube and KM3NeT. Traditional analyses classify events only as track-like ($\nu_\mu$ CC) or shower-like ($\nu_e$ CC, $\nu_\tau$ CC, NC). While IceCube has already applied machine learning to improve flavor identification, we aim to go further by developing AI techniques for a more detailed classification of event substructure. In particular, we will explore discrimination between neutrino and antineutrino interactions, charged- and neutral-current events, as well as signatures of physics beyond the Standard Model.
Together, these studies will establish AI as a powerful framework for extracting maximal information from the diverse range of neutrino observatories.
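As an illustration of the kind of regression involved, here is a minimal sketch (PyTorch) that maps a vector of visible final-state kinematic features to the neutrino energy and direction with a small multi-layer perceptron; the feature and target counts are illustrative assumptions, not the variables used in the cited studies.

import torch
import torch.nn as nn

n_features = 12   # e.g. visible energy, summed momentum components, multiplicities (assumed)
n_targets = 4     # neutrino energy plus three direction cosines (assumed)

model = nn.Sequential(
    nn.Linear(n_features, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, n_targets),
)
optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

def training_step(x, y):
    """One gradient step on a batch of (visible features, truth targets)."""
    optimiser.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimiser.step()
    return loss.item()

# example call with dummy tensors:
# training_step(torch.randn(256, n_features), torch.randn(256, n_targets))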
Speaker: Joachim Kopp (CERN) -
3
Object Identification using AI/DL trained on data
The usage of ML techniques in classification problems has long since become a successful standard, widely used across measurements and searches at colliders. However, most applications rely on simulated samples for training, while there is little or no experience with training based on data. While simulated samples nowadays provide very accurate modelling of physics signals and backgrounds, they do not perform equally well for backgrounds related to misidentified objects. The idea of this project is to investigate applications of trainings based on data samples, in particular for physics-object identification.
The project will focus on muon identification as a first application. In the case of muons it is possible to isolate sufficiently pure data samples of real and misidentified muons by requiring pairs of candidate muons with dilepton invariant mass near the Z-boson peak and either opposite-sign or same-sign charge. A data-trained muon identification has the potential to enable high QCD-background rejection with small loss in efficiency, enhancing the precision of W-boson measurements in the muon decay channel, such as the W charge asymmetry and the W-boson mass. The project will further study the potential application of data training to electron and b-jet identification, where the main challenge is the selection of relatively pure samples of misidentified objects with reduced contamination from real electrons or b-jets.
This project can be regarded as spanning a multi-dimensional research phase space: the data dimension will tackle the feasibility of extracting sufficiently pure and balanced training data for the required signal and background signatures from ATLAS data, the second dimension will cover the search for optimal classification architectures, and finally a parameter-optimisation step should maximise the model performance.
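A minimal sketch of the data-driven sample construction described above: probe muons from opposite-sign pairs near the Z peak are labelled as real, probe muons from same-sign pairs as misidentified, and a classifier is trained on the probe features. The array names, the 10 GeV mass window, and the choice of a gradient-boosted classifier are assumptions for illustration.

import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

def build_training_set(mll, charge_product, probe_features):
    """mll: dilepton mass [GeV]; charge_product: q1*q2; probe_features: (N, n_feat)."""
    in_z_window = np.abs(mll - 91.2) < 10.0
    real = in_z_window & (charge_product < 0)   # opposite-sign: real-muon enriched
    fake = in_z_window & (charge_product > 0)   # same-sign: misidentified-muon enriched
    X = np.concatenate([probe_features[real], probe_features[fake]])
    y = np.concatenate([np.ones(real.sum()), np.zeros(fake.sum())])
    return X, y

# X, y = build_training_set(mll, q1 * q2, probe_features)
# clf = GradientBoostingClassifier().fit(X, y)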
Speaker: Stefano Camarda (CERN) -
4
Transformer Model for Global PID
Particle identification is a major task in any high energy physics experiment. With the challenging environments encountered in Runs 3 & 4, particle identification for ALICE TPC tracks has become a machine-learning-based task, showing significant improvements and flexibility compared to previous approaches. This project aims to extend this idea in an experiment-agnostic way.
All LHC experiments utilize different detection systems for various purposes. Whether improving momentum resolution, energy measurements, or the detection of weakly interacting particles, all detectors share a common goal: separating and identifying particles. With many detectors organized in close spatial proximity to each other, their information is often correlated and can be combined. Bayesian PID is a classic way of approaching this problem, but it still relies on individual identifications made within each detection system. Multi-headed attention mechanisms can, in contrast, make use of a flexible set of input information, which can even be incomplete or partial. Extending this thought, information from run conditions, detector conditions, or even LHC-specific information can be passed to a global machine learning algorithm which combines the information and produces a PID prediction as output. Concretely, the input and output can be formulated as a set of tokens, like
Input: {(ITS, pT = ...), (ITS, eta = 0), ..., (TPC, dE/dx = ...), ..., (FT0, occupancy = ...), ..., (LHC, beamtype = ...)}
Output: {(PID = [0-N]), (Quality = [0,1]), ...}
This approach can be implemented with minimal effort on Monte Carlo to prove the concept. It can then be extended in a data-driven way using cleaned environments, fine-tuning the model with real data for the use case (using e.g. V0s, high-pT particles, cosmic tracks). A first step towards such a project was investigated within the collaboration (https://link.springer.com/article/10.1140/epjc/s10052-024-13047-3) but never reached production-level application, as refinement of the technique and quality assurance are needed.
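A minimal sketch (PyTorch) of how such a token set could be fed to a multi-headed attention model: each (source, value) pair is embedded, missing detectors are simply absent and masked, and the pooled embeddings feed a PID head and a quality head. Vocabulary size, embedding width and head counts are illustrative assumptions.

import torch
import torch.nn as nn

class GlobalPID(nn.Module):
    def __init__(self, n_sources=32, d_model=64, n_classes=10):
        super().__init__()
        self.source_emb = nn.Embedding(n_sources, d_model)   # ITS, TPC, FT0, LHC, ...
        self.value_proj = nn.Linear(1, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=3)
        self.pid_head = nn.Linear(d_model, n_classes)         # PID = [0..N]
        self.quality_head = nn.Linear(d_model, 1)             # Quality in [0, 1]

    def forward(self, source_ids, values, pad_mask):
        # source_ids: (B, T) int; values: (B, T, 1) float; pad_mask: (B, T) bool, True = padding
        tokens = self.source_emb(source_ids) + self.value_proj(values)
        encoded = self.encoder(tokens, src_key_padding_mask=pad_mask)
        pooled = encoded.masked_fill(pad_mask.unsqueeze(-1), 0).sum(1) / (~pad_mask).sum(1, keepdim=True)
        return self.pid_head(pooled), torch.sigmoid(self.quality_head(pooled))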
Speaker: Christian Sonnabend (CERN, Heidelberg University (DE)) -
5
Unsupervised Object Identification for Particle-Detector Data
Assigning detector signals to individual particle objects is a central yet highly complex pattern-recognition problem in high-energy physics. Although conceptually similar to object detection in images—a domain where deep learning excels—such methods are still rarely applied to hit clustering in particle detectors.
Within our project, we investigated the main challenges of using deep learning for object identification and hit clustering for particle-detector systems and developed a method to overcome them using an unsupervised-learning approach and an Implicit Rank Minimizing Autoencoder. To evaluate and compare its performance against both classical techniques, such as Hough transformations, and other deep-learning approaches, we developed ScCLEVR, a flexible training and validation framework. ScCLEVR covers a range of reconstruction tasks at varying levels of complexity, including straight-line track finding in tracking detectors, ring reconstruction in RICH detectors, and shower clustering in calorimeter data, to encourage architecture sharing and comparison across tasks.
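For orientation, a minimal sketch (PyTorch) of an Implicit Rank-Minimizing Autoencoder: a plain autoencoder with a few extra linear, activation-free layers between encoder and decoder, whose product of weight matrices implicitly penalises the rank of the latent code. The layer sizes are assumptions and not those used in the project.

import torch.nn as nn

class IRMAE(nn.Module):
    def __init__(self, n_inputs, latent_dim=32, n_linear=4):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_inputs, 256), nn.ReLU(),
                                     nn.Linear(256, latent_dim))
        # extra linear layers without activations drive down the effective rank of the latent code
        self.linear_stack = nn.Sequential(*[nn.Linear(latent_dim, latent_dim)
                                            for _ in range(n_linear)])
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                                     nn.Linear(256, n_inputs))

    def forward(self, x):
        return self.decoder(self.linear_stack(self.encoder(x)))

# trained with a plain reconstruction loss, e.g. nn.MSELoss()(model(x), x)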
ScCLEVR shall be made publicly available to foster collaboration and accelerate AI-driven innovation for the common task of object identification in particle-detector data.
Speaker: Dr Thomas Poschl (CERN) -
6
AI/DL-based track reconstruction in non-Standard applications
Track reconstruction in high-energy physics experiments makes use of algorithms based on least-squares techniques such as the Kalman Filter (KF) or the Global Chi2 fitter for track fitting, and the Combinatorial Kalman Filter (CKF) for track finding. These approaches exploit the fact that many sources of uncertainty can be approximated as normally distributed, and they perform extremely well for the bulk of reconstructed tracks. In such regions of phase space, classical methods already achieve close-to-optimal performance, leaving little room for improvement from alternative approaches.
However, there are important classes of tracks where these assumptions break down and reconstruction performance remains limited. Examples include electrons undergoing significant Bremsstrahlung, low-momentum particles that curl and scatter multiple times in the tracker, and exotic signatures such as disappearing or displaced tracks targeted by long-lived particle searches. In these cases, the hit patterns deviate from the Gaussian regime: energy loss and scattering effects produce heavy-tailed residual distributions, while multiple turns and large angular deflections challenge conventional seeding strategies and the $\chi^{2}$-based pruning of the CKF. Handling these tracks often requires loosened cuts in both seeding and track finding, all of which increase combinatorics and computational cost without fully recovering efficiency.
Recent developments in machine learning, and in particular graph neural networks (GNNs) and LLM-based track-finding approaches, provide new opportunities to improve track finding in precisely these difficult scenarios. The GNN demonstrator for the ATLAS ITk has shown promising efficiency for electrons, retaining hits beyond strong scatter points that would otherwise be discarded, without particular tuning, and could possibly capture the global context of looping low-transverse-momentum trajectories. Similarly, learned models can target disappearing or displaced tracks by reasoning over unconventional hit patterns, rather than relying on rigid seed definitions. By complementing or replacing parts of the classical reconstruction chain, AI/DL methods have the potential to improve physics reach in challenging regions of phase space, while also reducing algorithmic complexity and the associated computational overhead.
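A minimal sketch (PyTorch) of the edge-classification step underlying GNN track finding: hits are nodes, candidate hit-to-hit connections are edges, and the network scores each edge as same-track or not. The feature dimensionality and network sizes are illustrative assumptions.

import torch
import torch.nn as nn

class EdgeClassifier(nn.Module):
    def __init__(self, n_hit_features=3, hidden=64):
        super().__init__()
        self.node_net = nn.Sequential(nn.Linear(n_hit_features, hidden), nn.ReLU())
        self.edge_net = nn.Sequential(nn.Linear(2 * hidden, hidden), nn.ReLU(),
                                      nn.Linear(hidden, 1))

    def forward(self, hit_features, edge_index):
        # hit_features: (n_hits, n_hit_features); edge_index: (2, n_edges) with hit indices
        h = self.node_net(hit_features)
        src, dst = edge_index
        edge_input = torch.cat([h[src], h[dst]], dim=1)
        return torch.sigmoid(self.edge_net(edge_input)).squeeze(-1)   # edge score in [0, 1]

# edges kept above a working point are then segmented into track candidates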
Speaker: Benjamin Huth (CERN) -
7
End-to-end track reconstruction
Recent years have seen the rise of the first close-to-competitive DL-based track-finding algorithms, most prominently through the use of graph neural networks (GNNs). GNNs have become the first AI/DL architecture that comes close in physics performance to classical algorithms, effectively exploiting the capabilities of accelerators, mainly GPUs but in recent research also FPGAs. However, the GNN architecture has certain drawbacks and limitations: the current GNN-based architecture requires significant pre- and postprocessing steps, most prominently the graph building at input and the graph segmentation at output, which themselves are bound to ad-hoc and often very experiment-specific software. GNNs also lack an internal representation of a ‘track‘: the desired track-like outputs require significant postprocessing (as the GNN operates on edges), and concepts such as shared hits on tracks are not easily integrated.
Finally, current GNN models are built on three-dimensional point clouds and show poor performance on measurement technologies that constrain only one measurement dimension (such as strip or straw detectors). A further missing step in current GNN-based models is the track-parameter regression, which is, in general, still done using least-squares estimators or derivatives thereof. From the perspective of computational performance, GNNs generally suffer from irregular memory-access patterns, which inhibits use of the full memory bandwidth of modern GPUs.
Based on the expertise gathered with GNN models, we propose an extended R&D project to explore different architectures such as maskformers or similar algorithms to develop new, heuristic end-to-end track reconstruction models, particularly aimed at high track multiplicity environments. The aim is to overcome some of the aforementioned limitations and thus help to shape the next generation AI/DL based track reconstruction algorithms. If time permits, further research should expand to include calorimetric information and add vertex estimation algorithms.
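A minimal sketch of the set-prediction ingredient behind maskformer-style models: a fixed set of learned track queries is matched to truth tracks with the Hungarian algorithm before the loss is computed. The cost terms and their weighting are illustrative assumptions.

import numpy as np
from scipy.optimize import linear_sum_assignment

def match_queries_to_tracks(pred_params, true_params, pred_logits):
    """pred_params: (n_queries, n_par), true_params: (n_tracks, n_par),
    pred_logits: (n_queries,) objectness scores. Returns matched index pairs."""
    # cost = parameter distance minus a bonus for confident predictions (assumed weighting)
    param_cost = np.linalg.norm(pred_params[:, None, :] - true_params[None, :, :], axis=-1)
    cost = param_cost - 0.1 * pred_logits[:, None]
    query_idx, track_idx = linear_sum_assignment(cost)
    return query_idx, track_idx   # unmatched queries are trained towards a "no track" output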
Speaker: Noemi Calace (CERN) -
10
GNNs and Transformers for reconstruction tasks in CMS
Graph neural networks and Transformers have shown huge potential to improve reconstruction tasks. The improvement in jet tagging testifies how these architectures are particularly suitable for HEP problems. Nevertheless, the applications are still limited in scope. We propose to exploit GNNs and Transformers for multiple tasks, such as unified object reconstruction, vertex reconstruction, and lepton identification in pp and heavy-ion data, consolidating the software infrastructure (data preparation, training, calibration with data, systematic-uncertainty assessment), scaling up the architectures beyond 100M towards 1B parameters, and expanding the physics scope towards soft-pT regimes (e.g., for jet tagging and missing energy for precision physics).
Speaker: Sebastian Wuchterl (CERN) -
11
Enhance jet energy measurements using machine learning tools for ATLAS data analysis and on-line triggering
The precise calibration of jets can significantly boost the exploitation of the ATLAS data in LHC Run 3 and in the high-luminosity phase. Precise knowledge of the jet energy scale (JES) is important for precision measurements like the top-quark mass, while better jet energy resolution and enhanced pile-up suppression can improve the discovery and measurement of di-Higgs production.
Recently the ATLAS collaboration has significantly improved (arXiv:2407.15627) the jet-energy-scale uncertainty (0.3% at 300 GeV) by including single-particle detector-response measurements and an improved treatment of the dependence of the JES on the jet fragmentation (arXiv:2405.20206). A JES uncertainty of 0.1% over a wide range of jet transverse momentum (pT) seems within reach for the analysis of Run-4 data when including new AI-based methods. These methods make use of jet constituents (like calorimeter clusters or particle-flow objects) instead of simple jet kinematics (like transverse momentum and rapidity).
Several methods have been proposed that demonstrate high potential for improving the jet energy resolution (JER) and pile-up suppression for high-pT jets. However, so far none of these techniques is used in any LHC experiment. The aim of the project is to use machine-learning-based methods for the jet calibration as used by ATLAS. Such methods can also be used for jet calibration in the ATLAS trigger for Run 4, in particular to enhance the low-pT multijet triggers, which are presently the limiting factor for triggering di-Higgs production in the bbbb and bbtautau channels. The gained expertise can be beneficial to all LHC experiments and future experiments at CERN.
ATLAS has recently successfully used Bayesian neural networks (BNNs) for the calibration of the calorimeter energy deposition (arXiv:2412.04370). This has high potential to improve the JER, but has not yet been employed on data. In such an approach, the uncertainties on the inferred cluster energies can also be exploited for jet calibration.
A particular difficulty in jet calibration is the interplay of the mean energy measurement and the resolution in the calibration process. This has recently been addressed by a Gaussian Ansatz model (arXiv:2205.03413) that uses deep neural networks (DNNs) or particle-flow networks (PFNs) based on graph convolutional networks (arXiv:1810.05165), which process unordered sets of jet constituents. PFNs are a main driver of the striking recent improvements in jet tagging, but their use for jet calibration has not yet been exploited. This approach can be combined with very recent work on foundation models using supervised representation training (arXiv:2404.16091). Such models can potentially reduce the size of the datasets needed for training. Collaboration with IT using the in-house model will be beneficial.
Recent machine-learning techniques can be used to train separately on data and Monte Carlo simulations to improve the JES and JER (arXiv:2402.14067), exploiting the momentum balance between two physics objects in a 2→2 hard-scattering process.
For trigger applications an interesting new concept is explainable AI (xAI), which addresses the internal representations of trained, highly performant networks. The idea is to express the network output, with a moderate performance loss, by turning the inference into simple analytical expressions (arXiv:2507.21214). First, the latent (input) features are compressed with principal component analysis to capture the linear correlations. Non-linear correlations can be compressed using disentangled latent classifier networks that compress the latent features while also being optimised on the target performance. The relation between the compressed input variables and the network output is then approximated by a formula obtained with symbolic regression, as available in PySR. Such an approach is interesting for inference-time-critical applications. Moreover, technically it could already be used in the ATLAS Phase-1 trigger system.
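A minimal sketch of the compression-plus-symbolic-regression chain described above, assuming scikit-learn for the PCA step and PySR for the symbolic regression; the operator set, component count, and iteration count are illustrative.

import numpy as np
from sklearn.decomposition import PCA
from pysr import PySRRegressor

def distill_to_formula(X, nn_output, n_components=3):
    """X: (N, n_features) network inputs; nn_output: (N,) trained-network scores."""
    Z = PCA(n_components=n_components).fit_transform(X)   # linear compression of the inputs
    sr = PySRRegressor(
        niterations=100,
        binary_operators=["+", "-", "*", "/"],
        unary_operators=["exp", "log"],
    )
    sr.fit(Z, nn_output)   # symbolic expression approximating the network output
    return sr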
Speaker: Tancredi Carli (CERN) -
12
Integration and validation of ML-based particle flow (MLPF) in Phase-2 TICL reconstruction
Given the progress and promising results on ML-based particle flow integration with CMS offline reconstruction in a Run 3 setup (CMS-PFT-25-001), we now aim to extend and integrate MLPF with Phase-2 TICL reconstruction as a plug-in. In Run 3, we demonstrated that with a small transformer model, events containing on the order of ~5’000 tracks and clusters can be reconstructed into final-state particles on a modern inference GPU (e.g. NVIDIA L4) within a few tens of milliseconds, while also improving jet performance compared to the baseline PF algorithm. For Phase-2, we will evaluate to what extent this approach can be applied in an online reconstruction setting, where the initial local reconstruction is provided by existing clustering algorithms in TICL. In analogy to the Run 3 setup, the inputs will be local clusters in detector layers (e.g. reconstructed tracksters), while the target objects will be simulation-based particles (e.g. simulated tracksters). Performance will be assessed using the standard TICL and PF metrics, namely particle-level efficiencies, fake rates and resolutions measured against simulation, as well as jet resolution, matching efficiency and fake rate, and MET resolution. Finally, since the Run 3 setup already enables pileup mitigation through an additional PU-probability output node integrated in the model, we will also explore extending this capability to Phase-2.
Speakers: Farouk Mokhtar (Univ. of California San Diego (US)), Joosep Pata (National Institute of Chemical Physics and Biophysics (EE)), Marco Rovere (CERN) -
13
Algorithms for sequential ML-based Particle Flow Reconstruction
Significant effort has recently been invested into developing tools for end-to-end Particle Flow (PF) reconstruction, both at LHC experiments and at Future Colliders. A major advantage of this approach is the potential for developing a tool which is detector agnostic. However, such approaches typically focus on higher level objects (tracks and calorimeter clusters), as operating directly on all hits in an event, particularly in very highly granular detectors, becomes unfeasible. Other major challenges include validation and algorithmic robustness, as the algorithm must generalise to unseen event topologies, as well as potentially changing input as detector conditions change.
This project aims to break the problem down into a sequence of algorithms, using both ML and classical algorithms as appropriate. Through the adoption of a hierarchical approach, the framework will be designed to operate on hit level information, gradually building higher level objects. This approach would aid both validation and interpretability, as the performance could be studied at the particle rather than the event level, and linked with specific algorithms. It would also make adapting to changing detector conditions easier, as specific parts could be retrained, as opposed to the entire algorithm.
This could be performed as a step-by-step modernisation of existing approaches like Pandora and LCContent. The most challenging parts would need to be identified (currently under study), with the reclustering steps likely being the most appropriate ones to start with.
Speaker: Peter McKeown (CERN) -
14
Calibration of object identification algorithms using normalizing flows in CMS
New unbinned, higher-dimensional calibration methods for jet, tau, and lepton ID algorithms, simultaneously in multiple jet categories/flavors, within CMS.
Speaker: Davide Valsecchi (ETH Zurich (CH)) -
15
Design and deployment of calibrated NNs for neutral B mesons tagging
Develop a state-of-the-art NN for flavour inference of neutral B mesons for CKM CPV analyses, as an evolution of the work performed in BPH-23-004. NNs are used as probability estimators, so perfect calibration is as important as performance. The result is to be deployed in flagship BPH analyses.
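As a reminder of what calibration means operationally, here is a minimal sketch of post-hoc temperature scaling: a single scalar is fitted on a held-out sample so that the tagger output, read as a probability, matches the observed rates. This is one generic option for illustration, not necessarily the calibration scheme foreseen for the analysis.

import numpy as np
from scipy.optimize import minimize_scalar

def fit_temperature(logits, labels):
    """logits: raw NN outputs (pre-sigmoid); labels: 0/1 true flavour on a calibration set."""
    def nll(T):
        p = 1.0 / (1.0 + np.exp(-logits / T))
        p = np.clip(p, 1e-7, 1 - 1e-7)
        return -np.mean(labels * np.log(p) + (1 - labels) * np.log(1 - p))
    return minimize_scalar(nll, bounds=(0.05, 20.0), method="bounded").x

# calibrated probability for a new candidate: sigmoid(logit / fitted_T)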
Speaker: Alberto Bragagnolo (CERN) -
16
Event Generation and Reconstruction with the CMS HGCAL using Modern ML
Development of AI-based reconstruction for the CMS High-Granularity Calorimeter using graph- and transformer-based models, with TICL objects as inputs and using contrastive learning. In a first step, the work will tackle 2D/3D pattern recognition for particle showers. In a second stage, particle identification will be added using global event properties. The goal is to maximize physics performance (in terms of efficiency, purity, energy scale and resolution, and particle identification) using HGCAL’s unprecedented granularity, while minimising the computational cost of processing data under HL-LHC conditions. Based on this model, its generative counterpart will then be developed for fast and accurate HGCAL simulation.
Speaker: Benedikt Maier (K) -
17
An End-to-end solution for event building at the Future Circular Collider: Machine Learning Based Track Reconstruction and Particle Flow
Accurate event reconstruction is essential to fully exploit the physics potential of modern particle physics experiments. Particle Flow (PF) algorithms enhance reconstruction efficiency and resolution by combining information from multiple subdetectors. In particular, high-quality track information significantly contributes to the overall performance of particle object reconstruction.
Future experiments, such as the Future Circular Collider (FCC), are in active development and present new challenges for traditional, detector-specific reconstruction techniques. In response, machine learning–based approaches—notably transformer models—are emerging as powerful alternatives. These architectures offer the flexibility and robustness needed to meet the demands of R&D for next-generation particle flow, tracking and flavor tagging algorithms.
The overall goal of this effort is to develop an end-to-end solution for particle reconstruction at the FCC. Unlike traditional approaches that rely on iterative, stepwise fine-tuning of parameters—an effort prone to cumbersome repetition as reconstruction adapts to ongoing detector development—an end-to-end framework provides a streamlined and unified alternative. A distinguishing feature of our approach is that the machine-learning particle flow (MLPF) algorithm is trained directly on detector hits rather than on pre-reconstructed clusters, thereby offering greater freedom and reducing reliance on intermediate, detector-specific reconstruction stages. In parallel, machine learning–based flavor tagging is being developed to complement particle flow and tracking, providing powerful discrimination of heavy-flavor jets and thereby further enhancing the physics reach of the FCC.
This vision builds on existing work in hit-based MLPF [MLPF note], machine learning–based tracking [ML tracking note], and flavor tagging [Flavor tagging note]. Work is ongoing to extend these approaches toward a fully integrated event-building pipeline, ultimately aiming to deliver a robust, flexible, and scalable end-to-end reconstruction strategy tailored to the demands of the FCC.
Speaker: Dr Lena Maria Herrmann -
18
Large Physics Model: a foundation model for HEP data reconstruction and analysis with CMS data
Train foundation models on various tasks using supervised and unsupervised techniques (e.g., the approach followed to train large LLMs like ChatGPT):
- Jet level: Allow a jet algorithm to self-discover patterns and physics properties on unlabelled data to obtain a pre-trained model unbiased by Monte Carlo discrepancies. The final goal would be to obtain a foundation model for jets that can be fine-tuned for the different tasks with minimal data/MC disagreement, allowing for better post-calibration performance.
- Event level: Develop an event view starting from local clusters in the detector, performing the Particle Flow reconstruction task and adapting this model to various tasks. Develop a set of task-specific algorithms from this main algorithm, through a tuning workflow that could run on a local cluster.
- Analysis level: Learn a high-level, object-based event representation of CMS events, to provide a foundation model for data analysis that can be fine-tuned for better event selection or related analysis tasks such as object assignment. Supervised and unsupervised approaches will be studied.
Speaker: Sebastian Wuchterl (CERN) -
19
Federated learning for cross-experiment foundation models
Several reconstruction steps for LHC events are being approached using end-to-end deep learning solutions (e.g., for tracking, calorimetry, and particle-flow linking). It has been proposed that foundation models trained on physics events could repeat, for LHC event reconstruction, the astonishing success of Large Language Models in developing multitasking skills. Such an application could be trained on particle-flow reconstruction, starting from raw detector inputs to build a global event view without training supervision (unlike the state-of-the-art MLPF model). First efforts towards such Large Physics Models have shown encouraging results. The use of federated learning could facilitate these efforts: by simultaneously extracting knowledge from various experimental datasets, one could abstract each detector's data to a common space, where the underlying physics could be learned more accurately. The knowledge gained in reconstructing CMS data could improve the ATLAS reconstruction, and vice versa. Building on the experience of the CAFEIN team at CERN with large-scale federated learning, we propose to establish an effort to experiment with the idea of cross-experiment learning and assess its potential to improve the physics performance of the various experiments. The success of this process would offer a platform to transfer knowledge from the LHC experiments to future colliders.
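A minimal sketch of federated averaging (FedAvg), the basic mechanism behind the cross-experiment idea: each site trains locally on its own data and only model weights are exchanged and averaged. The site interface is hypothetical and framework-agnostic.

import numpy as np

def federated_round(global_weights, sites):
    """sites: list of objects with .n_events and .local_update(weights) -> new weight vector.
    No event data ever leaves a site; only weight vectors are communicated."""
    updates, counts = [], []
    for site in sites:                      # e.g. one site per experiment
        updates.append(site.local_update(global_weights))
        counts.append(site.n_events)
    counts = np.asarray(counts, dtype=float)
    fractions = counts / counts.sum()
    # weighted average of the locally updated models becomes the new global model
    return sum(f * u for f, u in zip(fractions, updates))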
Speaker: Maurizio Pierini (CERN)
10:45
Cutting Edge AI for Offline Data Processing 500/1-001 - Main Auditorium
20
Improving Fast Hadronic Shower Simulation for ATLAS and Future Calorimeters
Run-3 and HL-LHC analyses require billions of events and numerous systematic variations, making full Geant4 simulation prohibitively slow; a calorimeter fast-simulation offers ~10× speed-up but remains less accurate for hadronic showers, particularly for sub-showers displaced from the shower axis. This project builds on advances from industry-scale image generative AI (e.g., diffusion and transformer models) to raise hadronic-shower accuracy for current ATLAS calorimetry and forthcoming high-granularity designs, while explicitly accounting for memory constraints inherent to large-scale HEP production.
Planned developments include optimised voxelisation alongside point-cloud generative models that preserve fine granularity and are engineered for high throughput in HEP workflows; conditioning on local geometry and materials to handle complex regions (cracks, boundaries); and uncertainty-aware, data-driven tuning that conditions models directly on LHC data within its quoted uncertainties. A rigorous comparison programme against Geant4 and collision data, together with ML-based validation/diff tools, will localise discrepancies across fast-sim variants and document improvements in shower shapes, response, and resolution—at fast-simulation speed.
Speakers: Michael Duehrssen-Debling (CERN), Nedaa Alexandra Asbah (CERN) -
21
Reference Datasets for AI/DL research
Cutting-edge AI/DL research, and algorithmic R&D in general, profits immensely from openly accessible, realistic training data, in the field of HEP often paired and augmented with the relevant ground-truth information. Examples of such datasets are the TrackML dataset (https://doi.org/10.1007/s41781-023-00094-w), which currently counts more than 100 citations in other research projects, the CaloChallenge dataset (https://arxiv.org/abs/2410.21611), and other open datasets released by experiments or other, sometimes ad-hoc organized, groups of researchers. These datasets not only helped to develop and train AI/DL models, but also served as reference performance gauges; however, they are reaching their limits in size and features.
We propose a structured survey on the existence, needs and requirements for reference datasets in the areas of (detector) simulation, triggering, reconstruction, analysis and associated fields at large. Subsequently, we propose the establishment and creation of such reference datasets, including follow-up datasets for the TrackML and CaloChallenge datasets with increased accuracy and extended use cases. This will include the definition and detailed documentation of the contained data and data format.
The datasets will be general, e.g. by using the Open Data Detector, and usable across experiments and the community at large. Furthermore, we propose to create a dedicated entry point at the CERN Open Data portal for searching and retrieving the datasets, together with the relevant information. This should include a versioning system for datasets in order to trace future updates to the datasets.
Speaker: Andreas Salzburger (CERN) -
22
Simulation of hadronic showers
While numerous advances have been made in the simulation of electromagnetic showers with generative models, significantly less attention has been given to the simulation of hadronic showers. Simulation of these showers represents a significantly more complicated task, with showers featuring much larger event-to-event fluctuations due to the mix of hadronic and electromagnetic interactions, the covering of a much larger volume of the calorimeters, and the extension of showers across both the electromagnetic and hadronic calorimeters, with potential punch through into the muon system. This is particularly challenging for highly granular calorimeters, where the granularity reveals the fine substructure of showers, resulting in branching tree-like topologies.
Recent work in the form of CaloHadronic [2506.21720] has, for the first time, demonstrated the potential of a point cloud based generative model utilising a transformer mechanism in simulating such showers. While initial results show promising fidelity, challenges remain. In particular, inference time is currently limited by the number of diffusion steps and the transformer architecture, and the models do not yet simulate punch-through particles reaching the muon system. This project aims to extend these capabilities, exploring modern architectures such as MAMBA and point cloud–based approaches, including leveraging outputs from high-granularity clustering projects.
The scope of this work is broadly applicable to multiple experiments, including ATLAS, CMS HGCal, and LHCb.
Speaker: Anna Zaborowska (CERN) -
23
Data Representations for Fast Shower Simulation
Recent approaches to fast simulation have proposed directly operating on calorimeter showers in the form of a point cloud. This representation promises much improved efficiency compared to the regular grid based methods that are typically used, particularly for calorimeters which feature a very high granularity. Point clouds are also a very flexible representation, and form a compelling basis upon which to build foundation models for fast simulation, which can be easily adapted to different detector readout granularities.
However, in order to develop a fast simulation tool for a new detector, it is necessary to ensure that the point cloud used for training is appropriate for the given granularity and structure of the target calorimeter. This involves clustering the vast number of individual energy deposits, which are too numerous to train a model on, into a smaller number of points. This task is particularly laborious and currently requires the adoption of ad-hoc methods and significant trial and error. This project seeks to explore a set of tools that could be employed to produce optimised point clouds, with the minimum number of points required to retain the key physics properties of the shower. Successful completion of the project would significantly reduce the time and effort to produce a fast calorimeter simulation tool.
The outcome of this project can be used to formulate a dataset for model training for CMS HGCal, where some models already use a (naive) point cloud representation, as well as for ATLAS EM showers in the endcaps and for hadronic showers (EM showers in the barrel were recently tuned manually to produce an optimal representation).
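One possible reduction step for the clustering task described above, sketched with scikit-learn: the raw energy deposits of a shower are grouped into a fixed number of points with energy-weighted k-means, keeping one (position, energy) pair per cluster. The target point count is an illustrative assumption, and k-means is only one of the candidate tools.

import numpy as np
from sklearn.cluster import KMeans

def deposits_to_points(positions, energies, n_points=500):
    """positions: (N, 3) deposit coordinates; energies: (N,) deposit energies."""
    km = KMeans(n_clusters=n_points, n_init=5)
    labels = km.fit_predict(positions, sample_weight=energies)   # energy-weighted clustering
    point_energy = np.bincount(labels, weights=energies, minlength=n_points)
    return km.cluster_centers_, point_energy   # one point and its summed energy per cluster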
Speaker: Anna Zaborowska (CERN) -
24
Efficient energy deposit mapping in highly granular calorimeters
This project proposes to investigate, develop, and optimize a tool for placing pre-generated energy deposits (with ML models) into the highly granular CMS High Granularity Calorimeter (HGCal). Depending on the data representation of the deposits (e.g. point positions or voxelized volumes), efficient mapping to the detailed HGCal geometry is critical. The procedure must preserve shower observables while ensuring that the placement step does not become a bottleneck, even given the very large number of voxels involved.
In addition, the project will address challenges related to the volume of training and validation data. Current studies already face limitations from the amount of information that must be passed through to produce and validate calorimeter showers, even for a single point (angle) in the HGCal phase space.
The validation tools that will be developed within the project to investigate the effect of different placement techniques can also be used to investigate the accuracy of different ML models, as several of them are being tested for HGCal.
Speaker: Anna Zaborowska (CERN) -
25
Extending CaloChallenge: A live benchmark for ML-based calorimeter simulations
This project aims to extend the lifecycle of the CaloChallenge [2411.05996] by turning it into a continuously updated, live benchmark for calorimeter shower simulation. The goal is to provide a long-term benchmark on an ML-challenge platform where new machine learning models can be submitted, evaluated, and compared in a consistent way. By tracking progress over time and highlighting novel approaches, the benchmark will help the community stay informed about advances in the field and support the integration of new methods into high-energy physics workflows.
The CaloChallenge has already proven valuable for fast simulation development across most LHC experiments. It has directly influenced current R&D efforts, with several prototype models either inspired by or adopted from CaloChallenge contributions. This underlines the challenge’s relevance and impact on real-world simulation pipelines.
An important additional development in this extended benchmark is the integration of models into experimental frameworks, to assess their behavior under realistic detector conditions. This has been planned for the next edition of the CaloChallenge, with more data representations, as well as larger statistics, and a more complete validation suite. Inference and placement of showers back in the readout structure is essential for validating the performance of the models.
Speaker: Anna Zaborowska (CERN) -
26
Distillation of Diffusion models (exploration applicable to other models)
Diffusion models have proven to be good candidates for the generation of calorimeter showers. However, due to their slow inference, which occurs over several reverse-diffusion steps, the base diffusion model often needs to be distilled to perform single-step or few-step inference. The currently explored consistency distillation [2303.01469] works well for single-step inference over low-granularity calorimeter showers; however, it does not offer the expected trade-off between sampling steps and accuracy. The project aims to explore newer distillation methods that offer this trade-off, e.g. [2406.04103], or more advanced distillation methods, e.g. [2311.18828].
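A highly simplified sketch of the step-halving idea behind progressive-style distillation: the student is trained so that a single one of its reverse-diffusion updates reproduces two updates of the frozen teacher. The helper functions teacher_step and student_step are assumed to apply one reverse update of the respective model; this is a toy illustration, not the exact scheme of the cited papers.

import torch

def distillation_loss(student, teacher_step, student_step, x_t, t):
    """x_t: batch of noisy showers at (even) step t; the student halves the step count."""
    with torch.no_grad():
        x_mid = teacher_step(x_t, t)            # teacher: t   -> t-1
        x_target = teacher_step(x_mid, t - 1)   # teacher: t-1 -> t-2
    x_pred = student_step(student, x_t, t)      # student: t   -> t-2 in a single step
    return torch.mean((x_pred - x_target) ** 2)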
Speaker: Anna Zaborowska (CERN) -
27
A robust validation metric for evaluating particle showers
Evaluation of generative models is a difficult task. In the domain of computer vision, the community has started to adopt LPIPS [1801.03924]. This requires a pretrained model capable of perceptually understanding the input. However, in the domain of fast simulation, no such pretrained model exists. Reasons include a lack of a challenging downstream supervised task, a lack of augmentations for contrastive learning, and the stochasticity of the showers. The community primarily relies on physics-based shower observables or a quantitative metric based on those observables. The goal of the project is to train a model capable of perceptually understanding the showers, which can serve as a robust evaluation metric or a loss function to train a generative model. A good initial starting point could be extreme classification [1805.01978], mutual information-based methods [1801.04062], or augmentation-agnostic contrastive learning methods [2011.04419, 2106.10052].
Speaker: Peter McKeown (CERN) -
28
CaloChallengev2 for LHCb
The CaloChallenge was undertaken by the CERN-SFT group in the past few years, resulting in a collaborative effort of 60 participants with different backgrounds. The infrastructure presented in this challenge has already been adopted by the LHCb simulation project, and therefore we can use the models prepared by the community in our simulation framework. Other experiments also use the CaloChallenge for their benchmarks; see the CMS example.
A few years have now passed since this was tried out, and novel ML architectures have appeared, including the CaloDiT model from CERN-SFT. First checks were made for the LHCb Run 3 calorimeter, but the initial LHCb Upgrade 2 calorimeter geometry is already implemented, and soon it will be possible to test ML models with Upgrade 2 simulation productions. Hence we propose a new challenge: CaloChallenge-v2. This would allow continued exploration of new models with the Gaussino setup and, if a model looks interesting based on a set of benchmarks (not all of them would be suitable), a full-blown trial tied to the LHCb geometry. This CaloChallenge-v2 could also cover shower leakage, a.k.a. punch-through.
This project is already taking place in close collaboration with EP-SFT. The challenge can be organized with other experiments, including FCC.
Speaker: Michał Mazurek (National Centre for Nuclear Research (PL)) -
29
Detector geometry-aware model
Fast simulation of calorimeter showers has resulted in a surge of generative machine-learning models showing excellent performance in mimicking Geant4. However, current models assume no information about the underlying geometry of the detector; they are expected to learn the patterns in the geometry via information about the position of the incident particle or from the showers themselves. The project proposes to explore the use of geometrical information about the detector inside the models. This could be, for instance, a detector map in the form of an image segmentation used to feed the geometry information to the model [2302.05543]. This will allow us to obtain more accurate and potentially faster models (as the model won't need to memorize the geometry), and could potentially be used with zero-shot inference for new detectors. At the start of the project, various approaches to providing this geometry information should be explored.
Speaker: Peter McKeown (CERN) -
30
Data-driven Tuning of Fast Simulation Models
Generative machine learning models are becoming essential tools for fast simulation at collider experiments, providing a means to produce the vast amounts of simulated data required by physics programmes. These approaches are trained using full simulation input provided by Geant4, the state-of-the-art Monte Carlo simulation tool used throughout high energy physics. While Geant4 includes a broad and powerful suite of physics models, deviations with respect to experimental data must still be removed via dedicated corrections.
This project seeks to explore ways of reducing deviations between data and simulation in the context of ATLAS fast simulation. This would focus on improving the modelling of key physics observables using aspects of the experimental data as part of the model training. Successful completion of this project could provide a mechanism for fast simulation tools in general to improve beyond the capabilities of the full simulation they are trained on, potentially reducing the data-MC corrections required by analyses.
Speaker: Peter McKeown (CERN) -
31
Compression of generative models for fast simulation
Fast simulation of calorimeter showers with generative models has seen significant development in recent years, with many LHC experiments either having already deployed generative models for this purpose as part of their simulation workflows, or currently validating their models for production. At the core of fast-simulation methods lies a balance between speed and accuracy, with a detailed evaluation being required for each application. Ultimately, a suitable model should be as fast as possible while delivering suitable physics performance.
This project would seek to explore approaches to model compression for models which are in the evaluation stage for use in production at LHC experiments, while maintaining sufficient physics performance. Compression techniques to be explored would include, but not be limited to, quantization, pruning and distillation. The techniques chosen would be tailored to the specific requirements of a given experiment's simulation ecosystem. Potential synergies exist with Area 2. This project would help to increase the efficiency, and therefore sustainability, of the use of generative models for fast simulation at the LHC.
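A minimal sketch of two of the compression handles mentioned above, using standard PyTorch utilities: post-training dynamic quantization of linear layers and magnitude-based pruning. The toy model is a stand-in; real generative models would need per-case physics validation.

import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 128))

# 1) dynamic quantization: linear-layer weights stored in int8, activations quantized on the fly
quantized = torch.ao.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

# 2) magnitude pruning: zero out the 30% smallest weights of each linear layer
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.3)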
Speaker: Peter McKeown (CERN) -
32
ML-based punch-through surrogate: Enhancing the realism of fast shower simulation
This project proposes the development of a machine learning model to simulate punch-through particles: secondaries from calorimeter showers that exit the detector and enter downstream systems. While current fast-simulation methods model in-calorimeter activity more and more accurately, they focus on cascades and calorimeters, limiting realism for studies involving muon systems or late-developing showers.
The model will act as a complement to existing shower surrogates, using input features such as the incident particle, shower profile (barycentre, start, …), and energy in the final calorimeter layers. It will generate a variable-length list of particles with positions, momenta, and types, designed for seamless integration into Geant4.
This project improves the completeness of fast simulation workflows, and it could be paired up with any classical or ML parameterisation of showers.
Speaker: Anna Zaborowska (CERN) -
33
RICH ML Challenge for LHCb
The LHCb detector simulation processing time is dominated by the calorimeter simulation; however, a non-negligible part is spent on the simulation of optical photons in the RICH detectors. Following the success of the CaloChallenge, organized in close collaboration with the EP-SFT department, we would like to propose a similar challenge targeting the use of ML-based optical simulation in RICH detectors.
Optical simulation sped up with ML algorithms is under investigation by other experiments, for example in the neutrino sector by JUNO. RICH detectors are also considered for the FCC. Although ML for RICH detectors may be heavily dependent on the specific geometrical layout, it is worth exploring through a RICH ML Challenge along similar lines to the CaloChallenge.
Speaker: Michał Mazurek (National Centre for Nuclear Research (PL)) -
34
Generative ML models for RICH detectors: Investigate the gain for Cherenkov detectors
This project aims to develop machine learning models to parametrise the simulation of RICH detector images principally for use in LHCb, but with potential applications to other CERN experiments relying on Cherenkov-based particle identification, e.g. NA62. The goal is to create fast generative models that produce 2D ring-like photon hit patterns based on input parameters such as particle type, momentum, and incident angle.
Such models can serve as surrogates for detailed photon transport simulations. This project must be carried out with strong ties to the experiments, and has strong interest from an external institute from LHCb. The initial work would focus on preparation of the realistic RICH datasets, preparation of the validation framework, and could potentially extend to the design and development of the conditional generative models, and their benchmarking using performance metrics.
Speaker: Anna Zaborowska (CERN) -
35
Self-Supervised Learning for Fast Detector Simulation via Generative Modeling
This project addresses the growing need for scalable and efficient detector simulation in HEP, leveraging self-supervised learning and generative modeling to enable fast, generalizable simulations. By learning from unlabeled data and exploiting the intrinsic structure of detector responses, the proposed approach aims to reduce simulation time while maintaining physical fidelity. The project focuses on developing a unified model capable of adapting to various detector configurations, thereby eliminating the need for retraining for each setup. Expected outcomes include enhanced generalization across experimental conditions and flexible reusable architectures.
Speaker: Dr Sofia Vallecorsa (CERN) -
36
FlashSim development and integration
Detailed event simulation at the LHC takes a large fraction of the computing budget. CMS has developed an end-to-end ML-based simulation that can speed up the production of analysis samples by several orders of magnitude with a limited loss of accuracy. As the CMS experiment is adopting a common analysis-level format, the NANOAOD, for a larger number of analyses, such an event representation is used as the target of this ultra-fast simulation, which we call FlashSim. Generator-level events, from PYTHIA or other generators, are directly translated into NANOAOD events at a rate of several hundred Hz with FlashSim. We show how training FlashSim on a limited number of full-simulation events is sufficient to achieve very good accuracy on larger datasets for processes not seen at training time. Comparisons with full-simulation samples in some simplified benchmark analyses are also shown. With this work, we aim to establish a new paradigm for LHC collision-simulation workflows, for offline and scouting datasets, in view of the HL-LHC.
Speakers: Francesco Vaselli (Scuola Normale Superiore & INFN Pisa (IT)), Maurizio Pierini (CERN) -
37
Towards a common end-to-end flash simulation
The CERN-SFT group, in their summary paper, proposed that "a common end-to-end fast-simulation tool could be created across experiments to complement the GEANT library." Building on the experience gained by LHCb in developing its Flash Simulation framework, Lamarr, several key challenges have emerged in integrating machine learning (ML) algorithms into high-energy physics software stacks:
- ML models are typically lightweight, but the event-level granularity of the Gaudi scheduler complicates batching particles across multiple events. This results in frequent model invocations and significant overhead when using dedicated runtimes.
- Dedicated runtimes are optimized for multithreading, which may conflict with Gaudi’s own multithreading management.
- Constructing ML pipelines (comprising preprocessing, inference, and postprocessing) requires C++ development, a skillset often distinct from that of ML engineers, who typically work in Python.
To address these challenges, Lamarr adopted a pipeline description language based on XML. This enables the composition of in-process computing blocks, distributed as shared objects via CVMFS. These blocks are transpiled from Python to C using tools such as scikinC and keras2c. This strategy shares conceptual similarities with SOFIE, a framework developed by CERN-SFT and used by LHCb.
We propose a collaborative project to gather requirements and draft an implementation plan for a multi-experiment, multi-application ML deployment system. This system would target high-throughput computing (HTC) environments and multithreaded C++ applications.
Key considerations include:
- Intermediate Data Representation: Efficient in-memory formats for intermediate data between computing blocks that support batch processing and cross-language accessibility (e.g., C++ and Python). Apache Arrow Tables and ROOT RDataFrames serve as promising examples.
- Experiment Independence: Leveraging Lamarr’s architecture as a foundation for a generalized, experiment-agnostic framework.
- Graph-Based Data Structures: Enabling the definition and execution of ML pipelines on heterogeneous graph data representing particles, vertices, and reconstructed physics objects.
We believe that Lamarr’s implementation offers a valuable starting point and could serve as a prototype for a broader, experiment-independent solution.
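A minimal sketch of the intermediate-data-representation consideration listed above, assuming pyarrow: particles from many events are batched into one columnar table that both Python and C++ computing blocks can read without copying. The column names are illustrative.

import pyarrow as pa

def make_particle_batch(event_ids, pdg_ids, px, py, pz):
    """Flat per-particle columns, with an event index so downstream blocks can re-group by event."""
    return pa.table({
        "event_id": pa.array(event_ids, type=pa.int64()),
        "pdg_id": pa.array(pdg_ids, type=pa.int32()),
        "px": pa.array(px, type=pa.float32()),
        "py": pa.array(py, type=pa.float32()),
        "pz": pa.array(pz, type=pa.float32()),
    })

# a C++ block can consume the same buffers via the Arrow C data interface, without serialisation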
Speaker: Michał Mazurek (National Centre for Nuclear Research (PL)) -
38
End-to-end Fast Detector Simulation and Reconstruction
The standard Monte Carlo pipeline separates generation, detector simulation and reconstruction. This project advances an end-to-end generative approach that maps truth-level particles directly to reconstructed objects, reducing per-event runtime to ≪ 1 s by bypassing detailed detector simulation and algorithmic reconstruction. For correctly identified objects, the simulation of kinematic properties and efficiencies is essentially a solved problem with existing HEP tools; the emphasis here is on mis-identified particles and fakes that arise from rare combinations of input kinematics and unusual detector interactions. The model is conditioned on pile-up, detector conditions and trigger selections to preserve kinematic correlations. Confusion-aware generation (explicit modelling of mis-ID channels), imbalance-robust training and domain adaptation are used to capture rare effects, while simulation-based inference and uncertainty calibration propagate errors to analysis-level observables. An extensive validation programme benchmarks physics performance against the standard chain across key analyses and systematic scans. The goal is a very fast (≪ 1 s) simulation that remains sufficiently accurate for a wide range of LHC and future-collider studies, particularly searches and systematic-uncertainty evaluations that require large, independent Monte Carlo samples.
Speakers: Michael Duehrssen-Debling (CERN), Nedaa Alexandra Asbah (CERN)
12:45
Lunch Break
Cutting Edge AI for Offline Data Processing 500/1-001 - Main Auditorium
39
Dreaming at CERN
DREAMS (DaRk mattEr with AI and siMulationS) is a state-of-the-art platform that combines thousands of high-resolution cosmological hydrodynamic simulations with machine learning to probe the nature of dark matter while marginalizing over uncertain baryonic physics. These simulations are run on the Flatiron Institute’s CCA cluster, supported by the Simons Foundation, anchoring DREAMS within an international ecosystem of large-scale computational astrophysics. It includes simulation suites at multiple scales — cosmological boxes, Milky Way zoom-ins, and dwarf galaxies — and spans a range of dark matter models (CDM, WDM, ETHOS-like, atomic DM). By coupling emulators and neural networks to these simulation datasets, DREAMS provides a powerful framework to disentangle dark matter physics from complex astrophysical processes and to forecast observational signatures in current and future surveys.
The goals of the present proposal are:
- Extend DREAMS by incorporating additional dark matter models—such as atomic DM, ETHOS-based acoustic oscillatory scenarios, SIDM variants—and apply advanced neural techniques akin to NeHOD, which uses diffusion models and Transformers trained on DREAMS' WDM zoom-ins.
- Deploy and adapt Denario agents to automate the research workflow using DREAMS data—e.g., generating hypotheses, designing ML architectures, running analyses, synthesizing results, and generating draft publications.
The timeline of the proposal is as follows:
Year 1
Extension of DREAMS: Incorporate new dark matter models beyond those already implemented (e.g. self-interacting DM, atomic DM variants, ETHOS-style oscillatory models).
Data preparation: Generate the relevant simulation outputs within the DREAMS framework and construct training datasets for AI models.
Neural architectures: Begin systematic tests with established methods (CNNs, emulators) as baselines.
Denario integration (early stage): Configure AI agents to suggest architectures and training protocols, adapting them to DREAMS datasets.
Year 2
Push beyond baseline methods by experimenting with diffusion models, Transformers, and graph neural networks (similar in spirit to NeHOD).
Perform large-scale benchmarking across DREAMS suites and dark matter models.
Advance Denario workflows to design experiments and automate analysis pipelines.
Year 3
Comprehensive comparison of AI methods across dark matter models.
Deploy Denario in full workflow mode (data → analysis → figures → draft manuscripts).
Expected Deliverables
-- New simulation analysis pipelines across extended dark matter models.
-- Published comparison of AI techniques (e.g. CNN vs Transformer vs diffusion-based emulators).
-- Fully Denario-generated draft(s) of research articles.
Speaker: Andrea Caputo (CERN) -
40
AI for QCD theory
The QCD theory group currently has minor involvement with machine-learning methods, in learning amplitudes and optimizing amplitude evaluation via sector decomposition; the larger community, however, has greater expertise in applying ML to amplitude learning, optimization of integration procedures, etc., and is also making first steps towards optimizing the solution of integration-by-parts relations with ML.
We believe that there is potential for these methods to benefit our work on computing high-precision predictions for collider observables, and the best way forward would be through communication with existing experts.
To this end, we propose holding a topical workshop on AI for theory, with the (implied) goal of establishing collaborations with external experts in areas benefiting theory work. We would also require funds for inviting visiting scientists and hosting seminars to carry this collaboration forward.
Speaker: Vitaly Magerya -
41
Charting the space of scattering amplitudes with neural optimizers
Recent advances in machine learning have given rise to a multitude of applications in physics, from jet-tagging algorithms to fast detector simulators and AI-driven symbolic regression, and give us the perfect tool for tackling hard numerical problems for which classical algorithms are challenging to design. It is well known that neural networks are universal function approximators and as such are ideal candidates for solving integro-differential equations.
The goal of this project is to apply these techniques to the study of relativistic scattering of particles. We combine nonperturbative S-matrix bootstrap methods with modern AI tools to study fundamental aspects of particle scattering which are not accessible using conventional computational methods, such as lattice or Feynman diagrams.
This project builds on the unique expertise of the PI on the question of the bootstrap methods. After years of conceptual development of a new method [1,2] which constitutes the only proposal to produce fully consistent scattering amplitudes, it was realised that machine-learning methods are exactly suited to tackle this problem numerically [3,4,5].
The pilot studies [4,5] in particular show the applicability of machine-learning methods to the specific problem at hand. In work in progress [5] with the same group, the PI obtained the first strongly coupled scattering amplitudes ever, in an idealised scenario of simple scalar particles.
The general goal of the project would be to systematically apply the method to various theories, focusing on theories with experimental relevance such as QCD and the strong interactions, as well as gravitational theories. It will bring new key insights on the 60-year-old question of Froissart bound saturation in scattering experiments, produce state-of-the-art models for scattering amplitudes between scalar mesons in QCD, and open a new avenue of research to study quantum gravity at high energies.
The funding would come at a particularly timely moment in the development: the method exists and crucially needs dedicated person-power to be applied to relevant cases.
[1] P. Tourkine, A. Zhiboedov, Scattering amplitudes from dispersive iterations of unitarity, JHEP 11 (2023) 005 [arXiv:2303.08839]
[2] P. Tourkine, A. Zhiboedov, Scattering from production in 2d, JHEP 07 (2021) 228 [arXiv:2101.05211]
[3] A. Dersy, M. D. Schwartz, A. Zhiboedov, Reconstructing S-matrix Phases with Machine Learning, JHEP 05 (2024) 200 [arXiv:2308.09451]
[4] Mehmet Asim Gumus, Damien Leflot, Piotr Tourkine, Alexander Zhiboedov, The S-matrix bootstrap with neural optimizers. Part I. Zero double discontinuity, JHEP 07 (2025) 210 [arXiv:2412.09610]
[5] Mehmet Asim Gumus, Damien Leflot, Piotr Tourkine, Alexander Zhiboedov, The S-matrix bootstrap with neural optimizers. Part II. Full problem with one subtraction, to be published (expected Oct. or Nov. 2025)
Speaker: Alexander Zhiboedov (CERN) -
42
Machine learning trivializing flows for gauge theories
A longstanding problem in lattice QCD is critical slowing down when taking the continuum and infinite-volume limits. One part, the critical slowing down of solving the Dirac equation towards the continuum, is effectively solved by the introduction of very effective multi-level solvers. However, a universal solution for the other part, critical slowing down of the Markov Chain Monte Carlo (MCMC) simulation towards fine lattice spacings, is still unknown. This critical slowing down manifests itself in long autocorrelation times of the topological charge. The standard algorithm, Hybrid Monte Carlo, cannot move efficiently between different topological sectors. Moving between sectors, however, is essential for minimizing systematic effects in continuum extrapolations, which requires generating ensembles of independent gauge configurations at very fine lattice spacings.
A solution to overcome topological freezing in lattice QCD was originally proposed at CERN by Martin Luescher [1]. The basic idea is to trivialize the gauge fields via a flow equation while keeping essential physical information, like correlations, during the transformation. It turned out, however, that the proposed expansion of the method quickly requires larger loops, which increases computational cost and thus limits the approach to relatively small flow times [2].
An idea to overcome this limitation is the application of normalizing flows to gauge theories. In 2D U(1) theory this can overcome the sampling problem caused by topological freezing [3]. Using the gauge-equivariant structure, the flow can serve as a proposal within an MCMC procedure, which makes the sampling exact. First steps towards the application in SU(3) gauge theories have already been taken; however, a major limitation is the scalability towards larger volumes. This can in principle be overcome by applying the flow within a localized region [4] or by using it within reweighting [5].
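To illustrate why the flow-as-proposal construction stays exact even for an imperfect model, here is a minimal independence-Metropolis sketch on a 1D toy double-well "action", with a fixed Gaussian standing in for a trained (gauge-equivariant) normalizing flow; it is a conceptual sketch only, not lattice code.

```python
# Minimal sketch of using a learned density as an independence-Metropolis
# proposal, which keeps the sampling exact even if the model is imperfect.
# A fixed Gaussian stands in for a trained normalizing flow; the target is
# a 1D toy double-well "action", not a gauge theory.
import numpy as np

rng = np.random.default_rng(0)

def action(phi):                      # toy double-well action S(phi)
    return (phi ** 2 - 1.0) ** 2

def log_target(phi):                  # log p(phi) up to a constant
    return -action(phi)

# Stand-in for the flow: a broad Gaussian with tractable density and sampling.
def proposal_sample():
    return rng.normal(0.0, 1.5)

def proposal_logpdf(phi):
    return -0.5 * (phi / 1.5) ** 2    # up to a constant, which cancels

phi = 0.0
chain, accepted = [], 0
for _ in range(20000):
    phi_new = proposal_sample()
    # Metropolis-Hastings acceptance for an independence proposal:
    # A = min(1, p(phi') q(phi) / (p(phi) q(phi')))
    log_a = (log_target(phi_new) - log_target(phi)
             + proposal_logpdf(phi) - proposal_logpdf(phi_new))
    if np.log(rng.uniform()) < log_a:
        phi, accepted = phi_new, accepted + 1
    chain.append(phi)

print("acceptance rate:", accepted / len(chain))
print("<phi^2> =", np.mean(np.array(chain) ** 2))
```

A better proposal density (i.e. a better-trained flow) only increases the acceptance rate; it never biases the target distribution, which is the key property exploited in [3-5].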
Generative models that are equivariant under symmetries are also applied in other fields of HEP and are in particular useful for speeding up computations like detector simulation [6]. Moreover, the method has large potential to improve various applications in lattice computations, such as contour deformations that mitigate signal-to-noise or sign problems in measurements [5,7].
We propose to support research on generative models in lattice gauge theories at TH through a fellow, in particular by developing software solutions (an efficient open-source package for SU(3) is currently missing) and by further understanding how symmetries like the hypercubic group H(4) can be included in the models.
[1] Trivializing maps, the Wilson flow and the HMC algorithm, Martin Luscher, Commun.Math.Phys. 293 (2010), 899-919
[2] Learning trivializing gradient flows for lattice gauge theories, S. Bacchio et al., Phys.Rev.D 107 (2023) 5, L051504
[3] Equivariant flow-based sampling for lattice gauge theory G. Kanwar, et al., Phys.Rev.Lett. 125 (2020) 12, 121601
[4] Tackling critical slowing down using global correction steps with equivariant flows: the case of the Schwinger model, J. Finkenrath, e-Print: 2201.02216 [hep-lat]
[5] Applications of flow models to the generation of correlated lattice QCD ensembles, R. Abbott et al., Phys.Rev.D 109 (2024) 9, 094514
[6] Novel approach for computing gradients of physical observables, S. Bacchio, Phys.Rev.D 108 (2023) 9, L091508
[7] Machine learning and LHC event generation, Anja Butter et al., SciPost Phys. 14 (2023) 4, 079
Speaker: Jacob Friedrich Finkenrath (CERN) -
43
Building a TH group “Physics for AI”
Artificial Intelligence is emerging as a new paradigm of science with an impact on fundamental research. AI applications have the potential to transform the research process in High Energy Physics, from data processing to simulation, and from theoretical exploration to the design and operation of detectors and accelerators. With the advent of large language models it became clear that AI models benefit from being “physics-aware” and from respecting symmetries by design. Applying its expertise in fundamental physics, CERN TH can take part in shaping this process by stimulating research on advanced algorithms and applications. Taking an active role in this process, however, requires funding from external resources, such as the RCS AI initiative.
We propose a new staff position that combines, and can act as an anchor point for, the other TH RCS-AI submissions from the various groups; this includes support for fellows in the fields of lattice QCD, formal theory, heavy-ion physics and cosmology, as well as the visitor and seminar program to strengthen AI competences within the QCD group.
Such a newly established AI group will create a hub for strengthening interconnections within TH as well as between TH and groups in EP and IT. The AI staff member will be embedded within the HEP theory community, which has an active research program on machine learning techniques. Possible research areas include:
- enabling high-dimensional parameter optimization, tackling numerically challenging lattice computations, or reaching high precision in loop-integral calculations;
- accelerating simulations with generative models, surrogate modelling, and enhanced MC methods;
- event reconstruction, with algorithms spanning from tracking to particle flow;
- solving differential equations and symbolic regression related to fundamental laws;
- formal and fundamental understanding of AI techniques.
Such a group will foster AI competences within TH and stimulate contributions to physics-aware AI, with novel symmetry-aware architectures and learning procedures as well as AI-driven exploration of the various HEP frontiers.
Speaker: Jacob Friedrich Finkenrath (CERN) -
44
Neural network quantum states for the CERN nuclear physics program
CERN continues to stand as a world-leading laboratory in nuclear research. At the forefront of this effort is the ISOLDE facility, renowned for its cutting-edge studies of nuclear structure, alongside the Large Hadron Collider (LHC) with its program on high-energy nuclear collisions devoted to the exploration of the quark-gluon plasma, which is planned to continue at least until the end of LHC Run 4 (~2032).
Over the past decade, the range of nuclear species employed in collider experiments has expanded significantly. Data from these collisions reveal that observables do not vary monotonically with the mass of the colliding isotopes, signaling intricate underlying physics rooted in the quantum many-body structure of their ground-state wave functions. These findings have gradually led to the realization that the areas of low-energy and high-energy nuclear physics are not as disconnected as previously thought [1]. For instance, fully exploiting the scientific potential of recent light-ion collisions at the LHC ([2]) requires a systematic understanding of nuclei spanning from 16-O to 208-Pb. This highlights the need for a cohesive framework in which high-energy collision simulations are both informed by, and consistent with, the properties of the underlying nuclear ground states. Such a framework would, for the first time, connect low- and high-energy experiments, enabling cross validations. Importantly, it will also provide a robust platform for uncertainty quantification in the properties of the quark-gluon plasma from HL-LHC data, allowing us at the same time to improve model building in nuclear structure.
Recent breakthroughs in the combination of neural networks with variational Monte Carlo methods for solving the nuclear quantum many-body problem [3,4] offer a uniquely powerful tool to drive this program forward. In contrast to more conventional quantum many-body methods, these neural-network quantum states enable a multi-scale description of atomic nuclei, including short-range correlations, alpha clustering, and collective modes. In addition, they provide accurate nuclear wavefunction “snapshots” in terms of quantum Monte Carlo configurations, which can be directly fed into high-energy collision simulations. This approach allows precise modeling of the initial state of the colliding nuclei and, consequently, more reliable extraction of quark-gluon plasma properties. At the same time, neural network quantum states can be extended to exotic species and notably to hyper-nuclei [5] — the latest frontier in nuclear structure research and of key relevance for nuclear astrophysics, including neutron star physics — which are planned to be studied at ISOLDE.
The project we propose aims to initiate such a program by gathering at CERN-TH the relevant expertise on these subjects and by establishing the necessary computational infrastructure on CERN resources.
Bibliography:
[1] Imaging the initial condition of heavy-ion collisions and nuclear structure across the nuclide chart, Nucl.Sci.Tech. 35 (2024) 12, 220
[2] First-ever collisions of oxygen at the LHC https://home.cern/news/news/accelerators/first-ever-collisions-oxygen-lhc
[3] Variational Monte Carlo Calculations of A≤4 Nuclei with an Artificial Neural-Network Correlator Ansatz, Phys. Rev. Lett. 127, 022502 (2021)
[4] Distilling the Essential Elements of Nuclear Binding via Neural-Network Quantum States, Phys. Rev. Lett. 133, 142501 (2024)
[5] Hypernuclei with Neural Network Quantum States, https://arxiv.org/pdf/2507.16994
Speaker: Giuliano Giacalone -
45
ML based global fit for gravitational wave detection with LISA
The next generation of gravitational wave (GW) interferometers, in particular the Laser Interferometer Space Antenna (LISA), will revolutionize our ability to explore the Universe through GWs. However, this increased sensitivity comes with a significant data analysis challenge. While current ground-based GW detectors are noise dominated, with sparse signals from merging compact objects, the next generation of detectors will be signal dominated, with thousands of resolvable binary systems as well as astrophysical (and possibly cosmological) stochastic gravitational wave backgrounds (SGWBs) within instrument reach. This calls for a paradigm change in the data analysis [1].
Based on exploratory works with J. Alvey and M. Pieroni [2,3], the goal of this project is to build a global fit framework for LISA data analysis based on simulation-based inference (SBI). The simulator is modular, containing different GW sources as well as the instrument response. The SBI framework is based on the Truncated Marginal Neural Ratio Estimation (TMNRE) algorithm. All code will be made publicly available at [https://github.com/peregrine-gw/saqqara].
The exploratory works [2,3] have included instrument noise models, stochastic backgrounds and sub-threshold transient sources, leveraging the key advantages of neural networks over more traditional approaches, in particular the fact that these methods are likelihood-free and the intrinsic marginalisation over nuisance parameters. The goal of this project is to extend this framework to a realistic setup, including all expected relevant GW sources for LISA, and to benchmark and test the pipelines in the LISA Data Challenges.
As members of the LISA consortium, M. Pieroni and V. Domcke will be able to follow up on the implementation of the project results in the data analysis pipelines. J. Alvey, an established young researcher in the field of machine learning techniques in astroparticle physics, will play a key role in coordinating the project.
[1] Cornish, Crowder, Phys. Rev. D (72) 043005 (2005). Littenberg, Cornish, Phys. Rev. D (107) 063001 (2018)
[2] Alvey, Bhardwaj, Domcke, Pieroni, Weniger, Phys. Rev. D (109) 083008 (2024)
[3] Alvey, Bhardwaj, Domcke, Pieroni, Weniger, Phys. Rev. D (111) 102008 (2025)
Speaker: Valerie Domcke (CERN) -
46
Optimized and scalable statistical analysis and numerical optimization with autograd and ML frameworks
High-performance maximum-likelihood fitting and associated statistical analysis tools were initially based on TensorFlow 1 (combinetf) and used for precision measurements (CMS W helicity, W mass). They have now been re-written with TensorFlow 2 (RABBIT) and are being used for the in-progress CMS alphaS measurement, among others. The older combinetf has also been used for some of the analyses in the FCC feasibility study.
Speaker: Josh Bendavid (CERN) -
47
Computationally efficient and smooth systematic variations from experimental response distributions with normalizing flows
Systematic variations based on explicit variation of kinematic quantities (e.g. shifting jet pT for Jet Energy Scale variations) are computationally inefficient in analysis workflows and can introduce additional statistical fluctuations in predictions. In cases where the probability density of the corresponding response distribution is known, these variations can be replaced with event weights, which are more convenient and efficient in analysis workflows and have better statistical properties. This technique was already successfully applied to both the muon scale and resolution uncertainties and the hadronic recoil uncertainties in the recent CMS W mass measurement, using multivariate splines to parameterize the response distributions. Normalizing flows offer a general, ML-based approach to parameterizing these response distributions in higher dimensions and/or with more conditionality, facilitating this technique across a wide range of physics objects and associated uncertainties, with possible synergies with fast simulation methods such as those used for FlashSim.
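A minimal numerical sketch of the event-weight idea follows, with analytic Gaussian response densities standing in for densities that would in practice be learned by a conditional normalizing flow; the sample, threshold and shift values are illustrative only.

```python
# Minimal sketch of replacing an explicit kinematic shift by per-event weights.
# Analytic Gaussian response densities stand in for densities that would in
# practice be learned with a (conditional) normalizing flow.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

# Toy sample: true jet pT and an observed response r = pT_reco / pT_true.
pt_true = rng.uniform(30.0, 200.0, size=100_000)
r = rng.normal(1.00, 0.10, size=pt_true.size)
pt_reco = r * pt_true

# Nominal and "scale up" response models (stand-ins for learned densities p(r | pT)).
p_nom = norm(loc=1.00, scale=0.10)
p_up = norm(loc=1.02, scale=0.10)

# Event weights: ratio of varied to nominal response density at the observed r.
w_up = p_up.pdf(r) / p_nom.pdf(r)

# Compare the weighted nominal sample with an explicitly shifted sample for a
# simple observable, e.g. the fraction of events above a pT threshold.
threshold = 100.0
pt_reco_shifted = rng.normal(1.02, 0.10, size=pt_true.size) * pt_true
frac_weighted = np.average(pt_reco > threshold, weights=w_up)
frac_shifted = np.mean(pt_reco_shifted > threshold)
print(f"weighted nominal: {frac_weighted:.4f}  explicitly shifted: {frac_shifted:.4f}")
```

The weighted nominal sample reproduces the varied prediction without regenerating or re-clustering any events, which is the efficiency gain described above.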
Speaker: Josh Bendavid (CERN) -
48
AI-Ready C++ code for simulation-driven analysis in HEP and beyond
To get the most information out of the LHC dataset, a physics analysis has to be globally optimized from event generation to statistical inference.
Automatic Differentiation (AD) is one of the cornerstones of ML/AI. Applying AD to simulation and analysis codes is also very appealing for optimizing next-generation HEP analysis. For example, augmenting simulated data with gradients helps to extract more information from the simulation, which is very expensive in terms of compute power, especially in the HL-LHC era.
Another example of applying ML tools to HEP analysis is to learn analytical surrogate models for simulated data, for use in Simulation-Based Inference (SBI).
To maximize the potential of such approaches, we need to bridge the gap between traditional HEP research codes and modern ML techniques to build sustainable, transparent, and reproducible ML-driven physics analysis.
Mainstream ML tools such as PyTorch, TensorFlow, and JAX are primarily developed for commercial use cases. These tools are often not well-suited for HEP applications: they assume Python-based workflows, are difficult to integrate with established C++ codebases, and are not compatible with long-term support models characterising HEP software. Moreover, the governance and development priorities of these tools rarely align with those of the research community, and such priorities are difficult to influence.
Continued reliance on external, industry-driven ML tools introduces long-term risks. These include loss of technical control, limited flexibility to adapt tools for novel research directions, and potential incompatibility with established HEP software environments.
For example, we can already anticipate from LLM development that the boundary between model training and deployment is becoming increasingly blurred: the most advanced models continue training in production. As the aforementioned Python libraries are difficult to integrate into our C++ production software, we risk being unable to implement such advanced online/continuous learning techniques.
CERN and the broader HEP community should invest in in-house software to mitigate these risks.
ROOT already hosts key technologies that bridge the gap between HEP codes and the Python AI ecosystem, and also a solid Automatic Differentiation engine:
* Cling: a just-in-time C++ interpreter, enabling dynamic execution and integration of compiled and interactive code.
* Clad: an AD engine for C++, already integrated with frameworks like RooFit and CMS Combine.
* Python bindings: enabling hybrid C++/Python workflows for increased flexibility.
This means the remaining cost of turning ROOT into a framework for differentiable analyses that natively supports deep learning, and that also integrates well with existing HEP codes written in C++, is low.
The technological cornerstones listed above have demonstrated initial success:
* Differentiation through complex, branching C++ code (e.g., RooFit, CMS Combine). [1]
* Early prototypes supporting autodiff in stochastic simulations, such as Geant4 (via HepEmShow). [2]
* Modernization of ROOT’s statistical tools for compatibility with ML surrogate models for SBI. [3]
We propose targeted R&D to consolidate these capabilities into a coherent, ML-compatible infrastructure for simulation-based analysis and beyond. This includes:
- Enabling AD for ROOT classes and functions that are relevant for differentiable analysis and for implementing deep learning models in C++ (e.g. linear algebra, vector operations, LorentzVectors).
- Supporting stochastic models, random number generators, and uncertainty propagation in autodiff pipelines.
- Extending AD support to C++ code bases beyond ROOT (e.g. Delphes for fast simulation and detector studies).
- Building a complete differentiable physics analysis demonstrator, from simulation to statistical analysis, using ROOT plus external simulator/reconstruction software.
Ultimately, this should lay the groundwork for future analyses that tightly integrate ML without relying on industry codes and that run in stable production environments, both online and offline.
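As a conceptual illustration of such a differentiable analysis chain (written here in Python with JAX purely as a stand-in for the Clad-based C++ tooling the proposal targets), the sketch below propagates the gradient of an analysis-level yield through a toy smearing step and a smooth selection; all parameter names and numbers are illustrative.

```python
# Conceptual sketch: gradient of an analysis-level figure of merit with respect
# to an upstream "detector" parameter, propagated through a toy simulation +
# selection chain. JAX stands in for a C++/Clad implementation.
import jax
import jax.numpy as jnp

pt_true = jax.random.uniform(jax.random.PRNGKey(0), (50_000,), minval=20.0, maxval=120.0)
noise = jax.random.normal(jax.random.PRNGKey(1), (50_000,))

def expected_yield(resolution):
    # Toy "simulation": smear true pT with a resolution parameter; fixing the
    # noise realisation (reparameterization) keeps the sampling differentiable.
    pt_reco = pt_true * (1.0 + resolution * noise)
    # Smooth selection instead of a hard cut, so the analysis is differentiable.
    weights = jax.nn.sigmoid((pt_reco - 50.0) / 2.0)
    return weights.sum()

grad_fn = jax.grad(expected_yield)
print("yield:", float(expected_yield(0.10)))
print("d(yield)/d(resolution):", float(grad_fn(0.10)))
```

The point of the proposed R&D is to make this pattern available natively in the C++ production stack (ROOT, Geant4-like simulation, statistical tools) rather than only in Python frameworks.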
[1] https://indico.cern.ch/event/1291157/contributions/5889615/attachments/2900877/5087038/roofit_ichep_2024.pdf
[2] Aehle, Max, et al. "Optimization using pathwise algorithmic derivatives of electromagnetic shower simulations." Computer Physics Communications 309 (2025): 109491.
[3] https://indico.cern.ch/event/1338689/contributions/6016195/attachments/2953278/5192036/RooFit%20CHEP%202024-2.pdf
Speaker: Jonas Rembser (CERN) -
49
Modeling of Machine Induced backgrounds for LHCb
Modelling of MIB (Machine Induced Backgrounds) to provide particles originating from losses entering the LHCb cavern. Beam losses of primary concern are beam halo on tertiary collimators close to the IP and beam-gas interactions in the tunnel within a few hundred meters from the IP. This requires the determination of the beam losses based on beam optics and vacuum conditions through dedicated software. It is then followed by a separate particle transport simulation where particles originating from the beam interactions in given elements/gas are transported from the location of the losses to the entrance of the LHCb cavern. This step requires the modeling of the Long Straight Sections of the accelerator (a few hundred meters) and is very CPU intensive. The results consist of sparse particle distributions over the cross section of a few square meters of an interface plane between the tunnel and the experimental cavern, covering a wide range of energies. These particles are finally sampled to provide them as ‘generated events’ to the Geant4-based LHCb Simulation framework, so that they can be treated like any other available generator and their impact on the detector as well as the trigger and reconstruction can be evaluated. They can also be combined with collision events for a more complete evaluation. The amount of independent ‘generator events’ for this source is limited by the size of the sample after transport. Integrated distributions in terms of occupancy and dose levels can be well represented, while effects on single-event quantities, as used in reconstruction, are limited by the diversity of the input sample.
It would be very interesting to fill in the sparse sampling in order to increase the statistics of MIB and, based on fully simulated losses, to obtain particles entering the caverns with much higher statistics. ML diffusion models seem a good candidate in this context.
This could be of interest to other LHC experiments and also to the FCC project.
We are interested in this even if we don't have anyone working on this at the moment.
Speaker: Michał Mazurek (National Centre for Nuclear Research (PL)) -
50
Machine induced background in detector simulation
This project proposes to study replacing the solution currently implemented in LHCb, where beam-induced background (BIB) particles are sampled from very large FLUKA output files, with a modern machine learning–based generator. Instead of relying on repeated access to stored datasets, a trained generative model would learn the distributions of BIB particles and produce new samples on demand, conditioned on relevant LHC machine parameters. This would drastically reduce storage requirements and file input/output overhead while maintaining physical accuracy.
LHCb is currently preparing for the next Run and updating this simulation. The concept of BIB simulation is also not particularly linked to a single experiment, and a potential interface between FLUKA BIB and Geant4 signal simulation could be explored in a much wider context.
Speaker: Peter McKeown (CERN)
-
39
-
15:30
-
Experimental Technologies 500/1-001 - Main Auditorium
-
51
Simplicial attention mechanism for physics objects
While the usual attention mechanism successfully introduces edge features that allow efficient computation of the inter-connection between two elements, one could consider multi-object connections via a simplex system, which would generalize the concept of attention to higher dimensions, allowing a “hyper-graph”-like attention model; see, e.g., arXiv:2309.02138.
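A minimal sketch of what attention over 2-simplices (node triplets) could look like is given below; it is illustrative only and not the architecture of arXiv:2309.02138, with PyTorch assumed and all dimensions chosen arbitrarily.

```python
# Minimal sketch of "simplicial" attention over node triplets (2-simplices):
# each physics object attends over all triplets of objects, generalizing
# edge-based attention to higher-order interactions. Illustrative only.
import itertools
import torch

torch.manual_seed(0)
N, d = 8, 16                              # number of physics objects, feature dim
x = torch.randn(N, d)                     # per-object features

triplets = torch.tensor(list(itertools.combinations(range(N), 3)))  # (T, 3)
trip_feats = x[triplets].reshape(len(triplets), 3 * d)              # (T, 3d)

q_proj = torch.nn.Linear(d, d)            # query from each object
k_proj = torch.nn.Linear(3 * d, d)        # key from each triplet
v_proj = torch.nn.Linear(3 * d, d)        # value (message) from each triplet

q = q_proj(x)                             # (N, d)
k = k_proj(trip_feats)                    # (T, d)
v = v_proj(trip_feats)                    # (T, d)

scores = q @ k.T / d ** 0.5               # (N, T) object-to-triplet attention logits
attn = torch.softmax(scores, dim=-1)      # normalize over triplets
out = attn @ v                            # (N, d) updated object features

print(out.shape)                          # torch.Size([8, 16])
```

A production model would restrict or sparsify the set of simplices and learn permutation-invariant triplet features, but the extension of the query/key/value pattern from edges to higher-order simplices is the same.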
Speaker: Sebastian Wuchterl (CERN) -
52
Robust and Efficient Topological Deep Learning for Event Reconstruction
Event reconstruction at the HL-LHC requires combining hits into clusters and linking them with tracks to form higher-level objects. This process is inherently multi-step and local, which risks globally suboptimal results when pile-up is high or when showers overlap. Current machine learning methods, like graph neural networks and transformers, already exploit relational structures, with recent extensions toward hypergraphs (where one edge can link more than two nodes) and geometric-invariant models (which do not depend on the orientation or position of the data). However, event data contain more complex relationships.
Topological Deep Learning (TDL) generalizes these approaches by learning directly on the structure created by the input data. By creating a topological map that stores how connected components, loops and holes evolve from local to global scales, imprecisions in the datasets are ignored and only globally relevant structure is taken into account.
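As an illustration of the kind of topological summary involved, the sketch below computes the persistence diagram of a toy 2D hit pattern (a noisy ring) with the ripser package (an assumed dependency; any persistent-homology library would serve); the single long-lived H1 feature captures the ring while being insensitive to small hit-level perturbations.

```python
# Minimal sketch of the "topological map" TDL builds on: a persistence diagram
# of a toy 2D hit pattern (a noisy ring), computed with the ripser package
# (assumed available).
import numpy as np
from ripser import ripser

rng = np.random.default_rng(0)
angles = rng.uniform(0.0, 2.0 * np.pi, 200)
hits = np.column_stack([np.cos(angles), np.sin(angles)])
hits += rng.normal(scale=0.05, size=hits.shape)          # detector-like noise

diagrams = ripser(hits, maxdim=1)["dgms"]                # H0 and H1 diagrams
h1 = diagrams[1]                                         # loops: (birth, death) pairs
persistence = h1[:, 1] - h1[:, 0]
print("most persistent loop lives for", persistence.max())
# One long-lived H1 feature signals the ring structure; short-lived features
# correspond to noise and can be ignored, which is the robustness TDL exploits.
```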
Our project aims to use these maps to correct suboptimal decisions in the reconstruction pipeline. This is done by spotting inconsistencies in the maps at multiple steps of the reconstruction process. By iteratively adapting the suboptimal decisions, a globally optimal reconstruction should be reached.
Speaker: Christine Zeh (Vienna University of Technology (AT)) -
53
Accelerating AI with in-memory computing devices
Future generations of computing systems must achieve higher processing speed and energy efficiency to support rapidly growing workloads under strict environmental constraints, while lowering the barrier to incorporating emerging device technologies. To address this, domain-specific hardware accelerators have gained traction, with in-memory computing (IMC) emerging as a promising paradigm. By co-locating memory and computation, IMC reduces costly data movement and significantly improves energy efficiency. Digital IMC implementations provide precision and compatibility with existing design flows, while analog IMC offers the potential for greater energy savings and scalability by performing operations such as multiplication and accumulation directly in the analog domain. These complementary strengths motivate a unified design framework that can explore both approaches. In this context, hls4ml, a high-level synthesis tool originally developed for mapping machine learning algorithms onto FPGA and ASIC accelerators, provides a natural platform to investigate hybrid integration. Extending hls4ml to support digital and analog IMC models would enable systematic evaluation of trade-offs in accuracy, performance, and energy efficiency for machine learning accelerators. Such integration would open pathways toward next-generation, domain-specialized hardware capable of meeting computational demands sustainably at future colliders.
Speaker: Maurizio Pierini (CERN) -
54
Tensor Networks for Particle Physics
Develop applications based on tensor networks for LHC tasks. As part of this effort, develop the tn4ml library, to train TNs with tools used for deep learning applications.
Speakers: Ms Ema Puljak (The Barcelona Institute of Science and Technology (BIST) (ES)), Maurizio Pierini (CERN) -
55
QML at LHCb
The advent of the High-Luminosity LHC presents unprecedented computational challenges for the LHCb experiment, pushing the limits of classical algorithms in areas such as real-time data filtering, complex track reconstruction, and multidimensional analysis. To address this we propose a dedicated initiative to expand upon LHCb’s pioneering application of Quantum Machine Learning (QML) for b-jet charge identification and pattern recognition. Our approach will focus on further developing novel QML solutions for HEP challenges such as particle tracking, anomaly detection, and complex optimization, aiming to unlock a "quantum advantage" and enhance physics discovery potential. This project will solidify LHCb's and CERN's leadership role in developing applications for quantum computing in experimental physics, an extremely fast-developing field.
Speakers: Jacco Andreas De Vries (Nikhef National institute for subatomic physics (NL)), Dr Nicole Skidmore (University of Warwick) -
56
Hybrid Monte Carlo event generators
Monte Carlo (MC) event generators are indispensable in High-Energy Physics (HEP) for simulating scattering processes and sampling the multidimensional phase space according to the differential cross section dσ. Since dσ is not known analytically in full generality, event generators must determine both the local structure of the integrand and the global phase space distribution through adaptive numerical methods. This poses severe computational challenges, especially for high-multiplicity final states and higher-order perturbative corrections foreseen in the High-Luminosity collider era.
On the technological side, recent works have explored the use of Quantum Computing (QC) in HEP [Di Meglio et al., Quantum computing for high-energy physics: State of the art and challenges], and specifically for process and phase space integration. Hybrid quantum-classical algorithms have been proposed for sampling, optimization, and multidimensional integration [Quantum algorithms for multivariate Monte Carlo estimation, arXiv:2107.03410; Quantum integration of elementary particle processes, Phys. Lett. B 137228 (2022); Loop Feynman integration on a quantum computer, Phys. Rev. D 110, 074031 (2024); Quantum integration of decay rates at second order in perturbation theory, Quantum Sci. Technol. (2024), 10.1088/2058-9565/ada9c5; Unlocking Multi-Dimensional Integration with Quantum Adaptive Importance Sampling, arXiv:2506.19965; Quantum Chebyshev Probabilistic Models for Fragmentation Functions, arXiv:2503.16073].
Building on these preliminary results, we propose a systematic integration of MC, AI, and QC:
(i) QC-enhanced sampling for accelerating multidimensional integration;
(ii) AI-guided orchestration for adaptive resource allocation between classical and quantum backends (see the emerging field of distributed hybrid algorithms [Distributed Quantum Circuit Cutting for Hybrid Quantum-Classical High-Performance Computing, arXiv:2505.01184]);
(iii) hybrid implementations where AI controls the interface between deterministic and stochastic modules, and supports the choice of quantum models [Enhanced feature encoding and classification on distributed quantum hardware, Mach. Learn.: Sci. Technol., 10.1088/2632-2153/adb4bc].
This project aims to establish a scalable framework for AI-orchestrated hybrid event generation, with the long-term objective of improving both computational performance and physical fidelity in collider simulations. By combining numerical methods, quantum algorithms, and AI-driven control, this initiative seeks to advance the frontier of computational particle physics and enable precision predictions at future high-energy experiments.
Speakers: Dr Michele Grossi (CERN), Dr Sofia Vallecorsa (CERN) -
57
Warm-starting Variational Quantum Algorithms via Pre-Training on Classically Simulable Structures
Variational quantum algorithms (VQAs) offer a promising approach for near-term quantum devices but often suffer from trainability issues such as barren plateaus. While certain VQAs can avoid these problems, they are typically classically simulable and thus of limited quantum advantage. This project explores the use of pre-training as a warm-starting strategy for VQAs that are not classically simulable. The key idea is to use auxiliary tasks inspired by quantum information theory (e.g., minimizing von Neumann entropy) to obtain an initialization that improves trainability. In some cases, this may involve partitioning the circuit into classically simulable and non-simulable subparts, but more generally, the auxiliary task itself may render the model or part of it classically simulable by definition. After pre-training, the full quantum model will be assembled and fine-tuned on the target dataset using quantum hardware. The project will investigate the effectiveness of such pre-training strategies, the design of suitable auxiliary tasks, and connections to existing warm-start techniques.
Speaker: Jogi Suda Neto (University of Alabama (US))
-
51
-
1
-
-
AI for metadata analysis 40/S2-A01 - Salle Anderson
-
58
Enhancing data quality control with AI in ALICE 40/S2-A01 - Salle Anderson
Data Quality Control (QC) in ALICE encompasses both the online Data Quality Monitoring (DQM) and the offline Quality Assurance (QA), running synchronously and asynchronously with the data taking.
The goal of this exploratory project is to enhance and expand the ALICE QC framework with Machine Learning and Artificial Intelligence techniques. Other LHC experiments have already started studying this topic and, to some extent, using AI in their DQM and QA. We wish to coordinate and collaborate with them to benefit from their experience while contributing significantly to the field.
In particular, we want:
- to help identify abnormal conditions quicker, e.g. using autoencoders or CNNs;
- to prevent these conditions from happening by detecting early signs or changing conditions, e.g. using time-series methods such as RNNs;
- to support the shift crew in better understanding the problems by providing context and explanations, e.g. using saliency maps;
- to consider having an agentic feedback loop to help with operations;
- to apply similar techniques offline where applicable;
- to help classify reconstructed data offline for their use during physics analysis.
The expected results of the project and the associated development include, but are not limited to:
- a comprehensive overview of the use of ML/AI in the other 3 LHC experiments and beyond, as well as its current use in ALICE in other fields,
- the identification of the most promising areas where to apply these techniques in QC,
- the collection of users' needs and requirements, in particular the Run Coordination and the physicists,
- the development of state-of-the-art solutions, as well as novel ideas and techniques.
- the integration of these solutions into the current system,
- the demonstration that these solutions function as expected using data collected during Run 3 (2022-2026) and in the offline reconstruction, in view of using them for Run 4.
The data to be used to train the algorithms already exists in the form of the QC data (histograms and graphs), their associated quality, the system monitoring data, the runs quality and labels and the Bookkeeping data.
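As an illustration of the autoencoder option mentioned in the list above, the following minimal sketch trains an autoencoder on synthetic "good" monitoring histograms and flags deviations via the reconstruction error; PyTorch is assumed, and all shapes, shifts and architectures are placeholders rather than ALICE QC specifics.

```python
# Minimal sketch of the autoencoder idea for QC: train on "good" monitoring
# histograms and flag histograms with large reconstruction error. Synthetic
# Gaussian-shaped histograms stand in for real QC objects.
import numpy as np
import torch

rng = np.random.default_rng(0)
n_bins = 64

def make_histogram(shift=0.0):
    centers = np.linspace(-3, 3, n_bins)
    h = np.exp(-0.5 * (centers - shift) ** 2) + rng.normal(0, 0.02, n_bins)
    return h / h.sum()

good = torch.tensor(np.array([make_histogram() for _ in range(500)]), dtype=torch.float32)

model = torch.nn.Sequential(
    torch.nn.Linear(n_bins, 16), torch.nn.ReLU(),
    torch.nn.Linear(16, 4),                      # bottleneck
    torch.nn.Linear(4, 16), torch.nn.ReLU(),
    torch.nn.Linear(16, n_bins),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(300):
    opt.zero_grad()
    loss = torch.nn.functional.mse_loss(model(good), good)
    loss.backward()
    opt.step()

def anomaly_score(hist):
    x = torch.tensor(hist, dtype=torch.float32)
    with torch.no_grad():
        return float(torch.nn.functional.mse_loss(model(x), x))

print("good run score:", anomaly_score(make_histogram()))
print("shifted-peak run score:", anomaly_score(make_histogram(shift=1.5)))
```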
Speaker: Mr Barthelemy Von Haller (CERN) -
-
59
Automated data quality monitoring production and commissioning 40/S2-A01 - Salle Anderson
Support for the production and commissioning of HL-LHC detectors. Provide tools to reduce the amount of time needed to check correct functioning by having machine-learned algorithms parse the data, building on the successful experience with anomaly detection for HGCAL sensors. Use for raw data, but also for configuration data, at different levels of integration and for different steps in trimming the front-end electronics. An example application is HGCAL, where there are O(1e5) ASICs of three types whose configuration data and raw data need to be processed to establish whether or not the detector components at different stages of integration (from single modules, to cassettes, to multiple cassettes, to sectors, to full endcaps) are working properly. Potential architectures to be employed include VAEs and other solutions proposed for anomaly detection.
Speaker: Pedro Vieira De Castro Ferreira Da Silva (CERN) -
60
Intelligent Observability: Applying AI Techniques to the Telemetry and Operations of the LHC computing facilities 40/S2-A01 - Salle Anderson
Computing infrastructures for LHC experiments, including their online high-level-trigger (HLT) farms and offline reconstruction facilities, generate a massive volume of complex telemetry, with logs and metadata often exceeding 100 TB per day.
Managing these distributed systems at scale already reveals the limitations of traditional monitoring. Static, rule-based alerting is insufficient for diagnosing emergent, complex failure modes that arise from the intricate interactions between system components, and the latency to diagnose a failure in the system ranges from minutes to hours.
In view of the high-luminosity data-taking at the LHC, with ambitious physics programs driving the upgrade of these computing systems, these challenges are expected to manifest at an even larger scale due to the increased complexity required.
This motivates a paradigm shift from monitoring to observability, i.e. inferring a system’s internal state from its core telemetry outputs: metrics, logs, and traces. We propose building an AI-driven framework to automate the analysis of this high-dimensional data. This project aims to speed up incident triage, improve resource utilization, and transition system management from a reactive to a proactive discipline.
While focusing on delivering concrete tools immediately usable in production, the core of this effort will develop on the frontier of interesting R&D fields such as:
- Automated Anomaly Detection: applying unsupervised learning on parsed and vectorised logs to establish dynamic performance baselines and detect subtle anomalies invisible to static rules.
- Predictive Analytics: using historical data to forecast resource demand and potential failures, enabling predictive maintenance and the dynamic evolution of alert rules.
- AI-Augmented Root Cause Analysis: correlating signals across disparate telemetry sources to automate root cause identification and providing operators with LLM-generated incident summaries and mitigation guides.
Our goal is to minimise the “mean time to detect” and “mean time to recover” by providing operators with an enhanced and enriched overview of the state of the system and applications as a whole. Key deliverables will include reference anomaly detection pipelines for HPC workloads, contributions to open-source log analysis tools, and operational dashboards integrating these AI-driven insights for active use in data-taking operations.
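As a minimal illustration of the anomaly-detection theme above, the sketch below vectorises log lines with TF-IDF and scores them with an Isolation Forest; a production pipeline would first parse logs into templates, and the example messages and library choice (scikit-learn) are assumptions made purely for illustration.

```python
# Minimal sketch: vectorise log lines with TF-IDF and score them with an
# Isolation Forest trained on baseline (normal) operation. Illustrative only.
from sklearn.ensemble import IsolationForest
from sklearn.feature_extraction.text import TfidfVectorizer

baseline_logs = [
    "job 1842 finished successfully on node hlt-023",
    "job 1843 finished successfully on node hlt-024",
    "heartbeat ok from node hlt-025",
    "job 1844 finished successfully on node hlt-026",
] * 50

new_logs = [
    "job 2001 finished successfully on node hlt-031",
    "fatal: filesystem /data read-only, aborting transfer on node hlt-017",
]

vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(baseline_logs)
model = IsolationForest(random_state=0).fit(X_train)

scores = model.decision_function(vectorizer.transform(new_logs))
for line, score in zip(new_logs, scores):
    print(f"{score:+.3f}  {line}")   # lower score = more anomalous
```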
Speakers: Matteo Concas (CERN), Vasco Barroso (CERN) -
61
AI for experiment control 40/S2-A01 - Salle Anderson
The AE$\bar{g}$IS experiment at CERN's Antiproton Decelerator performs research using antiprotons and positrons, utilizing detectors and methods commonly used in atomic and nuclear physics experiments [1,2,3,4].
Designed for flexibility and scalability in the number of interconnected hardware components, the AE$\bar{g}$IS control system has been implemented and proven to work fully autonomously and unsupervised [5].
Through a novel online analysis framework, deployed within the CERN infrastructure, feedback on environmental, detector and control system data is obtained at run time. Using these feedback loops, the system has demonstrated the capability to autonomously perform high-dimensional optimization tasks [6].
Through the integration of large sequence models (LSMs) into the various user interactions with the control system, the testing and execution of scripted experiments, as well as the analysis of the produced data, we identify the opportunity to reach an unprecedented level of automation in an antimatter experiment. The AE$\bar{g}$IS apparatus with its control system could provide an excellent environment for sophisticated physics-informed models to autonomously test and report on physics experiments. By that, the apparatus would provide a real-world benchmarking platform for model developers and a fertile ground for exploring the role of LSMs in physics research environments.
Towards this goal, the following development steps have to be achieved:
- AI Hardware Integrator: Given specification sheets and a prompt, codes the microservice to integrate a new device into the control system.
- AI scheduler: Given a user prompt, selects the right experiment, its parameters and method (e.g. scan or optimize).
- AI Safety Engineer: Checks an experiment for unsafe operations (e.g. vacuum pollution, wrong trigger sequence, over-voltage).
- Error Manager: Constant and triggered log analysis leading to error identification, followed by a user notification and a solution proposal.
- AI documenter: Given user logs, automatically tracks and records changes to the apparatus and updates the wiki.
- AI Run Manager: Monitors system logs as well as detector and environmental data to identify anomalies and notify users about non-automated tasks (e.g. the need to fill liquid helium or to check the functionality of a detector).
- AI Analyst: Performs the analysis given a prompt, log or actually conducted experiment by utilizing the already implemented analysis infrastructure.
- AI Optimizer: Evaluates the performance of different optimization algorithms at run time and over already acquired data.
- AI Experiment Engineer: Given a defined experimental goal, scripts the experiment itself.
- AI Worker: Utilizes the above functionalities as a whole to achieve a user defined goal like performing a novel experiment or a detector optimization.
- AI Researcher: Utilizes the above functionalities as a whole to test its own hypothesis. Also useful to benchmark models.
The AE$\bar{g}$IS apparatus has enabled a wide range of research opportunities [7], but realizing AI-driven advancements will require more dedicated computing resources and personnel.
The development of digital twins, using Geant4, Fluka and CST Studio Suite, enables the production of large amounts of training data as well as the use of world models and simulations prior to experiments.
Given the size of the AE$\bar{g}$IS experiment as well as its operation throughout CERN's LS3, we expect to iterate through the necessary development steps in three to five years, producing valuable insights on innovative control systems and their operation.
[1] M. Berghold et al. (AE$\bar{g}$IS Collaboration), Real-time antiproton annihilation vertexing with sub-micrometer resolution, Science Advances 11, 14 (2025). DOI: 10.1126/sciadv.ads1176
[2] L. Glöggler et al. (AE$\bar{g}$IS Collaboration), Positronium Laser Cooling via the 13S-23P Transition with a Broadband Laser Pulse, Phys. Rev. Lett. 132, 083402 (2024). DOI: 10.1103/PhysRevLett.132.083402
[3] C. Amsler et al. (AE$\bar{g}$IS Collaboration), Pulsed production of antihydrogen, Commun. Phys. 4, 19 (2021). DOI: 10.1038/s42005-020-00494-z
[4] L. Glöggler et al. (AE$\bar{g}$IS Collaboration), High-resolution MCP-TimePix3 imaging/timing detector for antimatter physics, Meas. Sci. Technol. 33, 115105 (2022).
DOI: 10.1088/1361-6501/ac8221
[5] Volponi, M., Huck, S., Caravita, R. et al., CIRCUS: an autonomous control system for antimatter, atomic and quantum physics experiments, EPJ Quantum Technol. 11, 10 (2024). DOI: 10.1140/epjqt/s40507-024-00220-6
[6] Volponi M., Zielinski J., Rauschendorfer T. et al., TALOS (Total Automation of LabVIEW Operations for Science): A framework for autonomous control systems for complex experiments, Rev. Sci. Instrum. 95, 085116 (2024). DOI: 10.1063/5.0196806
[7] Caravita R. (AE$\bar{g}$IS Collaboration), Long-term outlook for the AEgIS experimental program, CERN CDS (2025). DOI: https://cds.cern.ch/record/2931062
Speaker: Tassilo Rauschendorfer (Politecnico di Milano (IT)) -
62
AI-Driven Anomaly Detection and Data Quality Strategies for the AMBER Experiment 40/S2-A01 - Salle Anderson
Efficient online monitoring of the data quality and the detector control system is essential for the smooth operation of any high-energy physics experiment. However, much of this responsibility still relies on manual shifter activity. To reduce workload and increase reliability, we explored artificial intelligence methods that automatically detect unusual patterns in monitoring plots and checklist data from the AMBER experiment. Machine learning techniques—including k-Nearest Neighbors, One-Class SVMs, and DBSCAN—were tested on data from the 2024 run period. All methods consistently identified anomalous behavior, demonstrating their usefulness for flagging potential detector issues in real time.
Building on these results, we plan to deploy an AI-assisted checklist analysis, where adaptive models learn the normal operating states of the experiment—for all subsystems—and highlight deviations, while also suggesting whether an anomaly is due to input error, transient fluctuations, or a possible hardware problem. This additional information shall provide shifters with clear warnings and hints about the underlying issues, and create automated summary reports to ease data-quality assessment and debugging.
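A minimal sketch of this approach, using a One-Class SVM on synthetic per-run features as placeholders for checklist and monitoring data (scikit-learn assumed), is shown below.

```python
# Minimal sketch: learn the normal operating region from reference-run features
# and flag deviations with a One-Class SVM. Feature values are synthetic
# placeholders for checklist/monitoring data, not AMBER data.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
# Features per run, e.g. [mean occupancy, trigger rate (kHz), HV current (uA)]
normal_runs = rng.normal([0.45, 30.0, 12.0], [0.02, 1.5, 0.5], size=(200, 3))

scaler = StandardScaler().fit(normal_runs)
clf = OneClassSVM(nu=0.05, gamma="scale").fit(scaler.transform(normal_runs))

new_runs = np.array([
    [0.46, 29.5, 12.2],    # looks normal
    [0.30, 31.0, 25.0],    # occupancy drop + high HV current
])
print(clf.predict(scaler.transform(new_runs)))   # +1 = normal, -1 = anomalous
```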
As a next step, we intend to deploy deep-learning models, such as autoencoders, for more robust anomaly detection and to integrate AI into online filtering and detector alignment.
Speaker: Dr Thomas Poschl (CERN) -
63
AI Solutions for Infrastructure Management and Facility Operations Support 40/S2-A01 - Salle Anderson
The IRRAD Proton Facility, located on the T8 beamline in the CERN PS East Hall, is an experimental infrastructure that hosts irradiation tests and experiments of various natures, focusing especially on those dedicated to the development of equipment and electronics for High-Energy Physics (HEP) experiments.
Over years of operation, multiple data-management systems were designed and implemented to operate the facility and monitor the key factors of facility performance, leading to multiple improvements over the years. These activities yielded a significant wealth and variety of data and metadata that have been put to use in successful case studies [1-3]; yet they still await the exploration that will exploit their full potential.
In the framework of CERN RCS AI Strategy, we propose to study whether an AI agent able to fully explore the IRRAD-available data and improve its underlying tasks can be designed and implemented. For instance, a real-time assessment of the PS beam characteristics may improve accuracy and efficiency of the irradiation experiments. The long-term endeavour in this project will focus on the possibility to provide a physics- and AI-based “digital twin” for IRRAD. Such a virtual representation of the facility, with continuously updating parameters (based on the constantly incoming streams of data) would be an invaluable system for the prediction and optimisation of small- and medium-sized experiments (SME).
Relevant projects that may arise from such top-level needs may be presented as follows:
- a tool for beam-characteristics on-line monitoring and assessment;
- a digital twin to facilitate the facility operation (“predictive maintenance”) and the simulation and execution of irradiation experiments;
- a scheduling tool dedicated to HEP experimental infrastructures, with the possibility to scale from SME up to large machine-level sizes.
Additionally, the somewhat small size of SME allows such new tools to be almost immediately integrated into daily operation and thus would provide an environment that could produce preliminary evidence-based results that may be later scaled up for larger and more complex infrastructures and experimental facilities.
[1] Szumega J. & Ravotti F., ML-Based Classification and Evaluation of the Beam Profile Patterns – Euro-Labs report, https://web.infn.it/EURO-LABS/wp-content/uploads/2024/08/EURO-LABS_MS29_final.pdf
[2] J. Szumega et al., “Machine Learning for the Anomaly Detection and Characterization of the 24 GeV/c Proton Beam at CERN IRRAD Facility”, in Proc. IPAC’25, paper THPM110, pp. 2667-2870, 2025. doi: 10.18429/JACoW-IPAC25-THPM110
[3] B. Gkotse et al., “CERN Proton Irradiation Facility (IRRAD) Data Management, Control and Monitoring System Infrastructure for post-LS2 Experiments”, in Proc. 19th International Conference on Accelerator and Large Experimental Physics Control Systems (ICALEPCS'23), Cape Town, South Africa, Oct. 2023, pp. 762-766. doi:10.18429/JACoW-ICALEPCS2023-TUPDP093
Speaker: Jaroslaw Szumega (CERN EP-DT-DD, Mines ParisTech (FR)) -
-
64
Trusted AI Agent Infrastructure for Repository and Library Workflows at CERN 40/S2-A01 - Salle Anderson
This proposal extends the foundations laid by the AIRDEC project (IT-CA/SIS) to develop a robust AI agent infrastructure that ensures safe, reliable, and transparent use of AI across research repository and library services. Building on this foundation, the project will expand the AI support into critical areas of high cost and human effort, such as spam detection and mitigation, automated support handling, and large-scale metadata curation of e.g. CERN Library Catalog/CERN Repository/CERN Videos and Open Research Europe platform. These domains represent major opportunities for efficiency gains, cost savings, and quality improvements, while at the same time requiring rigorous safeguards to preserve trust in the underlying systems. By collaborating with multiple groups across CERN, the project will create a shared AI platform that reduces duplication of effort, strengthens institutional capacity, and accelerates adoption of AI-assisted workflows.
Speaker: IT-CA-IR/OSI -
65
Evaluation of AI Techniques for Data Centre Cooling Optimization and Resource Allocation in the LHCb Computing Infrastructure 40/S2-A01 - Salle Anderson
The LHCb data centre is a key element of the experiment’s Data Acquisition (DAQ) system, while also supporting other computing tasks when not dedicated to DAQ. This project investigates the application of Artificial Intelligence (AI) and Machine Learning (ML) techniques to further improve its efficiency and sustainability. It focuses on two main aspects: Cooling Optimization, evaluating AI techniques for controlling the cooling infrastructure and achieving reductions in PUE while maintaining performance; and AI-Driven Resource Allocation, dynamically adjusting computing resources according to workload demand. In this way, computing nodes can be reassigned to other tasks when possible, maximizing resource utilization, and powered down only when no useful work is available, thereby reducing unnecessary energy consumption. Both aspects aim at reducing the environmental impact of the LHCb computing infrastructure while making the best possible use of all available computing resources.
Speaker: Pierfrancesco Cifra (CERN) -
66
Design of a Real-Time Anomaly Detection System for LHCb Operational Logs 40/S2-A01 - Salle Anderson
The operational control layer in the Experiment Control System (ECS) of the LHCb experiment is built on WinCC Open Architecture (OA), which generates large volumes of logs. Currently, operators and shifters examine these logs manually to identify system errors. This process is time-consuming, tedious, and requires expert knowledge.
Patterns in the logs are not easily discernible, making it difficult to identify events that lead to system failures. Machine learning can uncover such patterns, flag potential errors before they occur, and streamline error identification and communication to shifters.
We therefore propose a real-time anomaly detection system for LHCb operational logs that will support:
- Error identification and prediction
- Root-cause tracing
- Alarms and notifications to shifters
This system will reduce operator workload, improve reliability, and enable proactive responses to potential failures.
Speaker: Benedict Kamoni Njoki (University of Nairobi (KE)) -
67
AI-assisted Data Quality Assessment for LHCb 40/S2-A01 - Salle Anderson
Data Quality assessment - choosing which data are good for physics and which are not - is an ideal use case for anomaly detection algorithms.
For assessing LHCb data quality we need a tool that takes data quality decisions and justifies them, making the decisions traceable.
Currently, histograms of observables ranging from low-level sub-detector quantities up to high-level physics quantities are monitored by (human) shifters. This task is cumbersome and prone to human error. A preliminary infrastructure was set up for automated data quality assessment with a chi2-based method and advanced machine-learning-based tools. The missing steps before a deployment in production can be envisaged are the training of an appropriate model and a comparison of its performance with the human-based assessment procedure.
Speaker: Titus Mombächer (University of Cincinnati (US)) -
68
AI/DL assisted detector (parameter) optimization 40/S2-A01 - Salle Anderson
Optimization of detector design using AI/DL, either through exploiting fully differentiable programming models or through the help of intelligent agents, has been a growing field of interest in recent years. This was spearheaded by the MODE collaboration (https://mode-collaboration.github.io/) and other research groups (https://doi.org/10.3390/particles8020047). While full end-to-end design of experimental setups has been shown to be extremely challenging due to the many (often non-differentiable) constraints, AI/DL can be a huge asset in the optimization of detector design parameters within certain boundaries.
In ATLAS, particularly, the potential replacement of radiation-damaged components after Run-4 offers a great opportunity to leverage AI/DL in order to find optimal design parameters that take multiple inputs into account. We propose to use the case study of the replacement of the two innermost pixel layers of the ATLAS ITk, anticipated to take place in the long shutdown before Run-5 of the HL-LHC campaign, to develop the tools and models that allow exploring novel technologies and the design phase space. We plan to integrate this with a rapid feedback loop including novel reconstruction/regression models and the potential physics performance of the different detector concepts. In a second step, we plan to generalize the developed workflows into a more heuristic approach to physics experiment design. We will then target design aspects of the future FCC detector concepts; while we plan to focus first on tracking detectors, we intend, if applicable, to extend the process beyond the tracking devices. Particularly drawing from the experience and strong involvement in the design, construction and operation of the ATLAS LAr Calorimeters, the CERN ATLAS Team is leading the R&D on noble-liquid ionization calorimetry for future colliders. The work is currently performed within ALLEGRO (https://allegro.web.cern.ch/), a general-purpose detector concept for the future FCC-ee accelerator. We anticipate integrating reconstruction, computational and physics performance into the optimization process.
Speaker: Andreas Salzburger (CERN) -
69
AI-assisted optimization of a noble-liquid ionization calorimeter for FCC-ee 40/S2-A01 - Salle Anderson
Building on extensive experience with the ATLAS LAr Calorimeters, the CERN ATLAS Team is leading R&D on noble-liquid ionization calorimetry for future colliders. Within the ALLEGRO detector concept for FCC-ee [1], we are developing a liquid-argon electromagnetic calorimeter with lead absorbers and advanced multi-layer PCB electrodes. A central challenge is optimizing detector granularity: the design must support particle flow reconstruction, maintain EM energy resolution and particle identification, achieve high angular resolution (e.g., for diphoton mass and displaced photon reconstruction), and control the channel count (O(2M)), while mitigating cross-talk.
Traditional manual optimization of segmentation (longitudinal, lateral, radial) is slow and limited. End-to-end AI-based optimization, demonstrated by the Mode Collaboration [2] and others [3–5], offers a powerful alternative. By replacing costly particle-shower simulations with differentiable AI surrogates and combining them with ML-based reconstruction, geometry parameters can be tuned via gradient descent. We will apply this method to optimize the ALLEGRO ECAL and explore a hybrid approach using high-granularity Geant4 simulations to validate results.
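The following minimal sketch illustrates the mechanism only: a small neural surrogate is fitted to (toy) simulation results relating a granularity parameter to a resolution-like figure of merit, and the parameter is then tuned by gradient descent against a channel-count penalty; the functional forms and constants are placeholders, not ALLEGRO physics.

```python
# Minimal sketch of surrogate-based geometry optimization: fit a differentiable
# surrogate to toy "simulation" results, then optimize the geometry parameter
# through it by gradient descent. Purely illustrative numbers throughout.
import torch

torch.manual_seed(0)

# Pretend full-simulation results: finer cells improve the resolution-like
# figure of merit (toy linear model plus noise, used only to fit the surrogate).
cells = torch.linspace(0.5, 4.0, 40).unsqueeze(1)        # cell size [cm]
resolution = 0.05 + 0.02 * cells + 0.01 * torch.randn_like(cells)

surrogate = torch.nn.Sequential(
    torch.nn.Linear(1, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1))
opt = torch.optim.Adam(surrogate.parameters(), lr=1e-2)
for _ in range(2000):                                     # fit the surrogate
    opt.zero_grad()
    loss = torch.nn.functional.mse_loss(surrogate(cells), resolution)
    loss.backward()
    opt.step()

# Optimize the geometry parameter through the (now differentiable) surrogate,
# balancing resolution against a channel-count penalty.
cell_size = torch.tensor([2.5], requires_grad=True)
geo_opt = torch.optim.Adam([cell_size], lr=1e-2)
for _ in range(500):
    geo_opt.zero_grad()
    channel_penalty = 0.02 / cell_size                    # finer cells -> more channels
    objective = surrogate(cell_size.unsqueeze(0)).squeeze() + channel_penalty.squeeze()
    objective.backward()
    geo_opt.step()
    with torch.no_grad():
        cell_size.clamp_(0.5, 4.0)                        # stay in the studied range

print("optimized cell size [cm]:", float(cell_size))
```

In the real problem the surrogate would emulate Geant4 showers and ML-based reconstruction, and the objective would combine several physics figures of merit, but the fit-then-descend pattern is the one described above.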
The outcome will directly inform the design of a test-beam prototype by 2029 and define the ALLEGRO ECAL barrel geometry. The same techniques can subsequently guide optimization of the HCAL and vertex detector, ensuring that ALLEGRO is equipped with state-of-the-art calorimetry for FCC-ee physics goals.
[1] https://allegro.web.cern.ch/
[2] https://mode-collaboration.github.io/
[3] K. Schmidt, K.N. Kota, J. Kieseler et al., End-to-End Detector Optimization with Diffusion Models: A Case Study in Sampling Calorimeters. Particles 2025, 8, 47, https://doi.org/10.3390/particles8020047
[4] W. Chung, Differentiable Full Detector Simulation of a Projective Dual-Readout Crystal Electromagnetic Calorimeter with Longitudinal Segmentation and Precision Timing, EPJ Web Conf. 320 (2025), 00052, https://doi.org/10.1051/epjconf/202532000052
[5] S.R. Qasim, P. Owen and N. Serra, Physics instrument design with reinforcement learning. Mach.Learn.Sci.Tech. 6 (2025) 3, 035033, https://doi.org/10.1088/2632-2153/adf7ff
Speakers: Andreas Salzburger (CERN), Martin Aleksa (CERN), Nikiforos Nikiforou (CERN) -
70
Let the Data Shape the Model: Towards Accessible Machine Learning 40/S2-A01 - Salle Anderson
Speakers: Axel Naumann (CERN), Christine Zeh (Vienna University of Technology (AT)), Leonardo Beltrame (Politecnico di Milano (IT))
-
71
ML-Driven Orchestration of Heterogeneous Workloads on the Experiment HPC Facilities 40/S2-A01 - Salle Anderson
The current ALICE HPC farm operates with static resource partitioning between workloads. While this guarantees robustness, it also limits the ability to fully capitalize on the available capacity. Given the heterogeneous workloads, from online (DAQ) to offline (async) processing as well as cloud services (e.g. OpenStack, including AI/ML training), there is an opportunity to introduce machine learning–based orchestration. Such an approach would allow resources to be managed dynamically across tenants, maximizing the efficiency of the farm.
Preliminary observations indicate that adaptive orchestration could boost overall efficiency by up to 50%, translating directly into higher throughput, better energy utilization, and an enhanced return on infrastructure investment.
ML for job scheduling has been explored in cloud and HPC systems, but not in the unique hybrid DAQ/async/cloud setting of ALICE. In Kubernetes, for example, reinforcement learning has been used for DAG scheduling, and ML-aware batch scheduling is used in Kube-Knots, Kueue and Volcano. For Slurm, work has been done on queue-time prediction, failure prediction, and ML-guided backfilling.
We propose to develop a machine learning–driven multi-objective decision model that continuously adapts the allocation of the HPC farm across three classes of workloads:
- DAQ (online reconstruction): Scaling based on real-time data flow monitoring and LHC schedule forecasts, ensuring strict throughput constraints.
- Async (offline reconstruction): Scaling according to the backlog of jobs.
- Cloud (e.g. via CERN IT OpenStack): Opportunistic utilization of idle resources, including AI/ML training for internal and external research users.
The model consists of workload forecasting (predicting DAQ load, async backlog and idle capacity), anomaly detection (identifying failures or abnormal job behaviour) and energy-aware optimization (preemptions, PUE). The goal is to learn policies that minimize preemptions while maximizing utilization and energy efficiency, improving HPC farm operations and contributing back to the HPC/AI/orchestration community.
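As a toy illustration of the forecast-then-allocate policy sketched above (the actual model would be learned and multi-objective), the following Python snippet shows the partitioning step; all numbers, names and the DAQ-load formula are illustrative placeholders.

import numpy as np

TOTAL_NODES = 1000

def forecast_daq_load(lhc_duty_cycle, recent_rates):
    # placeholder forecast: scale a DAQ baseline by the expected LHC duty cycle and recent rates
    return int(400 * lhc_duty_cycle + 0.2 * np.mean(recent_rates))

def allocate(daq_forecast, async_backlog_jobs, nodes_per_async_job=1):
    # strict-priority toy policy: DAQ first, async second, cloud takes whatever is left
    daq_nodes = min(daq_forecast, TOTAL_NODES)
    async_nodes = min(async_backlog_jobs * nodes_per_async_job, TOTAL_NODES - daq_nodes)
    cloud_nodes = TOTAL_NODES - daq_nodes - async_nodes   # opportunistic (OpenStack, AI/ML training)
    return {"daq": daq_nodes, "async": async_nodes, "cloud": cloud_nodes}

partition = allocate(forecast_daq_load(lhc_duty_cycle=0.6, recent_rates=[310, 290, 305]),
                     async_backlog_jobs=450)
print(partition)   # e.g. {'daq': 300, 'async': 450, 'cloud': 250}

A learned policy would replace both placeholder functions and add preemption costs and energy terms to the objective.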
Deliverables
- Demonstrator service with reference ML models and dashboards showing multi-tenant scaling in action.
- Kubernetes (or alternative) integration: operator for dynamic partitioning of DAQ, async, and cloud workloads.
- Integration with CERN IT OpenStack: exposing idle capacity for managed external use.
- Open-source contributions: ML scheduling models, Kubernetes/OTel exporters, and benchmark datasets (anonymized workload traces).
Speaker: Lubos Krcal (CERN)
-
58
-
10:10
Coffee Break
-
AI Infrastructure for Model Training 40/S2-A01 - Salle Anderson
-
72
Energy-Aware Training Strategies for Sustainable Generative Modeling in Detector Simulation
Training large-scale generative models for particle detector simulation is computationally demanding, contributing significantly to energy consumption. This project focuses on developing energy-efficient training strategies for generative models used in detector simulation. By integrating energy-aware optimization strategies, mixed-precision training and sustainability metrics, the project aims to reduce energy consumption while meeting the quality requirements for the synthetic datasets. The three-year plan includes profiling, optimization, benchmarking, and initial definition of best practices for sustainability.
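A minimal sketch of the kind of instrumentation this implies, assuming an NVIDIA GPU: mixed-precision training with PyTorch combined with a coarse energy estimate from NVML power sampling (model, data and sampling scheme are placeholders).

import time
import torch
import pynvml

pynvml.nvmlInit()
gpu = pynvml.nvmlDeviceGetHandleByIndex(0)

model = torch.nn.Sequential(torch.nn.Linear(256, 512), torch.nn.ReLU(),
                            torch.nn.Linear(512, 256)).cuda()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scaler = torch.cuda.amp.GradScaler()

energy_joules, last_t = 0.0, time.time()
for step in range(100):
    x = torch.randn(1024, 256, device="cuda")
    with torch.cuda.amp.autocast():             # fp16/bf16 where safe, fp32 elsewhere
        loss = torch.nn.functional.mse_loss(model(x), x)
    optimizer.zero_grad()
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()

    now = time.time()                           # integrate GPU power (reported in mW) over wall time
    energy_joules += pynvml.nvmlDeviceGetPowerUsage(gpu) / 1000.0 * (now - last_t)
    last_t = now

print(f"approx. training energy: {energy_joules / 3600:.4f} Wh")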
Speaker: Dr Sofia Vallecorsa (CERN) -
73
A modular ML training framework for state-of-the-art HEP tools and analysis
Development of a cutting-edge Deep Learning framework for HEP objects and analysis tasks, automating common tasks while optimizing data structures, CPU overhead, and GPU usage. The functionalities include model and feature modularity, benchmarking, hyperparameter optimization, distributed running, optimized data structures, data loading, and inference optimization. One option as a baseline framework, with the majority of the functionalities already implemented, is “b-hive”, developed in the CMS BTV physics group.
Speaker: Sebastian Wuchterl (CERN) -
74
MLOps Infrastructure and End-to-End Workflows for Online LHCb Operations
This project focuses on establishing a dedicated MLOps environment tailored to the needs of the online operations of the LHCb experiment. Its goal is to enable the development, optimization, and deployment of machine learning models entirely within the LHCb technical network, using LHCb-managed resources and directly supporting online workflows.
The first phase of the project, focused on deploying and configuring Kubeflow on the LHCb infrastructure, has been started as part of a summer student project. The system is already functional and has been configured with LDAP authentication and GPU time-slicing, allowing multiple users to share GPU resources. Some additional work is still required to finalize integration with LHCb resource management and to prepare the system for production use.
Running our own instance of Kubeflow inside the LHCb network also enables access to specific Online resources, such as the ECS, which are essential for certain ML workflows and directly support LHCb operations.
Building on this foundation, the next steps will extend the infrastructure to support advanced MLOps capabilities:
- Model Serving with KServe: Deploying trained ML models as scalable inference services within the LHCb environment.
- Hyperparameter Optimization with Katib: Automating the search for optimal model configurations to improve performance and efficiency.
- Workflow Automation with Kubeflow Pipelines: Enabling reproducible, end-to-end ML workflows that integrate data preprocessing, training, optimization, and deployment.
By providing these capabilities, the project will deliver a unified MLOps platform for online operations, enabling efficient prototyping, resource use, and reliable deployment of ML models within LHCb’s computing environment.
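As an example of the workflow-automation step, a minimal Kubeflow Pipelines sketch (kfp v2-style API; the component bodies and names are placeholders, not LHCb code) might look as follows:

from kfp import dsl, compiler

@dsl.component
def preprocess(dataset_uri: str) -> str:
    # fetch data from the given location, write prepared tensors, return their path
    return dataset_uri + "/prepared"

@dsl.component
def train(prepared_uri: str, learning_rate: float) -> str:
    # train the model on the prepared data and return the model artifact location
    return prepared_uri + "/model"

@dsl.pipeline(name="lhcb-online-ml-demo")
def training_pipeline(dataset_uri: str, learning_rate: float = 1e-3):
    prepared = preprocess(dataset_uri=dataset_uri)
    train(prepared_uri=prepared.output, learning_rate=learning_rate)

# Compile to a pipeline definition that can be uploaded to the Kubeflow instance.
compiler.Compiler().compile(training_pipeline, package_path="training_pipeline.yaml")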
Speaker: Apostolos Karvelas (CERN) -
75
Distributed data-loading Pipelines with ROOT for large-scale ML Training
In the HL-LHC era, ever larger datasets for ML training are in sight. These will enable the training of increasingly complex models, but the sheer volume of data may exhaust the capabilities of the machines that run the training. The data might neither fit in RAM, nor might saving the data on fast storage be cost-effective.
In this project, ROOT and the existing CERN infrastructure such as EOS, SWAN or Openstack will be combined to form a data-loading cluster, allowing the loading and filtering of training data to scale across a large pool of CERN resources. The data could be prepared asynchronously on a multitude of hosts, partitioned into batches ready to be consumed by various ML frameworks, and be streamed via network as the training progresses.
Speaker: Stephan Hageboeck (CERN) -
76
Zero-conversion reading of HEP data for training with common ML tools
Training ML models on High Energy Physics data currently requires either very expensive copies and conversion to some intermediate format, or the creation of custom I/O pipelines by the end user. ROOT provides a prototype system for ingesting data in the common TTree format (which also supports the future RNTuple format) directly into the ML model. This requires zero conversion steps and is done via a single function call for the end user. This streamlined approach to ingesting data into ML models can be made generic and cross-experiment. Work is required towards bringing this prototype into production, testing it in distributed scenarios and with training involving GPUs.
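For illustration only (this is not the ROOT prototype itself), the pattern of streaming TTree branches in chunks straight into a training loop, without any intermediate file format, can be sketched with uproot and PyTorch; branch and file names are hypothetical.

import numpy as np
import torch
import uproot

model = torch.nn.Sequential(torch.nn.Linear(3, 32), torch.nn.ReLU(), torch.nn.Linear(32, 1))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = torch.nn.BCEWithLogitsLoss()

# uproot.iterate yields successive chunks of the requested branches, read on demand.
for chunk in uproot.iterate("events.root:Events", ["pt", "eta", "phi", "label"],
                            step_size="100 MB", library="np"):
    features = torch.from_numpy(np.stack([chunk["pt"], chunk["eta"], chunk["phi"]], axis=1)).float()
    labels = torch.from_numpy(chunk["label"]).float().unsqueeze(1)
    loss = loss_fn(model(features), labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()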
Speaker: Dr Vincenzo Eduardo Padulano (CERN) -
77
Provisioning AI/ML tools for data scientists
AI/ML tools evolve quickly: new versions and new packages are constantly being created. Providing new and updated packages in a consistent manner and for a distributed environment takes dedicated effort to avoid scalability issues. The LCG software stacks provide a wide range of AI/ML and related packages via CVMFS, such as tensorflow, torch, jax, CUDA, and ROOT. As part of the RCS/AI initiative this menu of packages could be increased. At the same time, the amount of testing must be increased to make sure all these different packages work together. As part of the nightly build system, AI/ML integration tests will be implemented. These will ensure that new and updated tools are functioning correctly, and that all packages in the software stack are consistent, for example that they contain the expected versions of CUDA, drivers, dependencies and compute capabilities. This will require extending the tests to cover training and inference in various software and hardware combinations, to ensure the available product is workable for real-life applications.
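A sketch of what such a nightly integration test could look like, as a pytest module (package list and checks are illustrative):

import pytest

def test_stack_is_consistent():
    import torch, tensorflow as tf
    # both frameworks should see the same CUDA situation on a GPU build of the stack
    assert torch.cuda.is_available() == bool(tf.config.list_physical_devices("GPU"))

def test_torch_train_and_infer():
    import torch
    model = torch.nn.Linear(8, 1)
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    x, y = torch.randn(64, 8), torch.randn(64, 1)
    loss = torch.nn.functional.mse_loss(model(x), y)
    loss.backward()
    opt.step()                                  # one training step must run without error
    with torch.no_grad():
        assert model(x).shape == (64, 1)        # and inference must produce the expected shape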
Speaker: Andre Sailer (CERN) -
78
A Filesystem View on AI Training Data in Object- and Cloud Storage
Modern AI training for complex neural networks demands low-latency access to multi-petabyte datasets, versioned software stacks, and reproducible environments, mirroring challenges traditionally addressed by CVMFS in scientific domains. While at its core a software distribution tool, CVMFS can provide a general filesystem view on external data in object stores. This data-distribution over CVMFS offers the advantage of bandwidth savings via smart caching and provides reproducible environments, as well as the simplicity of POSIX filesystem semantics when accessing the datasets.
Speaker: Valentin Volkl (CERN) -
79
Scaling out AI/ML workloads to external resources
As AI/ML usage and use cases grow at CERN, training at scale as well as testing, benchmarking and validation on newer generation devices requires access to resources not currently available on-premises.
This activity involves setting up the required integrations in the CERN MLOps infrastructure to accommodate these requirements as seamlessly as possible. The work considers integration with both public cloud providers for on-demand access as well as HPC infrastructures, in particular the new AI factories.
Speakers: Raulian-Ionut Chiorescu, Ricardo Rocha (CERN) -
80
Data Preparation for Machine Learning Event Reconstruction
Event reconstruction is key to unlocking the full physics potential of the Future Circular Collider (FCC). Particle Flow (PF) techniques, which combine information from different subdetectors, rely on precise and well-understood inputs. Classical approaches often use hand-crafted features and detector-specific preprocessing, but machine learning (ML) methods require a different level of preparation. In particular, if the data are not carefully understood and formatted, ML models risk learning from hidden biases in the simulation rather than the underlying physics.
This activity focuses on studying how detector-level information should be prepared for ML training, with the goal of ensuring unbiased and physics-motivated inputs. This includes understanding the impact of truth-label matching, noise handling, and feature choices, and comparing these with the needs of classical reconstruction methods. By doing so, we aim to provide datasets that can be used directly in ongoing FCC ML efforts such as machine-learning particle flow (MLPF) and ML-based tracking, while also documenting the assumptions that go into the preparation.
End-to-end solution
- Establish clear strategies for preparing data so that ML models are not driven by artefacts of the simulation.
- Compare ML-oriented preparation with classical approaches to highlight key differences.
- Provide prepared event samples that can be directly used in existing FCC ML studies.
- Test these samples within the MLPF framework to check consistency and performance.
Speaker: Lena Maria Herrmann -
81
Network infrastructure to support on-premises AI/ML workloads
While the infrastructure supporting AI/ML can be in the cloud or use existing HPC resources, this proposal considers the need to support on-premises AI/ML workloads with stringent requirements on performance, bandwidth, latency and lossless communication over Ethernet.
If the CERN/RCS strategy for AI includes the support of high-performance resources in CERN's data centres for AI/ML workloads, this proposal will need to be carried out.
The work includes CERN IT working with AI deployment experts in RCS/CERN to gather the specific requirements for the years to come.
Speakers: David Gutierrez Rueda (CERN), Eric Grancher (CERN) -
82
itwinai: Scalable AI Training and Optimization on HPC for Science
This proposal focuses on the further development and adoption of the itwinai framework, designed to help scientists scale their AI workloads on HPC and cloud systems while minimizing engineering overhead. itwinai provides high-level, reproducible workflows for distributed machine learning training and hyperparameter optimization using tools such as PyTorch DDP, DeepSpeed, Horovod, and Ray Tune. A key feature of the framework is its scalability analysis module, which helps users identify performance bottlenecks, quantify GPU utilization, and benchmark energy consumption across diverse hardware platforms.
Originally developed in the context of the interTwin project and extended through ODISSEE and RI-SCALE, itwinai has already been applied to fast Monte Carlo simulation use cases using generative AI (3DGAN and CaloINN), and was recently piloted with the CMS MLPF model for particle-flow reconstruction, with results presented at the last CERN openlab technical workshop. The framework has been successfully deployed on major EuroHPC centers (Juelich, Vega, LUMI, Deucalion) and is also compatible with cloud deployments. Further integration with the CERN openlab Heterogeneous Architecture Testbed (HAT) is a possible direction for future work. For CERN-wide adoption, itwinai is available through CVMFS and supports integration with Kubeflow and MLflow for MLOps.
The main goal of this project is to accelerate the adoption of HPC and cloud resources for AI within the HEP community at CERN, lowering the entry barrier for researchers and developers who lack HPC expertise. By abstracting system-specific complexity, itwinai allows scientists to focus on model development and physics use cases, rather than infrastructure and distributed systems.
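For context, the plain PyTorch boilerplate that a framework like itwinai abstracts away looks roughly as follows (this is not the itwinai API; the script assumes a torchrun-style launcher that sets LOCAL_RANK and the rendezvous variables):

import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group(backend="nccl")          # one process per GPU
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = DDP(torch.nn.Linear(128, 10).cuda(), device_ids=[local_rank])
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

    for step in range(100):
        x = torch.randn(256, 128, device="cuda")
        y = torch.randint(0, 10, (256,), device="cuda")
        loss = torch.nn.functional.cross_entropy(model(x), y)
        optimizer.zero_grad()
        loss.backward()                              # gradients are all-reduced across ranks
        optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()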
Speakers: Matteo Bunino (CERN), Dr Maria Girone (CERN)
-
72
-
Infrastructure for AI Deployment 40/S2-A01 - Salle Anderson
-
83
Infrastructure for deployment and management of AI Agents (Agentic AI)
The existing MLOps offering in CERN IT covers the requirements for data preparation, iterative development, training and inference. It enables integration with existing infrastructure via APIs or by directly embedding the models.
This activity will focus on an extension of these environments to enable the deployment and management of agents, as well as the required components to standardize access to components supporting the Model Context Protocol (MCP). This includes exposing the required data (resources), integration with external systems (via tools) and interfaces to clients (prompts).
Speaker: Ricardo Rocha (CERN) -
84
An experiment-agnostic bookkeeping system for trained inference models
A lot of attention and care is dedicated to code and calibrations used for official data processing campaigns of experiments, such as event generation, simulation, reconstruction, or derivation. The same level of care should be dedicated to trained ML models deployed as part of the aforementioned data processing steps. Such entities should be easily findable, documented, versioned, and reproducible: where is this model coming from? How was it trained? On what datasets? All these questions ought to be easily answered to mitigate the risk of jeopardising the data processing, for example being unable to properly re-train a model critical for analysis (e.g. tau ID).
We have the opportunity to provide an experiment-agnostic bookkeeping system for trained inference models, combining technologies such as code distribution, data management, and design of generic metadata catalogues.
Speaker: Danilo Piparo (CERN) -
85
Evaluation of heterogeneous hardware solutions for AI deployment
This proposal outlines a structured R&D programme to develop a standardised test-bed infrastructure for evaluating heterogeneous hardware solutions targeting Machine Learning (ML) model deployment in High Energy Physics (HEP) Trigger and Data Acquisition (TDAQ) systems, and using it to evaluate available hardware acceleration options. The test-bed will support benchmarking of co-processors including Graphics Processing Units (GPUs), Field Programmable Gate Arrays (FPGAs) and more exotic accelerators (Analog-AI, AI-Engines, etc.) within containerised environments, abstracting platform-specific dependencies and ensuring reproducibility. A metadata-driven evaluation framework will be implemented to extract key performance indicators such as latency, power consumption, and physics impact. Integration with HEP experiment simulation and triggering pipelines will allow context-specific performance assessments. The outcomes will provide data-driven guidance for the design and deployment of future TDAQ systems for the High-Luminosity LHC (HL-LHC) and beyond.
Speakers: Ioannis Xiotidis (CERN), Thorsten Wengler (CERN) -
86
ML Operations (MLOps) for FPGA based trigger implementations
The growing demand for fast and reliable Machine Learning (ML) inference in hardware triggers of High Energy Physics (HEP) experiments introduces new challenges in terms of model development, deployment, and long-term sustainability. This proposal aims to develop a generic, CERN-wide ML Operations (MLOps) framework that enables end-to-end support for ML model lifecycles targeting Field Programmable Gate Arrays (FPGAs). From raw data generation in frameworks such as Athena and CMSSW to model deployment on in-detector firmware, the system will focus on improving traceability and accelerating iteration in response to changing detector conditions. The framework will be modular, open to multiple experiments and toolchains (e.g. hls4ml, FINN, Vitis-AI), and prepared to scale to the High-Luminosity LHC (HL-LHC) and beyond.
Speakers: Ioannis Xiotidis (CERN), Thorsten Wengler (CERN) -
87
Model registry, versioning, traceability and reproducibility
Storage and versioning of models, especially when handling a large number (1k to 10k+), needs specialized services. Traceability of the published models back to their training executions and parameterization is essential to offer trust and reproducibility.
This initiative will build on the ongoing NGT effort of offering a centralized mlflow instance and extend it to the whole CERN community, as well as automating record keeping. It builds on the existing CERN MLOps platform and tools, extending them with the additional metadata required and integrating tools such as DVC, as appropriate, for improved version control.
Speaker: Amine Lahouel (CERN) -
88
AI/ML benchmark suite for hardware procurement of accelerator devices
The current procurement process for accelerator devices is done targeting specific vendors and models, in contrast with the generic procurement of CPU hardware. This limits the possibilities for vendors to optimize their offers with different layouts.
This activity focuses on establishing a benchmark suite with reference AI/ML workloads that can be used to augment or as alternatives to the existing HEPScore.
Speaker: Hannes Jakob Hansen -
89
Kubeflow backed by CVMFS: Efficient ML Model distribution for the Grid
The infrastructure to deploy both training data and final models in a distributed computing environment like the WLCG is essential in order to make optimal use of ML/AI in offline computing. CVMFS is the de-facto standard to deploy software binaries, and could bring its advantages to ML operations, in particular with respect to software preservation.
As ML models used for inference are commonly stored in OCI registries, CVMFS can make use of existing container tools to cache and distribute them, integrating with other platforms such as Kubeflow. This is therefore not a re-invention of existing industry tools, but an enhancement of state-of-the-art tools. However, since the access pattern of these model files differs from other software binaries, proxies and caches need to be tuned to work effectively for this use case. A central “model-registry.cern.ch” repository will be created as a service for the community, similar to unpacked.cern.ch, to make its use similarly accessible and transparent.
Speaker: Valentin Volkl (CERN) -
90
Efficient Integration of AI/ML workflows in HEP Frameworks (e.g., Key4hep)
Developing new AI/ML use cases is important, but it is equally important to ensure they can be used in production environments once they mature. For HEP experiments this means they need to integrate seamlessly into the respective software frameworks. For this purpose the integration into the experiment data models and frameworks has to be developed. The Key4hep software ecosystem uses the Gaudi framework and the EDM4hep event data model, and is used by a wide variety of future collider projects, such as the FCC. It could serve as an example for the integration of AI/ML workflows and ensure that the physics reach of future projects can be estimated with state-of-the-art tools, by efficiently integrating up-to-date machine learning training and inference workflows.
Speaker: Andre Sailer (CERN) -
91
Streamlining integration of ML inference in typical HEP analysis workflows
With the proliferation of different ML strategies being employed in HEP analysis workflows, the question of ensuring smooth integration with existing analysis tools is of paramount importance. Many aspects may subtly hinder the user experience and potentially block analysis development: the Python/C++ integration of the framework, on-disk vs in-memory representation of the physics events (with the possibility of further manipulation and preprocessing before inference), generalisation of the analysis interfaces to accommodate different ML models and approaches. In particular for ROOT, the natural path to follow is ensuring that the RDataFrame engine is capable of accepting any type of ML model and run it on the available data, including changing the layout of physics events lazily, and reshaping the computation graph to offload work to the model only when necessary.
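For illustration, the manual pattern that such an integration would streamline: materializing the needed columns with RDataFrame and running an ONNX model on them eagerly (tree, file, branch and model names are hypothetical).

import numpy as np
import onnxruntime as ort
import ROOT

df = ROOT.RDataFrame("Events", "analysis_ntuple.root")
cols = df.AsNumpy(["lep_pt", "lep_eta", "met"])           # eager materialization to numpy

features = np.stack([cols["lep_pt"], cols["lep_eta"], cols["met"]], axis=1).astype(np.float32)

session = ort.InferenceSession("classifier.onnx", providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name
scores = session.run(None, {input_name: features})[0]

print("mean classifier score:", scores.mean())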
Speaker: Dr Vincenzo Eduardo Padulano (CERN) -
92
Optimization of Machine Learning Inference
With the upcoming HL-LHC phase, optimizing machine learning inference becomes a critical challenge for data processing at CERN. Efficient inference requires not only fast algorithms for the underlying operations within ML models but also careful use of heterogeneous computing architectures. This includes minimizing data transfers between CPUs, GPUs, and specialized accelerators, as well as leveraging parallelism and memory locality to reduce overhead. By combining algorithmic optimization with hardware-aware strategies, the aim would be to design scalable inference pipelines that can meet the stringent real-time and offline analysis requirements of HL-LHC experiments.
Speaker: Sanjiban Sengupta (CERN, The University of Manchester) -
93
Efficient Heterogeneous Machine Learning Inference
Current CMSSW workflows suffer from inefficient CPU-GPU data transfers when running machine learning models, leading to significant overhead. This can add up to several hundred milliseconds per event, which is a serious issue, especially in real-time environments such as the trigger. It reduces performance and scalability, making it harder to fully leverage ML in CMS operations.
Our project addresses this challenge by enabling models to directly access GPU-resident data without redundant copies. We will develop a user-friendly interface that integrates seamlessly with CMSSW’s Structure of Arrays (SoA) format, supports multiple ML model outputs, and scales across heterogeneous hardware backends through alpaka, allowing the inference to be executed on the device where data, which has been produced by previous heterogeneous algorithms, are located.
Key Benefits
- Performance and scalability: Eliminates costly memory transfers, accelerating ML inference in both online and offline workflows.
- Ease of use: Simplifies ML integration by providing a standardized interface.
- Future readiness: Supports flexible model deployment on diverse and evolving hardware.
- Strategic alignment: Strengthens CERN’s investment in heterogeneous frameworks, enabling efficient use of diverse hardware.
Speaker: Lukasz Michalski (Wroclaw University of Science and Technology (PL)) -
94
Centralized inference infrastructure in Gaudi
Gaudi is a common software framework underlying event processing in multiple experiments such as ATLAS, LHCb and FCC. In addition, the simulation framework Gaussino (used by LHCb and FCC) is another user of Gaudi. As machine learning becomes increasingly central to real-time data processing, simulation and physics analysis, integrating diverse ML software stacks into Gaudi in a sustainable and reproducible way is a key challenge. Multiple experiment-specific interfaces to both external and internal inference libraries already exist, but they are either partially duplicative or cover different ground. The goal is therefore a unified interface, such that all experiments benefit from developments on this front and the maintenance burden is reduced, taking into account the various heterogeneous computing setups across experiments.
References:
for ATLAS (https://indico.cern.ch/event/1565886/#7-ml-inference-in-atlas)
for LHCb (https://indico.cern.ch/event/1565886/#5-ml-inference-in-lhcb)
Speakers: Maarten Van Veghel (Nikhef National institute for subatomic physics (NL)), Michał Mazurek (National Centre for Nuclear Research (PL)) -
95
Production-level ML training and deployment pipelines at LHCb
As more and more ML models are used in production, for example in real-time data processing and simulation, infrastructure for reliable and fast turnaround of model retraining and deployment is crucial. To this end, a centralized CI/CD infrastructure and model storage within LHCb needs to be developed further, as current solutions don’t scale well. In addition, user friendliness needs to be taken into account by designing the infrastructure with interoperability between the different use cases in mind, from production-level real-time data processing to analysts working on n-tuples at the local level. Furthermore, the larger the models become, the bigger the training datasets will be. To make sure data access scales well, the resource needs and potential solutions need to be identified.
Speakers: Maarten Van Veghel (Nikhef National institute for subatomic physics (NL)), Michał Mazurek (National Centre for Nuclear Research (PL)), Dr Nicole Skidmore (University of Warwick) -
96
Data Management for AI/DL workflows with Rucio
High energy physics at large is undergoing - similar to other domains - a paradigm shift towards the ever-increasing importance of AI/DL applications. The success of these sophisticated techniques will depend heavily on our ability to manage, access, and trace data in a manner that is both performant and trustworthy.
This document puts forward a proposal for a research and development programme to investigate how our existing data management system, principally Rucio, could be enhanced to meet these new demands, embedded in the research infrastructure of the HL-LHC experiments and beyond. The programme is structured around two key areas: The primary focus is an exploration of high-performance data access mechanisms tailored for DL workflows. A secondary, but equally important, focus is an investigation into the metadata and provenance extensions required to ensure our AI/DL models are reproducible, traceable, and understandable.
Area 1: High-Performance Data Access for DL Workflows
The iterative nature of training DL models places extreme demands on our I/O subsystems. The principal goal of this area is to investigate methods that could help ensure that our valuable GPU resources are kept as busy as possible, by minimising the time spent waiting for data.
Topic 1.1: An Integrated Data Access Library for DL Frameworks
This topic would explore the design and feasibility of a unified data access library. The aim would be to provide a simple, performant, and transparent bridge between data stored in Rucio and the popular DL frameworks used by our community, such as PyTorch, TVMA, or TensorFlow.
Investigate:
- A Generic Data-Loading Interface: The R&D would look into architectures for a core library, likely with C and Python bindings, that could be easily integrated into the data-loading pipelines of various DL frameworks. The goal would be to allow researchers to reference Rucio datasets in their training codes as if they were locally available.
- Integration with HL-LHC Security Models: A key aspect would be the seamless handling of authentication and authorisation, investigating how to best integrate the token-based AAI models that will be prevalent in the HL-LHC computing environment.
Topic 1.2: Site-level Data Services for GPU Computing
This topic would explore the concept of a site-level / edge service designed to support the specific needs of GPU clusters where DL training occurs. The service would aim to make data available locally for the computation resources, including the GPUs, capitalising on the performance of modern storage hardware.
Investigate:
- A Data Orchestration Service: This R&D would evaluate potential designs for a lightweight service that could operate at the storage boundary in front of GPU clusters. A key area of research would be its potential to interface with job schedulers to gain advance knowledge of the data requirements for upcoming training jobs. Existing services for computational jobs, such as ARC-CE and Harvester, have already proven improvements for compute; this would bring equal benefits for the data layer.
- Interaction with Caching Technologies: The R&D would focus on integrating with and enhancing existing caching solutions. A primary candidate for study would be XCache. The research would explore how a Rucio-aware orchestration service could issue hints to a local cache, prompting it to "warm up" by pre-fetching the necessary datasets for a scheduled training job. This approach could significantly reduce the latency for the first training epoch and subsequent data accesses. Considering the widely distributed nature of our input data, the necessity of smart caching is crucial.
Area 2: Metadata and Provenance for Trustworthy AI
The validity of any AI model is inextricably linked to the data on which it was trained and the process by which it was created. This research aims to explore the extensions needed to capture this information reliably, and make it available in an understandable way.
Topic 2.1: A FAIR Catalogue for DL Datasets
This topic would investigate how to build a richer, more queryable catalogue of the datasets used for DL, helping researchers to find, understand, and reuse valuable data resources.
Investigate:
- Schema-Driven Metadata for AI: The programme would explore models for associating structured metadata with datasets, tailored to AI/DL. This would include investigating how best to capture information such as feature definitions, data splits (training, validation, test), class labels, and data augmentation parameters.
- Enhanced Discovery Mechanisms: The research would evaluate how dedicated search technologies could be integrated to provide a powerful discovery tool, allowing researchers to pose complex questions about the available training data.
Topic 2.2: A Rucio extension for DL Artefact Provenance
This topic would develop a flexible extension for the Rucio catalogue for capturing and querying the provenance of data artefacts required throughout the entire AI/DL workflows, from raw data to the final trained model.
Investigate:
- This work would naturally require close collaboration with the machine learning and data science communities within the CERN hosted experiments and other AI/DL use cases at CERN. Their expertise would be vital in defining a truly useful and practical provenance model.
- The R&D would investigate conceptual models for representing the entire AI/DL lifecycle as a connected graph of "AI/DL Artefacts". The goal is to establish the foundations for linking these artefacts together in a reliable way. This includes exploring methods to capture the relationships between the input datasets, the feature models, and their trained output results, as well as their repeated use for reprocessing, derivations, and analyses.
- A successful outcome of this research would provide a more robust and auditable record, which would be an invaluable asset for debugging models, comparing results, and ensuring the long-term reproducibility of our AI-driven physics analyses.
Speakers: Mario Lassnig (CERN), Martin Barisits (CERN)
-
83
-
13:00
Lunch Break
-
Optimal AI deployment for Online Data Processing 40/S2-A01 - Salle Anderson
-
97
Deployment of Machine Learning Algorithms into Hardware Trigger systems - aka MLOps
ML deployment in the CMS Level-1 Trigger follows a compute-intensive pipeline involving data acquisition (detector and simulation), preprocessing, training, firmware synthesis, and deployment to online (FPGA) and offline (CMSSW emulator) environments. This project aims to streamline this chain across heterogeneous compute platforms to enable frequent model updates, in particular during HL-LHC commissioning. During operations, models must also adapt to changing detector conditions (e.g. tracker degradation) to maintain performance.
Speaker: Maciej Mikolaj Glowacki (CERN) -
98
Improving fast inference solutions at LHCb
With the high demands on the throughput of real-time data processing, in some cases even existing fast ML inference libraries are not fast enough. Currently, hard-coded solutions native to the LHCb event model and algorithmic structure still win, both at GPU and CPU level. The goal is to develop solutions that keep all these benefits but scale better and reduce the maintenance burden, for example by using just-in-time compilation, while still being able to integrate seamlessly into the event model, similar to the existing throughput-oriented (ThOr) functor infrastructure.
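A toy illustration of the just-in-time compilation idea, using numba to compile a small, hard-coded MLP evaluation to machine code (this is not the LHCb ThOr infrastructure, and all shapes and weights are placeholders):

import numpy as np
from numba import njit

@njit(cache=True)
def mlp_scores(x, w1, b1, w2, b2):
    # explicit loops compile to tight machine code; no Python overhead per candidate
    n_cand, n_in = x.shape
    n_hidden = w1.shape[1]
    scores = np.zeros(n_cand)
    for i in range(n_cand):
        for j in range(n_hidden):
            acc = b1[j]
            for k in range(n_in):
                acc += x[i, k] * w1[k, j]
            if acc > 0.0:                        # ReLU on the hidden activation
                scores[i] += acc * w2[j]
        scores[i] += b2
    return scores

rng = np.random.default_rng(0)
w1, b1 = rng.standard_normal((8, 16)), rng.standard_normal(16)
w2, b2 = rng.standard_normal(16), 0.1
print(mlp_scores(rng.standard_normal((1000, 8)), w1, b1, w2, b2)[:5])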
Speaker: Maarten Van Veghel (Nikhef National institute for subatomic physics (NL)) -
99
-
100
Edge AI for Feature Extraction and Data Preprocessing on FPGAs
Modern high-energy physics experiments generate large data rates, requiring fast and efficient online processing. Embedding machine–learning–based feature extraction directly in the front-end electronics of the detectors is a promising approach to reduce the amount of data to be transmitted. Field-Programmable Gate Arrays (FPGAs) offer a unique platform for such tasks due to their parallelism, reconfigurability, and low power consumption.
Within this project, we investigate Edge AI for feature extraction and data preprocessing directly on FPGAs. Using a workflow that combines knowledge distillation, pruning, and quantization, we compress deep neural networks into compact models that can run on SoC/FPGAs with minimal latency.
As a proof of concept, we implemented a multi-layer perceptron via hls4ml on an FPGA to perform pulse-shape discrimination for the COMPASS/AMBER ECAL2 calorimeter, achieving a sub-microsecond inference latency and competitive accuracy with minimal resource usage.
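For illustration, an hls4ml-style conversion of such a small MLP into an HLS project is sketched below; the layer sizes, FPGA part and output directory are illustrative and do not reproduce the ECAL2 implementation.

import numpy as np
import hls4ml
from tensorflow import keras

# small MLP for pulse-shape discrimination on a fixed number of ADC samples
model = keras.Sequential([
    keras.layers.Input(shape=(32,)),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])

config = hls4ml.utils.config_from_keras_model(model, granularity="model")
hls_model = hls4ml.converters.convert_from_keras_model(
    model, hls_config=config, output_dir="ecal2_psd_hls", part="xc7vx690tffg1927-2")
hls_model.compile()                                    # build the C-simulation library

x = np.random.rand(8, 32).astype(np.float32)
print(np.allclose(model.predict(x), hls_model.predict(x), atol=0.05))   # fixed-point vs. float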
This study opens the door for further exploration of alternative network architectures, extended training datasets including a broader range of pulse types, and the feasibility of multiclass classification according to distinct signal morphologies. Such developments could expand the approach's applicability and inspire new strategies for integrating machine learning into high-rate data acquisition systems.
Speaker: Dr Thomas Poschl (CERN) -
101
Evaluation and Integration of AMD AI Engines for low latency inference in HL-LHC Experiments
The project proposes to evaluate and integrate emerging AI-specific computing hardware, such as AMD Versal devices (often referred to as "AI Engines"), into real-time inference workflows relevant to HL-LHC experiments. With active interest and efforts already underway in experiments such as ATLAS and CMS, and ATLAS planning procurement of such hardware, this work seeks to establish a common infrastructure for benchmarking, integration, and deployment of Versal devices at CERN.
According to AMD, this architecture can deliver significantly higher compute performance and performance per watt compared to FPGAs (which have traditionally been used in LHC trigger systems), potentially offering greater flexibility for deploying medium- to large-scale AI models. With more on-chip memory, larger and more complex AI models can be supported more easily while keeping latency low. Another notable advantage of this architecture is significantly faster design compilation times compared to traditional FPGAs, which will reduce the load on shared build servers and help conserve valuable CPU resources within CERN's infrastructure. Moreover, the architecture appears to be future-proof, with AMD currently releasing updates on an annual cadence. Lastly, these devices support low-bit integer arithmetic (e.g., INT8 or INT4) and newer models provide hardware support for sparse operations. This makes them particularly well suited for running compressed and optimized AI models, building on existing expertise in techniques such as quantization and pruning developed within the HEP community. Consequently, these technologies can offer low-latency, high-throughput capabilities that can improve the scalability of AI-driven analytics within the trigger. However, despite their hardware advantages, these platforms currently lack an automated toolchain for deployment. While FPGAs benefit from tools such as hls4ml that support the conversion of high-level ML models into low-latency firmware, no such automated flow exists yet for these AI Engines.
This project will develop and integrate a deployment toolchain adjusted for AMD AI Engines, enabling their use in low-latency inference workflows at the HL-LHC. Building on existing efforts such as hls4ml and drawing on common infrastructure like the HAT testbed, the work will focus on creating a reusable and scalable software-hardware framework that supports optimized AI models for this architecture. In collaboration with academic and industrial partners, the project will address current tooling gaps and reduce redundant efforts across experiments by providing a common solution for deploying AI models for AMD Versal platforms.
Speaker: Dimitrios Danopoulos (CERN) -
102
pQuant
Training library including compression techniques for NNs, such as heterogeneous quantization, hyperparameter optimization, pruning, etc.
Speaker: Roope Oskari Niemi -
103
Automation of optimization / quantization methods in the CERN MLOps offering
Model inference is often constrained by the use of shared GPU resources or by targeting specialized hardware such as FPGAs or edge devices. Without optimization, a single model can monopolise memory and compute, and manual optimization can take weeks or months.
This activity will focus on the integration of automated optimization pipelines with pre-configured recipes targeting GPUs, FPGAs and other edge devices. It will focus on building on the existing libraries and tools already used by CERN communities and integrate them efficiently in the existing MLOps platform offering, including the tools for profiling, model performance and cost analysis and A/B testing.
Speaker: Amine Lahouel (CERN) -
104
AI/DL Long Lived Particles triggering in the Atlas Muon Spectrometer for Phase-2 HL-LHC
Over the past year, the ATLAS muon group has successfully incorporated machine learning (ML) techniques to improve the identification of hits in the ATLAS Muon Spectrometer (MS) originating from primary vertices, while effectively rejecting noise and background muons. These advancements are critical to address the challenges posed by the extreme operating conditions during Phase-2 of the High Luminosity Large Hadron Collider (HL-LHC), when event reconstruction will have to be carefully optimized to operate within the stringent latency constraints of the Event Filter, ensuring scalability to the upgraded trigger system running at an input rate of 1 MHz.
Searches for long-lived particles (LLPs) are among the most promising avenues for discovering yet unseen physics beyond the Standard Model at the HL-LHC. However, displaced signatures are notoriously difficult to identify as a result of their ability to evade standard object-reconstruction strategies. In particular, searches for LLPs with large proper lifetimes (cτ), such as low-mass Heavy Neutral Lepton (HNL) or Hidden Abelian Higgs Model (HAHM) dark photons, currently rely on fully reconstructed tracklets (segments) inside the ATLAS muon spectrometer as an input to displaced vertex reconstruction. Although this algorithm achieves position resolutions of the order of 100 millimeters and mass resolutions of a few GeV, it faces important limitations. In particular, it is inefficient in reconstructing highly collimated decay signatures, such as those expected from phenomenologically favored low-mass HNLs produced in W decays, as well as scenarios with similarly favored long lifetimes, where decays occur inside the muon spectrometer volume and prevent segment reconstruction.
To overcome these challenges, this project proposes the development of transformer-based models for the identification and triggering of LLP decay signatures directly from the measurements in the muon spectrometer (drift circle and strip), targeting large proper lifetimes currently inaccessible. Originally designed for natural language processing, transformer architectures are particularly well suited to capture long-range dependencies and complex correlations in sparse, high-dimensional data such as that produced in the MS. Our goal is to leverage these capabilities for the reconstruction of displaced muon vertices, with a particular focus on both trigger-level and offline reconstruction. The focus is on the new L0 global system of the upgraded ATLAS experiment, which will enable the deployment of the ML model to run directly in the first stage of the trigger. This advancement opens the possibility of identifying relevant events in real time, without relying on conventional first-level trigger signatures as required in the current system. Success in this area could enable the exploration of leptonic vertex-based LLP channels for the Phase-2 HL-LHC that are currently beyond the technical reach of ATLAS.
Speaker: Davide Di Croce (CERN) -
105
Online Track Seeding with Edge-Classification GNNs in High Occupancy Environments
The Kalman-Filter tracking algorithm has proven highly successful throughout the history of the LHC experiments. While it performs very well at moderate occupancies, both for track finding and fitting, the combinatorics become prohibitive at high occupancies. This is particularly true in the initial phase of the tracking, when the seed parameters are not yet constrained and brute-force combinatorial seeding with the full Kalman machinery becomes exceedingly expensive. To address this problem, a cellular automaton algorithm is successfully used. However, scaling behaviour and seeding quality can be improved using modern machine learning techniques. Track seeding is an initial step for track reconstruction and closely links to current machine learning projects pursued in the ALICE collaboration. Graph neural networks have shown great success for problems related to tracking in various LHC experiments (e.g. https://arxiv.org/html/2504.04670v1). With the fully GPU-based online reconstruction and the upgrade of the ALICE online computing farm for Run 4 and beyond, the experiment is well suited to operate machine learning algorithms in the online system. While a full track reconstruction using GNNs is potentially unfeasible (computational bounds), track seeding limits the application to a variable subset of data where it is needed. Improving the track seeding will not only benefit the tracking algorithm, but also the particle identification capabilities of the experiment by improving cluster overlaps and dE/dx estimations. In general, this approach is experiment agnostic and can also benefit the tracking detector design proposed for ALICE 3 and beyond.
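A toy sketch of the edge-classification formulation of seeding, in plain PyTorch (hit features, graph construction, training data and cuts are placeholders): candidate connections between hits are scored, and high-scoring edges are kept as seeds.

import torch
import torch.nn as nn

class EdgeClassifier(nn.Module):
    def __init__(self, hit_features=3, hidden=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(hit_features, hidden), nn.ReLU())
        self.edge_mlp = nn.Sequential(nn.Linear(2 * hidden, hidden), nn.ReLU(),
                                      nn.Linear(hidden, 1))

    def forward(self, hits, edge_index):
        h = self.encoder(hits)                     # per-hit embeddings
        src, dst = edge_index                      # candidate edges (e.g. from a geometric pre-selection)
        return self.edge_mlp(torch.cat([h[src], h[dst]], dim=1)).squeeze(1)

hits = torch.randn(500, 3)                         # toy (x, y, z) hit positions
edge_index = torch.randint(0, 500, (2, 4000))      # toy candidate edges
labels = torch.randint(0, 2, (4000,)).float()      # 1 = both hits from the same true track

model = EdgeClassifier()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
for epoch in range(5):
    loss = nn.functional.binary_cross_entropy_with_logits(model(hits, edge_index), labels)
    optimizer.zero_grad(); loss.backward(); optimizer.step()

seeds = edge_index[:, torch.sigmoid(model(hits, edge_index)) > 0.9]   # edges kept as seeds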
Speaker: Christian Sonnabend (CERN, Heidelberg University (DE)) -
106
Distributed Reconstruction in FPGAs and ASICs for Trigger Systems
End-to-end optimization of trigger system architectures implementing distributed deep learning (DL) across all layers. Integrated within low-latency, high-speed Field Programmable Gate Arrays (FPGAs) and Application-Specific Integrated Circuits (ASICs), the goal is to use on-detector DL-based encoded data directly. Additionally, enable end-to-end reconstruction of physics objects - such as electrons, photons, and jets - directly from raw sensor data, eliminating the need for intermediate steps like clustering. A single, optimally distributed, DL algorithm is expected to outperform traditional multi-stage processing methods, while operating within existing hardware constraints. This unified approach simplifies system architecture and leverages the evolving capabilities of FPGAs and optimally matches them to ASIC capabilities.
The proposal directly enhances the potential for groundbreaking discoveries and the observation of rare processes already at the HL-LHC, exploiting the CMS HGCAL, with the FE ECON-T ASIC and BE Trigger Primitive Generator Stages 1 and 2 FPGAs.
Goals:
- On-Detector DL-Based Data Utilization: Develop methods to directly use DL-encoded data from on- detector electronics in trigger decisions. Rather than pass raw hits or tower sums, the detector will output high-level features (via a small on-detector neural network in an ASIC).
- End-to-End High-Level Object Reconstruction: Enable reconstruction of physics objects (electrons, jets, etc.) end-to-end within the trigger, skipping traditional layer-by-layer clustering.
- Optimized Distributed DL for FPGA Triggers: Demonstrate a distributed deep learning pipeline where components of a DL model run on different hardware components (e.g., first layers on an on-detector ASIC, latter layers on following FPGAs).
- Demonstrate these developments with the conditions and electronics of the High Granularity Calorimeter, which will be integrated into the Phase 2 upgrade of the CMS experiment at the HL-LHC.
Speaker: André David (CERN) -
107
Common-mode noise rejection for the Phase 2 CMS HGCAL
Development, characterization, and integration of ML models to reject common-mode noise in the CMS HGCAL for the HL-LHC.
Speaker: Arne Christoph Reimers (CERN) -
108
AI-Enabled Custom Chips for Front-End and Back-End Electronics in HEP Experiments
This proposal explores the feasibility and potential of custom AI chips for both front-end and back-end electronics in High Energy Physics (HEP) experiments. The objective is to investigate whether AI-enabled devices can provide efficient data reduction by compression, filtering, and reconstruction close to the detector, and to define the technological and research pathways toward AI-enabled readout systems.
The activity is structured along three complementary prongs:
- Frontend AI feasibility:
- Investigate feasibility of embedding AI-enabled logic blocks in front-end ASICs
- Benchmark commercial, open-source, and CERN-based IP (e.g., CEVA SensPro2, Eyeriss, VTA, hls4ml, new CERN IP?)
- Study whether a “general-purpose NN block” could become a standard frontend component (similar to lpGBT for interfaces)
- Backend AI demonstrator:
- Deliver a first custom CERN AI/ML processor demonstrator (FPGA/ASIC) targeting backend electronics, building on the conifer FPU project
- Develop a standalone processor with I/O compatible with existing backend protocols (e.g., CSP for CMS backend electronics).
- Demonstrate prototype reconstruction and filtering applications for the CMS Trigger using the device
- R&D questions:
- Architectural: can detector architectures integrate AI readout without fundamental redesign?
- Architectural: evaluate the merits of different data reduction schemes: compression, filtering, reconstruction/summarisation in different use cases (e.g. tracker, calorimeter, trigger)
- Technological: which AI technologies are most suitable given power, area, latency, and throughput/bandwidth constraints?
- Technological: assess neuromorphic architectures for low power AI-enabled readout
- Radiation hardness: explore rad-hardness of AI chips and AI algorithms; investigate synergies with ongoing EP R&D WP5 efforts and external Marie Curie networks
- AI for chip design: evaluate AI-assisted floorplanning and reliability-by-design strategies enhanced by AI
These activities will converge toward answering whether a standardized family of AI-enabled readout processors is feasible, and if so, what the roadmap toward tapeout and adoption should look like.
Speaker: Sioni Paris Summers (CERN) -
109
Anomaly Detection in the CMS Level 1 Trigger (Run 3 & Phase 2)
Training, deploying, operating, and analysing anomaly detection triggers in the CMS Level 1 Trigger. In Run 3 CMS is using the AXOL1TL and CICADA triggers to collect anomalous events towards unbiased new physics searches. For the Phase 2 Upgrade, anomaly detection will benefit from the higher fidelity information provided by performing PF and PUPPI reconstruction in the L1T. Research is getting started into the best framing of the anomaly detection problem as a learned embedding / representation. CMG members are active in all of the areas of development from ML methods to the offline analysis.
Speaker: Maciej Mikolaj Glowacki (CERN) -
110
LLM-based Reasoning tools for front-end and back-end processing algorithm design
Inspired by the success of hls4ml in translating high-level machine learning models into FPGA firmware, this initiative envisions training and adapting LLMs capable of producing hardware description language (HDL) code, such as Verilog or VHDL, directly from algorithmic or high-level specifications. While the initial motivation is to enable neural network deployment, the scope extends to a broader class of designs, including rule-based algorithms and other non-ML workloads. In contrast to hls4ml, which primarily targets FPGA implementations, this framework will support the generation of complete hardware projects suitable for both FPGA and ASIC design flows. Beyond code synthesis, the proposed system will incorporate capabilities for formal verification of hardware designs, allowing the LLM to reason about correctness in a manner analogous to mathematical proofs. Furthermore, it will explore predictive reasoning techniques to approximate the outcomes of computationally intensive chip verification processes, thus offering rapid feedback and guidance during design iterations. This proposal outlines a step toward integrating generative AI with electronic design automation (EDA), bridging algorithmic descriptions and hardware implementation.
Speaker: Maurizio Pierini (CERN) -
111
ML Anomaly detection Triggers with Continuous Flows and Vector Field-based algorithm
Building on the preliminary work of https://arxiv.org/abs/2508.11594, we would like to deploy agnostic anomaly-detection triggers based on continuous flows for the first time in an experiment. Additionally, we would like to investigate possible applications of other ML algorithms based on vector fields, such as diffusion models, to the task of anomaly detection in triggers, while innovating and improving flows and existing algorithms for the task.
Speakers: Francesco Vaselli (Scuola Normale Superiore & INFN Pisa (IT)), Maurizio Pierini (CERN) -
112
CMS: Anomaly detection and event classification with transformers for L1 Scouting on Versal AI engine
Anomaly detection plays a key role as a novel strategy for trigger systems and real-time data analysis. This project, funded by Oracle Corporation and CERN openlab, focuses on developing and deploying AI models on the latest generation of AMD FPGAs (Versal with "adaptive intelligence" AI engine accelerators) for the L1 scouting system of CMS at the HL-LHC. The goal of this project is to implement a transformer-like model that can be used for particle- or event-level classification, or fast event reconstruction, at the level of the L1 Scouting hardware readout boards. Leveraging AI engines for the compute-heavy transformer architecture, in combination with the relaxed latency constraints of L1 scouting (in comparison to the L1 trigger itself), allows us to attempt more advanced ML algorithms/models than have previously been feasible in FPGAs.
Speaker: Elias Leutgeb (CERN) -
113
Unsupervised Anomaly Detection for the Trigger
This proposal outlines a research programme to develop low-latency, unsupervised anomaly detection algorithms for deployment within the hardware trigger systems of modern High Energy Physics (HEP) experiments. Focusing on calorimeter data, the goal is to identify rare or previously unseen physics signatures that evade standard trigger logic, using methods such as Auto-Encoders (AE), normalising flows, and one-class classifiers. In addition, partially supervised anomaly detection strategies will be explored to incorporate limited labelled data when available, improving sensitivity to both known and unknown beyond-the-standard-model signatures. To ensure compatibility with existing trigger infrastructure, the research will pursue a trigger-aware anomaly detection framework, where the existing trigger menu is integrated into the model architecture or decision logic. Probabilistic methods, including Bayesian networks, will be investigated as a means to explicitly encode prior knowledge of the trigger system and to provide uncertainty-aware decision-making. This will allow the anomaly detection layer to operate synergistically with established trigger algorithms while maintaining robustness against fluctuations in detector conditions. Models will be optimised for inference on resource-constrained, low-latency platforms such as Xilinx Versal Field Programmable Gate Arrays (FPGAs), with quantisation and fixed-point implementation to meet stringent timing constraints. The project targets deployment within ATLAS Phase-II and extends applicability to future detectors such as those at the Future Circular Collider (FCC). The outcome is expected to enhance discovery potential by leveraging cutting-edge anomaly detection methods while preserving trigger efficiency and operational resilience under evolving experimental conditions.
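A minimal sketch of the auto-encoder variant of this approach (toy data and illustrative sizes; quantisation-aware retraining and firmware generation, e.g. with hls4ml, would follow for FPGA deployment):

import numpy as np
from tensorflow import keras

n_inputs = 57                                      # e.g. flattened calorimeter-trigger features
autoencoder = keras.Sequential([
    keras.layers.Input(shape=(n_inputs,)),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(8, activation="relu"),      # bottleneck
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(n_inputs),
])
autoencoder.compile(optimizer="adam", loss="mse")

x_train = np.random.rand(10000, n_inputs).astype(np.float32)   # stands in for ordinary (zero-bias) data
autoencoder.fit(x_train, x_train, epochs=5, batch_size=256, verbose=0)

def anomaly_score(x):
    # mean squared reconstruction error per event; large values -> trigger accept
    return np.mean((x - autoencoder.predict(x, verbose=0)) ** 2, axis=1)

threshold = np.quantile(anomaly_score(x_train), 0.999)          # fixes the accept rate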
Speakers: Ioannis Xiotidis (CERN), Markus Elsing (CERN), Thorsten Wengler (CERN) -
114
Super resolution models for offline-like reconstruction in the scouting stream
CMS is investing resources in the scouting stream, but its use so far has been limited to a few applications, mostly with jets and muons. Scouting has more potential than that, particularly with its extension at L1, which is being investigated in Run 3 and will reach its full scope at the HL-LHC. One of the limiting factors towards a broad use of scouting is the loss of resolution online, partly due to resource constraints, with respect to the offline reconstruction. We propose to address this issue using super-resolution models: per-object (jets, leptons, etc.) resolution and object-identification upscaling models, similar to what is done in data analysis to unfold high-level features to a generator-level view of the event. This naturally extends to the standard L1T and HLT streams and reconstruction.
Speaker: Sebastian Wuchterl (CERN) -
115
ML in the CMS Phase 2 Level 1 Trigger
Ongoing activities include, but are not limited to:
- Jet tagging [CMS-DP-2025-032]
- Electron reconstruction [CMS-DP-2023-047, CMS-DP-2024-098]
- Vertex reconstruction (usage in correlator) [CMS-DP-2021-035, CMS-DP-2022-020]
- ML for Puppi
- Event selection (especially VBF, di-Higgs)
- Soft tau reconstruction in the Level-1 Data Scouting system
Speaker: Sioni Paris Summers (CERN) -
116
Hit-based flavour tagging applications at trigger level using AI/DL
Current flavour-tagging algorithms at the LHC rely on reconstructed tracks to capture the signatures of displaced heavy-flavour decays. This approach requires a full track reconstruction, which is computationally expensive and not available at the earliest trigger levels in ATLAS. In this work we explore the potential of hit-based b-tagging, i.e. exploiting raw hit patterns in silicon trackers and calorimeters for heavy-flavour discrimination, without explicit track reconstruction.
The first case we investigate stems from two forward-looking assumptions. First, that future silicon detectors, beyond the ATLAS ITk upgrade, may feature significantly faster readout speeds, enabling tracker hit information to be accessible already at the earliest trigger stages. Second, that such detectors may provide per-hit timing information with sufficient precision to disentangle pileup and reduce combinatorial confusion. The possible replacement of the two innermost pixel layers, which will take place beyond LHC Run 4, is on the right time-scale for this type of upgrade, and we want to investigate whether current state-of-the-art b-jet AI/DL algorithms could provide enough discrimination power with information from only two tracker layers, before tracking is performed. This has large potential to considerably improve higher-level trigger stages and to have a big impact on the discovery of key HL-LHC physics benchmarks featuring heavy-flavour final states, such as double Higgs production.
This use-case also motivates the development of general ML models for hit-based b-tagging, capable of learning directly from raw detector information. Such approaches could recover the displaced-hit signatures of heavy-flavour decays, while bypassing the need for full track reconstruction, thus enabling early and efficient heavy-flavour triggers. More broadly, it opens the question of whether sufficiently expressive architectures can approximate or surpass traditional track-based b-tagging performance by leveraging the full richness of the hit-level data.
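The sketch below illustrates one possible shape for such a model: a DeepSets-style network that embeds individual hits and pools them into a per-jet b-tag score, with no track reconstruction. Hit features, shapes and the architecture are illustrative assumptions rather than the proposed design.

```python
# Minimal sketch of a hit-based tagger: a DeepSets-style model that pools
# per-hit embeddings into a jet-level b-tag score, bypassing track reconstruction.
# Hit features and tensor shapes are illustrative assumptions.
import torch
import torch.nn as nn

class HitTagger(nn.Module):
    def __init__(self, n_hit_feats=5, n_embed=32):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(n_hit_feats, n_embed), nn.ReLU(),
                                 nn.Linear(n_embed, n_embed), nn.ReLU())
        self.rho = nn.Sequential(nn.Linear(n_embed, 32), nn.ReLU(),
                                 nn.Linear(32, 1))

    def forward(self, hits, mask):
        # hits: (batch, max_hits, n_hit_feats); mask: (batch, max_hits), 1 = real hit
        h = self.phi(hits) * mask.unsqueeze(-1)
        pooled = h.sum(dim=1) / mask.sum(dim=1, keepdim=True).clamp(min=1)
        return torch.sigmoid(self.rho(pooled)).squeeze(-1)   # b-jet probability

hits = torch.randn(8, 50, 5)                   # e.g. (x, y, z, time, charge) per hit
mask = (torch.rand(8, 50) > 0.3).float()       # pad mask for variable hit multiplicity
score = HitTagger()(hits, mask)
```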
From a long-term perspective, this project has two key aspects: portability to future collider experiments, through implementation in the Open Data Detector available via the A Common Tracking Software (ACTS) infrastructure, and a tight interplay with detector technologies, readout and design, maximising the connections across various CERN groups.
Speakers: Lorenzo Santi (CERN), Markus Elsing (CERN) -
117
Enhancing Classical Trigger Algorithms with Machine Learning
This proposal outlines an R&D programme to integrate Machine Learning (ML) models into existing heuristic-based trigger and reconstruction algorithms in High Energy Physics (HEP) experiments. Using ATLAS Phase-II as an example and extending towards the Future Circular Collider (FCC) era, the goal is to demonstrate how ML can improve the performance of traditional, well-understood algorithms in terms of resolution, efficiency, and robustness, while maintaining or reducing latency. The programme will investigate augmenting algorithms such as the Kalman Filter (KF) for track reconstruction and calorimeter-based clustering for jet triggering with ML-driven corrections or seeding. This work will be hardware-conscious, focusing on low-latency implementations on Field Programmable Gate Array (FPGA) and heterogeneous platforms, and validated with both simulation and real data. The outcomes will inform design decisions for the next generation of Trigger and Data Acquisition (TDAQ) systems.
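As a toy illustration of the "ML-augmented classical algorithm" idea, the sketch below adds a learned residual correction to the prediction step of a one-dimensional Kalman filter. The state model, the correction network and its (untrained) weights are placeholders, not an actual ATLAS track model.

```python
# Minimal sketch: a 1D Kalman filter whose prediction step is augmented by a
# learned correction (here an untrained placeholder network). The state model
# and dimensions are illustrative only.
import numpy as np
import torch
import torch.nn as nn

ml_correction = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 2))

def kf_step(x, P, z, F, H, Q, R):
    # Classical prediction ...
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # ... plus a learned residual correction (e.g. for material effects).
    with torch.no_grad():
        dx = ml_correction(torch.tensor(x_pred, dtype=torch.float32)).numpy()
    x_pred = x_pred + dx
    # Standard Kalman update with the measurement z.
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new

x, P = np.zeros(2), np.eye(2)
F = np.array([[1.0, 1.0], [0.0, 1.0]])     # constant-velocity model
H = np.array([[1.0, 0.0]])
Q, R = 1e-3 * np.eye(2), np.array([[0.1]])
for z in np.array([[0.9], [2.1], [3.0]]):  # toy measurement sequence
    x, P = kf_step(x, P, z, F, H, Q, R)
```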
Speakers: Ioannis Xiotidis (CERN), Thorsten Wengler (CERN) -
118
GPU (de)compression
In the HL-LHC era, we expect a significant increase in the amount of data and compression can serve as an effective tool to reduce storage requirements. As more and more computing facilities are equipped with GPUs, enabling lossless (de)compression directly on GPUs can reduce costly memory transfers to/from the GPU, remove reliance on the CPU for compression tasks, and increase the effective storage capacity of GPU memory. These advantages could greatly improve the performance in applications where the GPU data transfers or memory capacity become the bottleneck. However, current GPU compressors are either too slow or do not compress well enough for our purposes.
A compression algorithm essentially consists of a pipeline of encoders. We propose to train an ML model that selects the optimal combination of GPU encoders, effectively building a "new" compression algorithm that optimises the trade-off between throughput and compression ratio for HEP data and HEP workflows. With this project, we aim to determine how much we benefit from GPU (de)compression in practice, or whether existing encoding solutions are unsuitable for our needs and an in-house encoding solution tailored to our data needs to be developed.
Further information can be found here: https://docs.google.com/document/d/1hU86jlNSCHE-pV8m-CNQ3dj_fo2RfpYOzJf6lsWxrA8/edit?tab=t.0
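As a rough illustration of the encoder-selection idea described above, the sketch below scores a few candidate encoder pipelines on a data block and picks one according to a ratio-versus-throughput trade-off. CPU codecs from the Python standard library stand in for GPU encoders, and the scoring weight is an arbitrary assumption; in the proposed project the choice would be predicted by a trained model from block features rather than measured by brute force.

```python
# Minimal sketch: pick a candidate encoder for a data block based on a simple
# compression-ratio vs. throughput score. CPU codecs stand in for GPU encoders.
import time, zlib, lzma, bz2
import numpy as np

candidates = {
    "zlib-1": lambda b: zlib.compress(b, level=1),
    "zlib-9": lambda b: zlib.compress(b, level=9),
    "lzma": lambda b: lzma.compress(b),
    "bz2": lambda b: bz2.compress(b),
}

def pick_encoder(block: bytes, ratio_weight=0.5):
    best_name, best_score = None, -np.inf
    for name, enc in candidates.items():
        t0 = time.perf_counter()
        out = enc(block)
        dt = time.perf_counter() - t0
        ratio = len(block) / len(out)
        throughput = len(block) / dt / 1e6        # MB/s
        score = ratio_weight * ratio + (1 - ratio_weight) * np.log10(throughput)
        if score > best_score:
            best_name, best_score = name, score
    return best_name

block = np.random.randint(0, 16, size=100_000, dtype=np.uint8).tobytes()
print(pick_encoder(block))
```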
Speaker: Jolly Chen (CERN & University of Twente (NL)) -
119
-
97
-
16:00
Coffee Break
-
Large Language Models-based assistants 40/S2-A01 - Salle Anderson
-
120
GINO: a Grid-INtelligent Operator for Monte Carlo and Analysis
We propose an on-premises, agentic system built on open-weight models to automate repetitive tasks in grid-based analyses and MC production, cutting the manual effort of submitting jobs for both operators and experts. The agent integrates with existing middleware (in the case of ALICE: Hyperloop, MonALISA, and jAliEn), using retrieval-augmented generation to interpret production and analysis requests and policies. A planner-executor loop coordinates the end-to-end tasks of input preparation, job submission, monitoring, adaptive retry/backoff for transient site issues, output collection and artifact registration, while maintaining durable state (job registry, artifact tracker) and full provenance. Safety and governance are built in via role-based access, quotas, allowlists, dry-run/approval modes, change windows and clear escalation paths. Structured event logs supporting dashboards, alerts and post-mortems will further enhance the experience of analysers and users, who will be able to understand if and why issues arose during the submission of their analysis or MC workflows.
For the collaboration, this reduces operational load, shortens time from request to validated datasets and improves reproducibility. The agent enforces policy consistently (e.g. priorities, resource quotas), learns common failure modes to choose between resubmission, rerouting, or escalation and avoids orphaned or inconsistent productions.
Within the agentic-AI landscape, this system embodies a stateful, tool-using planner rather than a one-shot assistant or brittle script. Emphasis is placed on reliable adapters and explicit safety rails, not model "cleverness". Implementing open-weight models on our hardware ensures data residency, predictable cost and the ability to fine-tune on collaboration-specific schemas and log patterns, with no external calls at inference time.
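A minimal sketch of the planner-executor loop with durable state and retry/backoff; the plan, the tool adapters and the job handle are placeholders for the real Hyperloop/MonALISA/jAliEn integrations, not an actual implementation.

```python
# Minimal sketch of a planner-executor loop with durable state, retry/backoff
# and an escalation path. Step names and the job handle are placeholders.
import time

def plan(request):
    # A real planner would be LLM-driven; here the plan is hard-coded.
    return ["prepare_input", "submit_job", "monitor_job", "collect_output"]

def execute(step, state):
    if step == "submit_job":
        state["job_id"] = "sim-0001"        # placeholder job handle
    state.setdefault("done", []).append(step)
    return True                              # pretend the step succeeded

def run(request, max_retries=3):
    state = {"request": request}             # durable state (job registry, artifacts)
    for step in plan(request):
        for attempt in range(max_retries):
            try:
                if execute(step, state):
                    break
            except Exception:
                time.sleep(2 ** attempt)     # exponential backoff for transient issues
        else:
            state["escalated"] = step        # clear escalation path after retries
            break
    return state

print(run("produce MC anchored to a given run"))
```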
Speaker: Maximiliano Puccio (CERN) -
121
AI Chatbot for the ALICE O² Bookkeeping System
The ALICE O² Bookkeeping system provides essential log entries of activities in the experimental area, along with run, run-quality, environment and LHC-fill metadata, statistics, and quick-visualization plots. While powerful, the system requires frequent updates to its graphical user interface and demands familiarity with API queries or manual navigation, which can be a barrier for shifters and physicists. Some experts address this by writing scripts in languages such as Python to automate tasks, but these solutions are limited to a small group of technically skilled users.
This proposal introduces the development of an improved and more efficient tool that can serve each user (shifter, expert, etc.) in a tailored manner. The proposal presents an AI-powered chatbot that can interpret natural-language queries, translate them into Bookkeeping API requests, and present results in accessible forms such as summaries, tables, or plots. Crucially, the chatbot will never have direct access to the database; all information will be retrieved exclusively through the existing HTTP API via the Bookkeeping service itself. Unlike approaches that combine LLMs with Retrieval-Augmented Generation (RAG) tools over vectorized database contents, the proposed design focuses on understanding the user's query and on controlled API orchestration, ensuring usability, data security and transparency.
The project would act as a conversational assistant, enabling users to interact with the system naturally. By combining natural language understanding with controlled API access, the chatbot would simplify information retrieval and reduce overhead for both experts and non-experts. For example, the chatbot will parse user queries like “Show me all TPC runs last week with more than 95% good data as per the QC flags” and convert them into valid Bookkeeping HTTP API calls. For this to work, a translation layer will validate requests before execution, ensuring compliance with the API schema. Results will be processed into human-friendly outputs such as lists, summaries, or plots. At no stage will the AI model access or query the underlying database directly.
By using this approach, we ensure that answers are grounded in real data, reducing common issues encountered in other approaches, such as hallucinations. This is possible because the LLM is not used to generate data based on given content; it is only used to interpret the user's query, which then yields an answer based on an API call over data stored in Bookkeeping. Another benefit is that the data would not need to be anonymised, as the model would never read or receive the content of the resulting query. Furthermore, as the ALICE O² Graphical User Interfaces (GUIs) have been designed in a reusable, component-based manner (i.e. sharing a common server yet being independently deployed), the chatbot can be integrated into the rest of the applications, improving the workflow and efficiency of all experts across the ALICE control room.
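The controlled orchestration could look roughly like the sketch below: the LLM output is reduced to a structured intent, validated against an allowlisted schema, and only then turned into an HTTP call. The endpoint, parameter names and the llm_parse helper are hypothetical placeholders, not the real Bookkeeping API.

```python
# Minimal sketch of controlled API orchestration: parse -> validate -> call.
# The endpoint, fields and the llm_parse helper are hypothetical placeholders.
import requests

ALLOWED = {
    "list_runs": {"path": "/api/runs",
                  "params": {"detector", "from", "to", "minGoodFraction"}},
}

def llm_parse(query: str) -> dict:
    # Stand-in for the LLM interpretation step (no database access here).
    return {"intent": "list_runs",
            "params": {"detector": "TPC", "from": "2025-09-01", "minGoodFraction": 0.95}}

def validate(intent: dict) -> dict:
    spec = ALLOWED.get(intent["intent"])
    if spec is None:
        raise ValueError("intent not allowed")
    unknown = set(intent["params"]) - spec["params"]
    if unknown:
        raise ValueError(f"unknown parameters: {unknown}")
    return spec

def answer(query: str, base_url="https://bookkeeping.example.cern"):
    intent = llm_parse(query)
    spec = validate(intent)                          # the model never touches the DB
    resp = requests.get(base_url + spec["path"], params=intent["params"], timeout=10)
    resp.raise_for_status()
    return resp.json()                               # summarised or plotted downstream
```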
Speaker: Mr George Raduta (CERN) -
122
AI Assistant for ATLAS operations and beyond
The operation and maintenance of experiments like ATLAS require expertise across many domains, particularly during interventions or unexpected events. While much knowledge is documented in CERN’s Engineering Data Management Service (EDMS), the system is fragmented, with limited metadata and diverse formats that hinder quick access. To address this, the Expert System tool (https://doi.org/10.1051/epjconf/201921405035) was developed to centralize and simplify access to the experiment’s knowledge base. It provides intuitive navigation, highlights interdependencies between subsystems, and simulates the interactions of approximately 13000 objects through 89000 relationships stored in the ATLAS TDAQ object-oriented configuration database, referred to as OKS (https://doi.org/10.1109/23.710971).
Any triggered Detector Safety System (DSS) alarm requires operators to identify its cause, point of failure, and criticality, which depends on factors such as the affected subsystem and the experiment’s operational mode. Alarm recovery is resource-intensive and often involves multiple stakeholders, with the Shift Leader in Matters of Safety (SLIMOS) acting as first responder. Their main task is to determine whether an alarm stems from an error or from an intervention, as this dictates the follow-up. For example, a cooling plant shutdown may signal a system fault or scheduled maintenance. While true errors demand expert recovery procedures, intervention-related alarms can be resolved more easily but still consume resources and reduce overall alertness. To address this, an Alarm Helper tool was introduced in 2023 for LS2, following an alarm analysis that showed that most of the alarms were caused by interventions.
The HL-LHC upgrades will bring significant enhancements to the ATLAS detector and its infrastructure, particularly the new Inner Tracker and the CO2 cooling system, which will result in changes to established concepts and in a considerable number of interventions that might slow down the restart of operations. Since the LEP era, efforts to improve detector operations have combined automation with the accumulated expertise of operators. A natural evolution of this approach is the development of language-based AI assistants that focus on usability and explainability. Rather than replacing human decision-making, such tools would harness operator knowledge while lowering the barrier to accessing complex information.
This project aims to embed itself in a CERN-wide effort to transform how operators interact with the detector, ensuring that expertise is more widely accessible, decision-making remains human-driven, and operational efficiency and safety are enhanced as the HL-LHC era begins. The objective is to use an open-source Large Language Model (LLM) as the generative transformer, evaluate different inference engines for the infrastructure, including commercial solutions and those developed at CERN, and use Retrieval-Augmented Generation (RAG) to feed the documentation into the LLM. By building on the Expert System’s structured descriptions and graph-based algorithms, enriching them with the time trends and correlations captured by the Alarm Helper, and introducing the current status of the detector based on DSS and Detector Control System (DCS) information, this assistant could be implemented in ATLAS in the first instance, providing intuitive, natural-language explanations of alarms and subsystem behaviour and setting a new standard in detector operations. We aim to use the ATLAS Expert System as a real-life deployment target; however, dedicated care will be taken during the development of the assistant model to abstract the feed-in data sources and model interconnects, in order to allow the generalization of the developed technology to other use cases at CERN.
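The retrieval step could look schematically like the sketch below, where documentation snippets are ranked against an alarm description and assembled into a grounded prompt. The corpus, the TF-IDF retriever (standing in for a proper vector store) and the llm_generate call are placeholders.

```python
# Minimal sketch of the RAG step: retrieve relevant documentation snippets for an
# alarm and assemble a grounded prompt for an open-source LLM. The corpus and the
# llm_generate call are placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "DSS alarm COOLING_PLANT_STOP: check intervention calendar before escalation.",
    "OKS relationship: cooling plant -> ITk endcap power supplies.",
    "DCS procedure for restarting the CO2 cooling loop after maintenance.",
]

vec = TfidfVectorizer().fit(docs)
doc_matrix = vec.transform(docs)

def retrieve(query, k=2):
    sims = cosine_similarity(vec.transform([query]), doc_matrix)[0]
    return [docs[i] for i in sims.argsort()[::-1][:k]]

alarm = "Cooling plant stopped: which subsystems are affected, and is this an intervention?"
context = "\n".join(retrieve(alarm))
prompt = (f"Using only the context below, explain the alarm.\n\n"
          f"Context:\n{context}\n\nQuestion: {alarm}")
# response = llm_generate(prompt)   # hypothetical call to the chosen inference engine
```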
Speaker: Carlos Solans Sanchez (CERN) -
123
AI-assisted Coding Tools for LHCb
The code used in the LHCb data processing frameworks is becoming more and more complex to enhance flexibility while maintaining high performance. This leads to a steep learning curve for beginners and to relatively few people with a complete overview of the experiment’s software, creating a gap between the coding skills necessary to write performant code and the number of people willing to contribute.
Several commercial AI-assisted code generation tools are available; however, they lack context-awareness of the LHCb software framework.
We propose to either develop a CERN-based tool or to make context-aware commercial tools available for all LHCb users, thus allowing a larger community to write high-quality code for operating the experiment and analysing its data. This work could also be combined with the work proposed for the Analysis Chatbot and/or the AI for training and documentation.
Speaker: Titus Mombächer (University of Cincinnati (US)) -
124
AI for training and documentation at LHCb
LHCb has a vast and distributed volume of documentation, software, and operational knowledge, creating a significant barrier to entry for new researchers and a persistent challenge for information retrieval. To address this "knowledge-access gap", we propose the development of LHCb-GPT, a specialized LLM designed to serve as an intelligent assistant for LHCb researchers. We will train a state-of-the-art foundation model on the comprehensive set of internal LHCb content, including the highly successful LHCb Starterkit, extensive TWikis, software repositories, and technical notes. The resulting tool will provide a conversational interface for complex queries, enabling rapid onboarding, efficient problem-solving, and, importantly, streamlined documentation creation with context. By transforming how researchers access and interact with the collective expertise gained over the last two decades, LHCb-GPT aims to enhance productivity and establish a new, scalable paradigm for knowledge management within large-scale scientific collaborations.
Speaker: Dr Nicole Skidmore (University of Warwick) -
125
LHCb Analysis Chatbot
Physics analysis at the LHCb experiment demands that researchers possess a dual expertise: deep knowledge of particle physics and mastery of a complex, ever-evolving software stack. This necessity creates a significant bottleneck, steepening the learning curve for new collaborators and diverting experienced physicists' time from discovery to software engineering. To address this challenge, we propose a "blue-sky" initiative to develop a conversational Analysis Chatbot, an AI-powered assistant capable of translating high-level physics objectives into executable code. By fine-tuning an LLM on the entire LHCb software ecosystem—from the core Gaudi framework to user analysis libraries and data-handling protocols—this tool will allow researchers to specify analysis strategies in natural language. The chatbot will then generate the required code, manage job submissions, and handle data retrieval, effectively abstracting away the underlying software complexity. This project aims to radically streamline the analysis workflow and reduce "time-to-physics".
Speaker: Dr Nicole Skidmore (University of Warwick) -
126
LLM for HEP co-pilot
Content: within the CMG group, there is an ongoing effort to explore commercial and open-weight LLMs for editorial work at CERN. We propose two specific tasks to help with paper editing and proof-reading:
identify wrong scientific statements, wrong constant values, etc. (leaving the correction to the user);
fix grammar in the text, make plots according to a pre-defined style, etc.
Ultimately, this work aims to provide content- and style-driven tools for using LLMs as an assistant in writing scientific documents, while also highlighting current limitations in their representation of advanced quantum field theory concepts that future model development should address.
Speaker: Georgios Karathanasis (CERN) -
127
Proposal for a working group to develop AI for scientific writing at CERN
Publications are the primary vehicle for the transmission of scientific results. To convey these results in the clearest and most accurate way, researchers at CERN spend significant efforts on ensuring the quality of their prose. These efforts are particularly important for large experimental collaborations, where a handful of people in each publication committee are responsible for editing hundreds of articles per year to improve the writing of their colleagues (who are rarely native English speakers) and guarantee a uniform style among their papers. This not only consumes a lot of time, but the back-and-forth with the other authors about the form of the text is a distraction from more important discussions about the physics content.
Given the boom in the capacity of computers to “understand” and edit text during the last few years thanks to the advent of Large Language Models (LLMs), it seems that style improvements of article drafts could to a large extent be automated, allowing publication committees to focus on higher-level tasks. Unfortunately, no off-the-shelf tool currently solves this problem satisfactorily: free-to-use general-purpose tools such as ChatGPT pose confidentiality risks, while dedicated commercial solutions such as Grammarly have problematic licensing terms for CERN’s use case.
As such, the best approach to obtain such a tool would be to develop it internally, building on existing openly available LLMs and adapting them to our needs by leveraging the deep Machine Learning expertise of the CERN community.
Speaker: Micha Moskovic (CERN) -
128
Streamlining Technology Support of Distributed HEP Communities
The goal is to streamline support and reduce overheads across large communities using HEP technologies and navigating complex service environments (e.g., EOS, ROOT, ATLAS, CMS). By enabling natural-language assistants and context-aware queries in the shared Discourse AI platform, users will be able to query documentation and extract operational knowledge more intuitively, lowering the workload on experts and reducing the support workload in general. This would improve the accessibility of information and empower users to resolve issues independently. This activity should include an evaluation of which AI models, in-house or external, are the most appropriate for integration into CERN’s Discourse environments.
Speaker: Dr Maria Arsuaga Rios (CERN) -
129
Federated learning towards a common CERN chatbot
There is a growing interest in developing a dedicated CERN chatbot based on a Large Language Model (LLM). A first example in this direction is the accGPT project. To make such a service useful to the experimental physics community, the chatbot capabilities should include experiment-specific functions, such as editing papers in the style of a given collaboration, searching internal documentation, answering questions based on knowledge in TWiki pages and discussion fora, writing data analysis code, etc. To gain accuracy on these functionalities, the LLM should be trained on private datasets, whose access is restricted to members of the specific collaboration. To make this possible without breaking the restricted data access protocol, one should envision a cross-experiment federated learning approach, leveraging the experience gained by the CAFEIN team on this front. We propose to create a working group with the CAFEIN team and representatives from the Machine Learning communities of the various CERN experiments to work on such a federated learning model for a CERN chatbot.
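Schematically, a federated round could look like the sketch below: each collaboration fine-tunes locally on its private corpus and only weight updates, never data, are averaged centrally. The tiny linear model stands in for the LLM (or its adapter weights), and the data are synthetic placeholders.

```python
# Minimal sketch of federated averaging across collaborations: local updates on
# private data, central aggregation of weights only. The tiny model is a stand-in
# for LLM adapter weights.
import torch
import torch.nn as nn

def local_update(global_state, private_batches):
    model = nn.Linear(8, 8)
    model.load_state_dict(global_state)
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    for x, y in private_batches:            # data stays inside the collaboration
        opt.zero_grad()
        nn.functional.mse_loss(model(x), y).backward()
        opt.step()
    return model.state_dict()

def fed_avg(states):
    return {k: torch.stack([s[k] for s in states]).mean(dim=0) for k in states[0]}

global_state = nn.Linear(8, 8).state_dict()
per_experiment = [[(torch.randn(16, 8), torch.randn(16, 8))] for _ in range(3)]
for _ in range(5):                           # federated rounds
    updates = [local_update(global_state, data) for data in per_experiment]
    global_state = fed_avg(updates)
```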
Speaker: Maurizio Pierini (CERN) -
130
A multi-agent system for LHCb computing operations
The LHCb experiment increasingly relies on volunteer shifters, physicists, and developers to operate and analyse data, yet existing tools (e.g., LHCbDIRAC, LHCbPR) are often unintuitive, poorly documented, and generate cryptic errors that demand expert intervention. This mismatch creates cognitive overload, reduces motivation, and hampers productivity across the collaboration.
We propose an agent-assisted operational framework that leverages large language models (LLMs) equipped with retrieval-augmented generation (RAG), tool-calling agents, and the standardized Model Context Protocol (MCP). By integrating these AI components with LHCb’s software stack, through modular MCP servers, dedicated agents, and vector-based knowledge bases, the system can:
1. Guide shifters through daily tasks, summarising nightly builds, LHCbPR results, and distributed‑computing dashboards, and presenting relevant documentation on demand.
2. Assist physicists in submitting analyses, diagnosing workflow failures, and extracting actionable insights from logs, e‑logs, and Mattermost discussions.
3. Support developers by surfacing recurring operational issues, thereby freeing them to focus on user-experience improvements.
The architecture is deliberately volunteer-centric, open-source, interoperable, and explainable, ensuring that AI recommendations are auditable and privacy-preserving (local data processing where required). A modular design separates LHCb tools from the AI layer, allowing flexible deployment of remote inference endpoints (e.g., Hugging Face) or locally hosted models under CERN governance.
Preliminary integration with DiracX and LbAP MCP servers via Copilot Chat demonstrates feasibility, showing that LLM-driven agents can invoke domain-specific APIs without extensive retraining. Scaling this approach promises to reduce cognitive load, improve operational efficiency, and ultimately accelerate scientific output for LHCb, and, by extension, other CERN experiments, through smarter, AI-augmented workflows.
We believe the project can effectively kick off once some preliminary work has been done:
- A CERN-blessed LLM is identified (probably CERN hosted?)
- Necessary tools are deployed and maintained as-a-service (e.g. vector database)
- MCP servers for the common (CERN-IT-run) services (like GitLab or Mattermost; there might be several examples) are deployed.
This is a common task with other projects within LHCb and CERN, and synergies/collaboration are foreseen.
Speaker: Alexandre Franck Boyer (CERN) -
131
AI-Powered development tools: adoption challenges and access framework
The rapid evolution of and growing demand for AI-powered development tools present both unprecedented opportunities and significant challenges for organizations. As AI technologies become integral to research, product development, and operational workflows, a structured approach to managing these tools is essential.
This project explores the usage and governance of AI-powered development tools, with a focus on the associated risks related to intellectual property and model training, aiming to provide a centrally approved access framework.
Speaker: Ismael Posada Trobo (CERN) -
132
Catalog and infrastructure for LLM serving and fine-tuning
While core physics use cases currently rely on small models, other CERN use cases already call for integration with pre-built LLMs, with the required fine-tuning for different purposes.
This initiative builds on previous efforts in this area, such as AccGPT, to offer a centralized service with a catalog of pre-built models and the required integrations for selection, serving and fine-tuning using RAG and similar methods. It will look at tools already in use (vLLM) and research additional layers for the extra functionality (LiteLLM, Langfuse, etc.). It will also look at the non-open-source, pre-built vendor solutions from Mistral AI and other providers.
Speaker: Ricardo Rocha (CERN) -
133
Systematic Assessment of Common LLMs wrt. HEP Analysis Codes
As LLMs are increasingly used to help researchers design and implement HEP analyses, it is essential to understand, beyond anecdotal evidence, their strengths and weaknesses with respect to common coding tasks in HEP. To this end, a set of several tens of typical ROOT-specific questions should be sampled, e.g. from the ROOT forum. The questions should span different categories (e.g., physics, I/O, fitting, plotting, trivia) and types (code example, code correction, clarification/explanation, etc.). The questions should be catalogued such that they can be adapted over time. An evaluation scheme should be developed before feeding the questions to common LLMs. The assessment should include correctness, up-to-dateness, clarity, and robustness. This benchmark should be used to compare popular LLMs amongst each other and to provide general guidelines on what works well and where the pitfalls in their application lie. A possible further angle of this project is optimizing ROOT's publicly available resources (doxygen, tutorials, courses, etc.) for LLM ingestion.
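A possible shape for the question catalogue and the evaluation scheme is sketched below; the categories, rubric weights and the example question are illustrative assumptions, not a fixed design.

```python
# Minimal sketch of the benchmark catalogue and scoring scheme.
# Categories, weights and the example entry are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Question:
    qid: str
    category: str          # physics, I/O, fitting, plotting, trivia, ...
    qtype: str             # code example, code correction, explanation, ...
    text: str
    reference: str         # link to the ROOT forum thread or an expected answer

@dataclass
class Score:
    correctness: float     # 0-1: does the answer/code actually work
    up_to_date: float      # 0-1: uses current ROOT idioms (e.g. RDataFrame)
    clarity: float         # 0-1
    robustness: float      # 0-1: stable across rephrasings of the question
    def total(self, w=(0.5, 0.2, 0.15, 0.15)):
        return sum(wi * v for wi, v in zip(w, (self.correctness, self.up_to_date,
                                               self.clarity, self.robustness)))

catalogue = [Question("io-001", "I/O", "code example",
                      "How do I read a branch of an existing TTree with RDataFrame?",
                      "https://root-forum.cern.ch/...")]
results = {("io-001", "model-A"): Score(1.0, 0.8, 0.9, 0.7)}
print(results[("io-001", "model-A")].total())
```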
Speaker: Danilo Piparo (CERN) -
134
LLM compression
Study techniques to compress LLMs. This could become relevant for deploying CERN-specific LLMs (e.g., a chatbot) while minimizing the resources needed for inference.
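As one concrete example of such a technique, the sketch below applies post-training dynamic quantization to a small stand-in model with PyTorch; a real chatbot LLM would rely on dedicated low-bit quantization or pruning tooling, and the model here is only a placeholder.

```python
# Minimal sketch: post-training dynamic quantization of a stand-in model.
# The real target would be a transformer-based LLM with dedicated tooling.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 512))
quantised = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

x = torch.randn(1, 512)
out_fp32 = model(x)
out_int8 = quantised(x)   # same interface; Linear weights now stored as int8
```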
Speaker: Maurizio Pierini (CERN) -
135
Operational Intelligence for Computing Operations, enabled by Generative AI
Abstract
This proposal outlines a plan to relaunch the Operational Intelligence (OpInt) initiative, leveraging recent advances in Generative AI and AI Agent (AIA) technology to address the escalating complexity of distributed computing operations at CERN. While previous OpInt efforts demonstrated the value of data-driven insights, the landscape has now fundamentally shifted. AIAs offer a transformative opportunity to automate and simplify a wide range of operational tasks, moving beyond traditional analytics to create truly proactive and intelligent systems. The project will focus on two primary areas: enhancing distributed computing operations and improving on-site computing intelligence.
Background
The management of the Worldwide LHC Computing Grid (WLCG), a vast, heterogeneous, and distributed infrastructure, still requires significant manual intervention from developers, shifters, and site administrators. This is a substantial operational cost and a source of potential delays. Our past work on Operational Intelligence, documented in "Operational Intelligence for Distributed Computing Systems for Exascale Science" [1] and "Preparing Distributed Computing Operations for the HL-LHC Era With Operational Intelligence" [2], laid the groundwork by demonstrating the potential of machine learning to analyze operational data. However, the current evolution of AI, particularly the emergence of large language models and autonomous agents, presents a unique and timely opportunity to restart this work.
Proposed Solutions and Use Cases
We will focus on a phased approach, with initial efforts concentrated on the following high-impact use cases:
-
Automated Diagnostics for Data Transfers and Site Operations: Create AIAs to monitor system logs and data transfer services (e.g., FTS). These agents would:
- Predict Failures: Analyze log patterns from FTS transfers to predict potential failures before they occur (a minimal illustration of such a log-based classifier is sketched after this list).
- Streamline Troubleshooting: When an error is detected, the AIA would generate a preliminary diagnostic report, correlating the error with known issues (e.g., misconfigured batch nodes, network issues, recent software updates, etc.), thereby simplifying the lives of shifters and site administrators. A generative AI could also help operators, pointing them in the right direction and providing all relevant information from various sources in a user-friendly, centralized environment.
-
GGUS Ticketing Intelligence: Develop an AIA to assist with the Global Grid User Support (GGUS) ticketing system. The agent would:
- Proactively Assist Users: When a user submits a ticket, the AIA would analyze the problem description and automatically suggest similar, previously resolved issues, helping users find solutions more quickly.
- Automate Diagnostics for Shifters/Admins: For new tickets, the AIA would analyze the ticket text and historical data, hinting at potential root causes. For example, it could correlate a specific error message with a past configuration change or an outage at a particular site.
- On-site Intelligence: Deploy AIAs at computing sites to correlate experiment-specific errors with underlying infrastructure issues. For example, an AIA could monitor job failures from a specific experiment at a given site and flag if a particular batch node is consistently misconfigured or broken. Another valuable example is anomaly detection on storage (e.g. EOS) logs; this would enhance the precision of alarms, reduce false positives, and provide operators with actionable, early-warning insights into potential failures.
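Below is a minimal illustration of the log-based failure prediction mentioned above; the log lines, labels and model are invented for illustration, and a real system would use richer features from FTS monitoring.

```python
# Minimal sketch: flag transfers likely to fail from bag-of-words log features.
# Log lines and labels are invented for illustration only.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

logs = [
    "transfer completed checksum verified",
    "connection timeout to storage endpoint retrying",
    "transfer completed in 12s",
    "SRM_AUTHORIZATION_FAILURE no valid proxy",
]
failed = [0, 1, 0, 1]                 # 1 = transfer eventually failed

vec = CountVectorizer()
clf = LogisticRegression().fit(vec.fit_transform(logs), failed)

new_line = "connection timeout to storage endpoint"
p_fail = clf.predict_proba(vec.transform([new_line]))[0, 1]   # failure probability
```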
References
[1] Operational Intelligence for Distributed Computing Systems for Exascale Science — https://doi.org/10.1051/epjconf/202024503017
[2] Preparing Distributed Computing Operations for the HL-LHC Era With Operational Intelligence — https://doi.org/10.3389/fdata.2021.753409
Speakers: Alessandro Di Girolamo (CERN), Dr Maria Arsuaga Rios (CERN), Panos Paparrigopoulos (CERN) -
-
120
-
Training and Education 40/S2-A01 - Salle Anderson
-
136
STEAM Academy: a multi-year training backbone for Machine Learning at CERN
The STEAM Academy is the educational arm of the “Next-Generation Triggers” project. Between 2026 and 2028 it will deliver a rolling sequence of short, intensive courses that mix morning seminars and lectures with afternoon laboratories. The programme is organised around three interlocking themes: Edge Computing for Trigger and DAQ, Modern Software Technologies, and Data Science & Machine Learning, so that every participant develops a coherent skill-set rather than a collection of isolated techniques. We propose that the Data Science & Machine Learning theme becomes the common reference point for all RCS AI initiatives: it will cover the statistical foundations, the practical handling of columnar HEP data and ML frameworks, and the spectrum of modern models from GNNs to transformers and foundation models, with a strong emphasis on deployment on GPUs and FPGAs. Lectures will be recorded and released under a permissive licence; the accompanying hands-on exercises will run, unchanged, on GPU clusters and Versal test-beds hosted at CERN.
Speaker: Felice Pantaleo (CERN) -
137
Courses and best practices for ML development and operations (MLOps)
With the number of teams working on machine learning at CERN increasing, one of the top requests has been to develop a set of introductory and more advanced courses on the available ML platforms and tools.
This initiative should focus on developing the required content, engaging a large enough number of trainers and establishing a structure where these courses can be generally available for newcomers as well as on request by different teams at CERN.
Speakers: Raulian-Ionut Chiorescu, Ricardo Rocha (CERN) -
138
mPP tutorials
Between 2018 and 2023, a series of ML-related tutorials was organised as part of the mPP project. This included training on hls4ml, TensorFlow, PyTorch, neuromorphic computing, Quantum Machine Learning, etc. The average attendance exceeded 100 people per event. We propose to restart this effort through a funding programme that would provide resources to invite speakers, provide access to cloud resources, etc.
Speaker: Maurizio Pierini (CERN) -
139
AI sprint at CERN
We propose an interdisciplinary applied ML sprint for CERN. During this eight-week sprint, four-person teams of Master's and PhD students (2 domain experts + 2 ML specialists) work on a tightly defined, CERN-identified problem, guided by two supervisors (one CERN domain expert and one external ML supervisor). The set of projects should be well defined and should serve as an incubator for early-stage exploratory applications of ML to different problems at CERN.
Speaker: Dolores Garcia (CERN)
-
136
-
-
-
140
Speakers: Andreas Salzburger (CERN), Noemi Calace (CERN)
-
140
-
-
141
Speakers: Lorenzo Moneta (CERN), Maurizio Pierini (CERN), Dr Sofia Vallecorsa (CERN)
-
Cutting Edge AI for Offline Data Processing: Summary 500/1-001 - Main Auditorium
-
142
Global Event Interpretation
Speaker: Maurizio Pierini (CERN)
- 143
- 144
- 145
-
142
-
10:10
-
146
Optimal AI deployment for Online Data Processing 500/1-001 - Main Auditorium
Speaker: Sioni Paris Summers (CERN)
-
147
AI for metadata analysis 500/1-001 - Main Auditorium
Speaker: Maurizio Pierini (CERN)
-
148
Training and Education 500/1-001 - Main Auditorium
Speaker: Felice Pantaleo (CERN)
-
149
Large Language Models-based assistants 500/1-001 - Main Auditorium
Speaker: Maurizio Pierini (CERN)
-
150
-
151
Experimental Technologies 500/1-001 - Main Auditorium
Speaker: Dr Sofia Vallecorsa (CERN)
-
152
Software and Hardware Infrastructure 500/1-001 - Main Auditorium
Speaker: Ricardo Rocha (CERN)
-
153
-
141