Hamburg, Germany
23rd International Workshop on Advanced Computing and Analysis Techniques in Physics Research
The 23rd International Workshop on Advanced Computing and Analysis Techniques in Physics Research (ACAT 2025) will take place from Monday, 8th to Friday, 12th September 2025 at the University of Hamburg downtown campus. ACAT 2025 will be jointly organised by DESY and the University of Hamburg.
The 23rd edition of ACAT will — once again — bring together computational experts from a wide range of disciplines, including particle-, nuclear-, astro-, and accelerator-physics as well as high performance computing. Through this unique forum, we will explore the areas where these disciplines overlap with computer science, fostering the exchange of ideas related to cutting-edge computing, data-analysis, and theoretical-calculation technologies.
Transforming the Scientific Process: AI at the Heart of Theory, Experiment, and Computation in High-Energy and Nuclear Physics
The scientific process in high-energy and nuclear physics is undergoing a profound transformation, driven by the integration of artificial intelligence across all facets of research. On the theoretical front, AI is unlocking new ways to bridge the gap between experimental data and fundamental insights. By tackling complex inverse problems and enhancing predictive models, AI tools are empowering physicists to better map experimental results to theoretical parameters and accelerate joint experimental-theoretical analysis, leading to a deeper understanding of the universe's most fundamental forces.
In experiments, AI is pushing the boundaries of precision and sensitivity. Whether it's improving data reconstruction, refining object identification and classification, or advancing final calibrations, AI is revolutionizing how experiments are conducted and analyzed. The incorporation of AI-driven uncertainty quantification ensures more reliable results, while innovative workflows streamline processes, enabling faster and more accurate discoveries.
This transformation is underpinned by advancements in computational methods, where resource-aware AI models are rising to the challenge of operating in constrained environments like ASICs and FPGAs. AI-powered autonomous systems are enabling smarter control of experimental setups, from detectors to accelerators. Digital twins and robust co-design strategies are fostering trust in AI-based decision-making, paving the way for seamless integration of computational and experimental systems.
Together, these developments tell a story of a field redefined by AI—a cohesive interplay of theory, experimentation, and computation working in harmony to transform the scientific process for a new era of discovery.
Conference Picture
Abstract submission
Abstract submission is closed.
Registration
Registration is now closed.
Early student fee: €280.00
Early regular fee: €420.00
Student fee: €380.00
Regular fee: €550.00
Early bird rates apply to all registrations completed, including payment, by 31 July 2025.
Please follow these steps to register for ACAT:
1) Fill out the registration form on Indico.
2) Follow the link to the payment setup on Converia.
3) Your registration will be marked 'Complete' by us once payment has been processed.
Remote Participation
There will be no option to remotely present talks or posters at ACAT 2025. However, we will provide remote listening access to all plenary and parallel talks via Zoom, free of charge, on a best-effort basis.
See https://indico.cern.ch/event/1585309/ for more information.
More Info
You can sign up for general announcement notifications from acat-info@cern.ch by sending an email to acat-loc2025@cern.ch! This list is low traffic and will only get you ACAT conference announcements and general information (for this and future conferences in the ACAT series).
Many people are working together to bring you this conference! The organization page has some details. David Britton is the chair of the International Advisory Committee. Jennifer Ngadiuba and Chiara Signorile-Signorile are the chair and co-chair of the Scientific Program Committee, respectively. Gregor Kasieczka is the chair of the Local Organizing Committee.
08:00
Registration: Foyer, University Main Building, Edmund-Siemers-Allee 1
Plenary: ESA A
Conveners: chair: Chiara Signorile, co-chair: Jennifer Ngadiuba
1
Opening
Speaker: Gregor Kasieczka (Hamburg University (DE))
-
2
Advances in Model-Agnostic Searches for New Physics at the Large Hadron Collider
Speaker: Mikael Kuusela (Carnegie Mellon University (US))
-
3
Computational Tools for Dark Matter in Particle Physics and Astrophysics
Speaker: Genevieve Belanger
-
4
Transformers for scattering amplitudes computation
Speaker: Lance Dixon
-
-
Poster session with coffee break: Group 1 ESA W 'West Wing'
-
5
A web-based job and data management system for the HERD experiment
The High Energy cosmic-Radiation Detection (HERD) facility is a space astronomy and particle astrophysics experiment planned to be installed on the China Space Station. HERD is a China-led mission with Italy leading key European contributions. Its primary scientific goals include detecting dark matter in cosmic space, precisely measuring the energy spectrum and composition of cosmic rays, and conducting all-sky observation of high-energy gamma rays.
To meet these scientific objectives, the experiment demands a vast amount of storage and computing resources. Moreover, extremely large simulated data sets are required to study the performance of the detector, and these data sets may be distributed across China or Europe. In response, we have developed a web-based job and data management system (DMS). This system enables scientists to submit job requests transparently via web pages. Administrators, on the other hand, can evaluate these requests in light of resource utilization and assign jobs to the most suitable sites. This contribution will provide a comprehensive overview of the design and implementation details of DMS.
In the future, we plan to investigate an automatic decision-making function to achieve intelligent resource allocation.
Speaker: Dr Wenshuai Wang (Institute of High Energy Physics) -
6
Accelerating Detector Alignment Calibration with Real-Time Machine Learning on Versal ACAP Devices
Modern beam telescopes play a crucial role in high-energy physics experiments to precisely track particle interactions. Accurate alignment of detector elements in real time is essential to maintain the integrity of reconstructed particle trajectories, especially in high-rate environments such as the ATLAS experiment at the Large Hadron Collider (LHC). Any misalignment of the detector geometry can introduce systematic biases and potentially affect the accuracy of precision physics measurements. Current calibration systems that correct for these effects require substantial computational resources, lead to high operational costs, and are often unable to handle rapidly changing conditions, again resulting in systematic inaccuracies and potential biases in physics measurements.
To address these challenges, we propose a calibration system that employs a lightweight neural network to predict the misalignment of the detectors in real time. Our approach uses a multilayer perceptron (MLP) combined with a hierarchical subset solver for deployment on heterogeneous computing platforms. The neural network predicts detector misalignments based on the detector's current positional data and statistical characteristics of particle trajectories.
This approach leverages ML to predict control parameters in real time, allowing adaptation to complex nonlinear behaviors. However, it introduces a significant computational workload, as the optimization process involves frequent and dense matrix multiplications for gradient-based updates, which makes efficient hardware acceleration essential. By partitioning the application based on its computational characteristics, leveraging CPUs for sequential tasks, FPGAs for parallel workloads, and AI engine cores for fast and energy-efficient compute, we can achieve a balance of performance and cost. Deploying the algorithm on a heterogeneous computing device built around state-of-the-art ML-focused silicon could achieve a cost-efficient implementation of detector geometry calibration at real-time latency. This work is a step towards AI-driven real-time computing for future high-energy physics experiments on a Versal ACAP architecture, offering significant improvements in computational speed, resource utilization, and cost per watt.
Speaker: Akshay Malige (Brookhaven National Laboratory (US)) -
7
Advancing Awkward Arrays for High-Performance CPU and GPU Processing
Awkward Array provides efficient handling of large, irregular data structures in Python, playing a key role in high-energy physics analysis. This work presents ongoing efforts to optimize Awkward Arrays for GPUs using CUDA, aiming to achieve performance parity with or surpass CPU kernel implementations. Key improvements focus on optimized memory management, leveraging CUDA-specific features, and maximizing parallelism to enhance throughput. These advancements enable faster and more scalable data processing, particularly for HL-LHC data analysis within the Python ecosystem. We will discuss the challenges, solutions, and performance benchmarks.
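As a rough user-level sketch of the columnar GPU workflow described above (assuming an Awkward Array v2 installation with the optional CUDA/CuPy backend; this is not the authors' benchmark code):

```python
import awkward as ak

# A small ragged (jagged) array of per-event quantities, e.g. jet pTs.
events = ak.Array([[101.3, 55.2], [], [78.9, 33.1, 12.5]])

# Move the array to the GPU backend (requires a CUDA-enabled installation
# of Awkward Array and CuPy); computations then dispatch to CUDA kernels.
events_gpu = ak.to_backend(events, "cuda")

# Typical columnar operations run unchanged on either backend.
n_jets = ak.num(events_gpu, axis=1)   # jets per event
sum_pt = ak.sum(events_gpu, axis=1)   # scalar sum of pT per event

# Bring results back to the CPU backend for further processing.
print(ak.to_backend(n_jets, "cpu").tolist())
print(ak.to_backend(sum_pt, "cpu").tolist())
```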
Speaker: Ianna Osborne (Princeton University) -
8
AIDO: An End-to-end Detector Optimization Framework using Diffusion Models
The design of modern high-energy physics detectors is a highly intricate process, aiming to maximize their physics potential while balancing various manufacturing constraints. As detectors become larger and more sophisticated, it becomes increasingly difficult to maintain a comprehensive understanding of the entire system. To address this challenge, we aim to translate the design process into an optimization task suitable for Machine Learning by treating the parameters of the simulation as hyper-parameters of the model.
The AIDO framework is a generalized tool for the optimization of continuous and discrete detector parameters. We train a diffusion-based surrogate model on parallel Geant4 simulations with varying detector geometries, enabling the model to interpolate the expected performance across different configurations. This allows for gradient descent on the generated parameter space and identification of the optimal combination of parameters that maximizes a specific physics goal. As a demonstration, we show how this approach can be applied to generate an optimal sampling calorimeter by maximizing its energy resolution starting from a random initial composition.
Speaker: Kylian Schmidt (KIT - Karlsruhe Institute of Technology (DE)) -
9
Applying Transaction Processing Techniques to Large Scale Analysis
Data analysts working with large datasets require absolute certainty that each file is processed exactly once. ServiceX addresses this challenge by using well-established transaction processing architectures. The system implements a fully transactional workflow powered by PostgreSQL and RabbitMQ, ensuring data integrity throughout the processing pipeline. This presentation details both the infrastructure design that enables these transactional guarantees and the techniques we used to identify and remediate intermittent transaction leaks, resulting in a reliable system that operates consistently at scale.
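A generic sketch of the exactly-once bookkeeping pattern that such transactional designs rely on, assuming psycopg2 and a hypothetical processed_files table; this is illustrative only and not ServiceX code:

```python
import psycopg2

# Assumes a table such as:
#   CREATE TABLE processed_files (file_id TEXT PRIMARY KEY, status TEXT);
conn = psycopg2.connect("dbname=bookkeeping")

def mark_processed(file_id: str) -> bool:
    """Record a file inside a transaction; return False if it was already done."""
    with conn:                      # commits on success, rolls back on error
        with conn.cursor() as cur:
            cur.execute(
                "INSERT INTO processed_files (file_id, status) "
                "VALUES (%s, 'done') ON CONFLICT (file_id) DO NOTHING",
                (file_id,),
            )
            return cur.rowcount == 1   # 0 rows inserted => duplicate delivery

# A message consumer would only acknowledge the queue message after
# mark_processed() has committed, so re-deliveries are harmless.
```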
Speaker: Benjamin Galewsky (Univ. Illinois at Urbana Champaign (US)) -
10
Automating the CMS ECAL calibration workflows for optimal performance in LHC Run 3
Many physics analyses using the CMS detector at the LHC require accurate, high resolution electron and photon energy measurements. The CMS electromagnetic calorimeter (ECAL) is a fundamental component of these analyses. The excellent resolution of ECAL was of central importance to the discovery of the Higgs boson in 2012, and is being used for increasingly precise measurements of Higgs boson properties, as well as Standard Model measurements, and searches for new physics that contain electromagnetic particles and jets in the final state.
Maintaining the optimal ECAL performance during LHC operation relies on the precise calibration of its energy and timing response. This is achieved via dedicated calibration workflows, using physics events, and a dedicated laser monitoring system. With the increased luminosity delivered during LHC Run 3 (2022+), detector aging effects have increased, requiring more frequent calibrations. To reduce the time needed for this task, a new system has been developed to automatically execute the calibration workflows. This new development is intended both to improve the quality of the reconstructed data (by facilitating more frequent updates) and to reduce the time and workload needed to provide the optimal calibrations for physics analyses, the latter of which were previously obtained at the end of each data-taking year.
The new system is based on industry-standard tools (OpenShift, Jenkins, InfluxDB, and Grafana) for workflow automation and monitoring. The system architecture was previously presented during ACAT 2022. In this presentation we focus on recent developments and improvements, and the operational experience gained with this system during Run 3. Particular focus will be given to the improved performance obtained during the 2024 run, where the automated system was expanded to encompass the full range of energy calibration, signal pulse template, timing and detector alignment workflows.
Speakers: CMS Collaboration, Thomas Reis (Science and Technology Facilities Council STFC (GB)) -
11
b2luigi - bringing batch 2 luigi
Workflow Management Systems (WMSs) are essential tools for structuring an arbitrary sequence of tasks in a clear, maintainable, and repeatable way. The popular Python-based WMS luigi helps build complex workflows. It handles task dependency resolution and input/output tracking, as well as providing a simple workflow visualisation and a convenient command-line integration.
The extension b2luigi, which is designed to be a drop-in replacement of luigi, offers easy integration with batch systems such as HTCondor, LSF, Slurm, and the WLCG, allowing the combination of heterogeneous tasks and systems within a single workflow. Furthermore, b2luigi provides additional interfaces tailored for interactions with the Belle II analysis software framework and the Belle II distributed computing tools.
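A minimal sketch of the kind of batch-aware task b2luigi supports, following the patterns in its documentation (the setting names and batch configuration are assumptions and will differ per site):

```python
import json
import b2luigi

class FitTask(b2luigi.Task):
    # Hypothetical parameter; each value becomes an independent batch job.
    seed = b2luigi.IntParameter()

    def output(self):
        yield self.add_to_output("fit_result.json")

    def run(self):
        result = {"seed": self.seed, "value": 42.0}   # placeholder payload
        with open(self.get_output_file_name("fit_result.json"), "w") as f:
            json.dump(result, f)

if __name__ == "__main__":
    # Route the tasks to a batch system instead of running them locally.
    b2luigi.set_setting("batch_system", "htcondor")
    b2luigi.process([FitTask(seed=s) for s in range(10)], workers=10)
```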
In November 2023, the Belle II collaboration took over the development of b2luigi. Since then, several new features have been introduced, such as the ability to run tasks with Apptainer, the support for the Slurm batch system, and the capacity to define targets using the XRootD protocol. The documentation has also been extended, including a step-by-step tutorial covering all the main features. The b2luigi package has become essential not only for physics analyses, but also for managing a variety of complex workflows. These include software release validation, data reprocessing, detector calibration, and the derivation of systematic corrections for analyses.
In this contribution, we present an overview of the current status of the b2luigi project, highlighting recent developments, new features, and the deployment within the Belle II collaboration. Furthermore, we discuss the adoption and application of b2luigi beyond the Belle II collaboration, demonstrating its versatility and broader relevance to the high-energy physics community.
Speaker: Giacomo De Pietro (Karlsruhe Institute of Technology) -
12
BitPacket: A C++ Library for Dynamic Binary Decoding
At INAF (Istituto Nazionale di Astrofisica), in the context of the AGILE mission (Astro-Rivelatore Gamma a Immagini Leggero), we developed PacketLib, an open-source C++ software library designed for building applications that handle satellite telemetry source packets, provided they comply with the CCSDS Telemetry and Telecommand Standards.
As part of the ASTRI (Astrofisica con Specchi a Tecnologia Replicante Italiana) project, the need arose to modernize this approach to support the acquisition and decoding of binary data generated by the Cherenkov camera. From this requirement, BitPacket was born.
BitPacket is a lightweight C++ library designed for the dynamic parsing of acquired binary data from streams with configurable field lengths defined at the bit level. BitPacket enables users to decode structured binary formats using a simple, human-readable JSON schema that specifies field names and their bit-lengths.
BitPacket allows runtime selection of the parsing layout supporting dynamic schema switching: the user can interpret incoming binary streams with a generic format, detect a specialization discriminator field, and switch to a more specialized schema on the fly, enabling multi-layer decoding.
Binary data can originate from various sources, such as files and TCP streams.
An intuitive interface provides access to parsed fields, and users can define wrapper classes to post-process or access the fields semantically. The actual type conversion is handled externally, allowing the parser to remain agnostic and purely structural.
This contribution presents the design and implementation of the BitPacket library, providing an assessment of its capabilities and performance. The evaluation is conducted in the context of the Array Data Acquisition System (ADAS) of the ASTRI Mini-Array, where BitPacket has been successfully employed to decode and analyze the Cherenkov camera data in real time.
Speaker: Valerio Pastore (INAF) -
13
Bridging Workload and Data Management in CMS: Leveraging WMCore Microservices and Rucio Integration
The CMS Experiment had to manage and process data volumes approaching the exascale during the LHC Run 3. This required a seamless synergy between the workload and data management systems, namely WMCore and Rucio. Following the integration of Rucio into the CMS infrastructure, the workload management system has undergone substantial adaptations to harness new data management capabilities in order to streamline data placement, access and cleanup strategies. Moreover, computing operations tasks have been integrated into the workload management system to enhance automation, efficiency and reliability. This contribution delves into this transformative process, detailing the strategic plan and achievements of introducing microservices to gradually improve the interplay between workload and data management in CMS.
Speakers: Alan Malta Rodrigues (University of Notre Dame (US)), CMS Collaboration -
14
Charged Particle Tracking in Drift Chambers Using Reinforcement Learning
Charged particle tracking in drift chambers is a challenging task in high-energy physics. In this work, we propose applying reinforcement learning (RL) to the reconstruction of particle trajectories in drift chambers. By framing the tracking problem as a decision-making process, RL enables the development of more efficient and adaptive tracking algorithms. This approach offers improved performance and flexibility in optimizing end-to-end tracking algorithms for drift chambers.
Speaker: Yao Zhang -
15
ColliderML: The First Release of an OpenDataDetector High-Luminosity Physics Benchmark Dataset
Particle physics is a field hungry for high quality simulation, to match the precision with which data is gathered at collider experiments such as the Large Hadron Collider (LHC). The computational demands of full detector simulation often lead to the use of faster but less realistic parameterizations, potentially compromising the sensitivity, generalizability, and robustness of downstream machine learning (ML) models. To address this, we introduce the OpenDataDetector High-Luminosity Physics Benchmark Dataset 2025, aka “ColliderML”. It includes O(1 million) realistically simulated and digitised high-pileup collision events, across O(10) important SM and BSM channels. A variety of objects are available, from energy deposit information in the tracker and calorimeters, up to reconstructed tracks and jets, as well as a large dataset of particle gun simulations. The OpenDataDetector geometry itself provides a realistic combination of several next-generation detector technologies.
To demonstrate ColliderML's utility, we showcase multiple machine learning benchmarks that rigorously evaluate the performance and behavior of ML models trained under diverse collider conditions. These evaluations specifically examine critical ML aspects such as generalizability between fast and full simulation and across physics channels, the benefits of low-level and full-detector features, and robustness in handling complex and noisy collider data. Additionally, we provide an intuitive accompanying software library, streamlining dataset access and manipulation. As we find large ML models plateauing in performance on high-level physics objects, we propose ColliderML as an essential tool in exploring the next generation of ML on low-level collider data.
Speaker: Daniel Thomas Murnane (Niels Bohr Institute, University of Copenhagen) -
16
Computing the QED corrections to the Coulomb potential: an example of a 6-loop 2-scale calculation
QED in the classical Coulomb field of a nucleus serves as a good approximation for obtaining high-precision results in atomic physics. The external field is subject to radiative corrections. These corrections can be expressed perturbatively in terms of QED Feynman diagrams containing scalar-like propagators for the external field together with the usual QED propagators.
A calculation of these corrections of orders $\alpha^2(Z\alpha)^3$ and $\alpha^2(Z\alpha)^5$ will be presented; here, $Z$ is the nucleus charge, and $Z\alpha$ is treated as a separate variable. The calculation uses the QED Feynman diagrams directly and determines the corrected potential as a function of the external momentum, evaluated at discrete points. The diagrams incorporate two independent mass scales - the electron mass and the external momentum - and can have up to six independent loops.
Such high-order calculations would be impossible without a specialized method, which will be briefly explained. Special attention will be paid to the following points:
1. The divergence removal and renormalization are performed in Feynman parametric space, point by point, before integration; dimensional regularization is never used.
2. A special nonadaptive Monte Carlo integration algorithm is used to make the results accurate and to avoid the numerical instability caused by the presence of significantly different scales.
The calculation method is a modification of the one used by the author for the 5-loop electron anomalous magnetic moment calculation.
The results will be compared to the previously known ones, and the implication for high-precision atomic physics will be discussed.
Speaker: Sergey Volkov -
17
Deep Learning Algorithm for dN/dx in a Pixelated TPC
Particle identification (PID) plays a crucial role in particle physics experiments. A groundbreaking advancement in PID involves cluster counting (dN/dx), which measures primary ionizations along a particle’s trajectory within a pixelated time projection chamber (TPC), as opposed to conventional dE/dx measurements. A pixelated TPC with a pixel size of 0.5 × 0.5 mm² has been proposed as the gaseous detector for the Circular Electron Positron Collider (CEPC) to achieve exceptional hadron identification, which is particularly vital for flavor physics studies.
One of the major challenges in dN/dx lies in the development of an efficient reconstruction algorithm capable of extracting cluster signals from 2D pixel readouts. Machine learning algorithms have emerged as state-of-the-art solutions for PID. To address this challenge, we have designed a sophisticated simulation software framework that incorporates detector geometry, gas ionization, electron drift and diffusion, signal amplification, and pixel readout to generate large datasets. A deep learning algorithm tailored for point cloud data has been developed, utilizing a graph neural network implementation of the point transformer. By training the neural network on a substantial dataset of simulated events, the particle separation power is improved by 15% to 30% for pions and kaons within a momentum range of 2.5 to 20.0 GeV/c, compared to the traditional dN/dx reconstruction algorithm.
Speaker: Dr Guang Zhao (Institute of High Energy Physics (CAS)) -
18
Detector and Event Visualization in JUNO
Multiple visualization methods have been implemented in the Jiangmen Underground Neutrino Observatory (JUNO) experiment and its satellite experiment JUNO-TAO. These methods include event display software developed based on ROOT and Unity. The former is developed based on the JUNO offline software system and ROOT EVE, which provides an intuitive way for users to observe the detector geometry, connect with the online DAQ system for monitoring, tune the reconstruction algorithm, and analyze the physics events. The latter is event display software developed based on Unity, which offers better display effects, local operation, and multi-platform support. This report will introduce the design framework and effects of the event display software of JUNO, list the advantages of each, and introduce the future development of JUNO visualization methods, including visualization based on Phoenix and VR.
Speaker: Minghua Liao (Sun Yat-Sen University (CN)) -
19
Detector and event visualization software for CEPC
Detector and event visualization software is essential for modern high-energy physics (HEP) experiments. It plays an important role in the whole life cycle of any HEP experiment, from detector design, simulation, reconstruction, detector construction and installation, to data quality monitoring, physics data analysis, education and outreach. In this talk, we will discuss two frameworks and their potential for developing visualization software for CEPC. One is the Phoenix framework, which is based on a JavaScript 3D library for web-based event display. The other is based on Unity, a popular industrial platform for game development and immersive experience creation. The applications of both frameworks in HEP experiments will also be introduced.
Speaker: Mr Yujie Zeng (Sun Yat-Sen University (CN)) -
20
Developing a simulation-based inference workflow in RooFit for analyses of semi-leptonic decays at LHCb
Simulation-based inference (SBI) is a set of statistical inference approaches in which Machine Learning (ML) algorithms are trained to approximate likelihood ratios. It has been shown to provide an alternative to the likelihood fits commonly performed in HEP analyses. SBI is particularly attractive in analyses performed over many dimensions, in which binning data would be computationally infeasible or would result in a loss of sensitivity.
In this work, SBI is applied to extract parameters of interest from the kinematic and angular distributions of $B^0 \to D^{\ast-}\mu^{+}\nu_\mu$ decays at LHCb, in pseudodata samples generated with RapidSim representative of the datasets used in LHCb analysis. The SBI fit is constructed using the RooFit framework, to which enhanced Python interfaces were recently introduced. Dense Neural Networks (DNNs) were trained to distinguish between the Standard Model and New Physics scenarios for varying parameters of interest. This workflow also incorporates the automatic differentiation (AD) of learned likelihoods, using the ROOT SOFIE framework to generate C++ code from the DNNs, from which gradient code is generated by source-code transformation AD with the Clad compiler plugin.
In this contribution, the SBI fit is compared to an equivalent template-based likelihood fit reflecting the current state of the art. These fits are compared both in terms of statistical sensitivity and computational performance. Additionally, this contribution presents a direct application of AD to physics analysis.
Speaker: Jamie Gooding (Technische Universitaet Dortmund (DE)) -
21
End-to-End MDC Track Reconstruction using Graph Neural Networks at BESIII
We present an end-to-end track reconstruction algorithm based on Graph Neural Networks (GNNs) for the main drift chamber of the BESIII experiment at the BEPCII collider. The algorithm directly processes detector hits as input to simultaneously predict the number of track candidates and their kinematic properties in each event. By incorporating physical constraints into the model, the reconstruction efficiency achieves parity with or surpasses traditional methods. Further improvements are anticipated as the research progresses.
Speaker: Liyan Qian (Chinese Academy of Sciences (CN)) -
22
evermore: Differentiable Binned Likelihood Functions with JAX
Statistical analyses in high energy physics often rely on likelihood functions of binned data. These likelihood functions can then be used for the calculation of test statistics in order to assess the statistical significance of a measurement.
evermore is a Python package for building and evaluating these likelihood functions using JAX, a powerful Python library for high-performance numerical computing. The key concepts of evermore are performance and differentiability. JAX provides automatic differentiation, just-in-time (jit) compilation, and vectorization capabilities, which can be leveraged to improve the performance of statistical analyses. Jit compilation and vectorization can be used to parallelize fits on GPUs, which is especially advantageous for likelihood scans and toy-based upper limits.
We present the concepts of evermore, show its features, and give concrete examples of its performance in the context of a CMS analysis.
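As a plain-JAX illustration of the kind of binned Poisson likelihood such a package evaluates (this is not evermore's actual API; the two-parameter model below is a made-up example):

```python
import jax
import jax.numpy as jnp

obs    = jnp.array([12.0, 11.0, 7.0, 3.0])   # observed counts per bin
signal = jnp.array([2.0, 3.0, 2.0, 1.0])     # nominal signal template
bkg    = jnp.array([10.0, 8.0, 5.0, 2.0])    # nominal background template

def nll(params):
    mu, theta = params
    # Expected counts: signal strength mu plus a 10% background normalisation
    # nuisance parameter theta with a Gaussian constraint term.
    expected = mu * signal + bkg * jnp.exp(0.1 * theta)
    poisson = jnp.sum(expected - obs * jnp.log(expected))
    constraint = 0.5 * theta**2
    return poisson + constraint

# Automatic differentiation and jit-compilation come for free from JAX;
# vmap could map the same fit over many toy datasets in parallel.
grad_nll = jax.jit(jax.grad(nll))
print(nll(jnp.array([1.0, 0.0])), grad_nll(jnp.array([1.0, 0.0])))
```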
Speakers: CMS Collaboration, Felix Philipp Zinn (Rheinisch Westfaelische Tech. Hoch. (DE)) -
23
Expected Tracking Performance of the ATLAS Inner Tracker at the High-Luminosity LHC
A new all-silicon tracking detector (ITk) for ATLAS is under construction to meet the demands of the HL-LHC, and updated track reconstruction performance predictions have recently been published by the ATLAS Collaboration. The new detector, designed to operate at an average of up to 200 simultaneous proton–proton interactions, will provide a wider pseudorapidity coverage, an increased granularity and will outperform the current ATLAS Inner Detector. In this contribution the expected performance of the ITk detector will be presented, with emphasis on the improvements in track reconstruction. The expected improvements for downstream domains will also be discussed, such as flavor tagging, electron and photon reconstruction.
Speaker: Doğa Elitez (CERN) -
24
Fast Perfekt: Regression-based refinement of fast simulation
The availability of precise and accurate simulation is a limiting factor for interpreting and forecasting data in many fields of science and engineering. Often, one or more distinct simulation software applications are developed, each with a relative advantage in accuracy or speed. The quality of insights extracted from the data stands to increase if the accuracy of faster, more economical simulation could be improved to parity or near parity with more resource-intensive but accurate simulation. We present Fast Perfekt, a machine-learned regression that employs residual neural networks to refine the output of fast simulations. A deterministic morphing model is trained using a unique schedule that makes use of the ensemble-level maximum mean discrepancy (MMD) loss, with the option of an additional pair-based loss function such as the MSE. We explore this methodology in the context of an abstract analytical model and in terms of a realistic particle physics application featuring jet properties in hadron collisions at the CERN Large Hadron Collider. The refinement makes maximum use of domain knowledge, and introduces minimal computational overhead to production.
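For reference, the ensemble loss mentioned above can be estimated with a standard RBF-kernel MMD; the sketch below is a generic NumPy version, not the authors' implementation:

```python
import numpy as np

def mmd2_rbf(x, y, sigma=1.0):
    """Biased estimator of the squared maximum mean discrepancy between two
    samples x, y of shape (n, d), using a Gaussian (RBF) kernel."""
    def k(a, b):
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2.0 * sigma**2))
    return k(x, x).mean() + k(y, y).mean() - 2.0 * k(x, y).mean()

# Toy check: refined fast-sim output should move towards the full-sim sample.
rng  = np.random.default_rng(0)
full = rng.normal(0.0, 1.0, size=(500, 2))   # stand-in for accurate simulation
fast = rng.normal(0.5, 1.2, size=(500, 2))   # stand-in for fast simulation
print(mmd2_rbf(full, fast), mmd2_rbf(full, full))
```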
Speaker: Lars Stietz (Hamburg University of Technology (DE)) -
25
FPGA-Based Digital Design for DAQ and Radiation-Hard CMOS Monolithic Sensors in High-Energy Physics
The development of radiation-hard CMOS Monolithic Active Pixel Sensors (MAPS) is a key advancement for next-generation high-energy physics experiments. These sensors offer improved spatial resolution and integration capabilities but require efficient digital readout and data acquisition (DAQ) systems to operate in high-radiation environments. My research focuses on the FPGA-based digital design for sensor readout and DAQ firmware development, optimizing real-time data processing and signal integrity. A critical aspect is ensuring the radiation tolerance of the electronics, addressing noise suppression and timing performance. As a first-year PhD student, I am beginning with theoretical and software-based digital design, gradually moving towards practical FPGA implementation. My motivation is to contribute to the development of robust and efficient detector electronics that can be used in future particle physics experiments, space applications, and medical imaging.
Speaker: Mr Sami Ullah Khan (University of Padua & INFN Turin) -
26
From bins to flows: a neural network approach to unbinned data-simulation corrections
Precise simulation-to-data corrections, encapsulated in scale factors, are crucial for achieving high precision in physics measurements at the CMS experiment. Traditional methods often rely on binned approaches, which limit the exploitation of available information and require a time-consuming fitting process repeated for each bin. This work presents a novel approach utilizing modern probabilistic machine learning techniques to compute multivariate and unbinned scale factors for CMS objects. A PyTorch-based likelihood function is developed, incorporating Normalizing Flows for signal and background distributions from simulation, and for conditional kinematic variable modeling. Neural networks are employed to parametrize data-simulation discrepancies such as detector efficiencies and variable transformations. Continuous scale factors are obtained by performing an unbinned maximum likelihood fit on data. By minimizing binning biases and improving the scale factor representation, these machine learning methods exploit more observables and their correlations, with the potential to improve the precision of physics results.
Speakers: CMS Collaboration, Davide Valsecchi (ETH Zurich (CH)) -
27
Generalizing GANplification
As generative models start taking an increasingly prominent role in both particle physics and everyday life, quantifying the statistical power and expressiveness of such generative models becomes a more and more pressing question.
In past work, we have seen that a generative model can, in fact, be used to generate samples beyond the initial training data. However, the exact quantification of the amplification factor between the statistical power of the original training data and the statistical power of generated samples has so far had to rely on knowledge of the true distribution in some form.
We present a new approach to predicting the amplification factor of a generative model which does not require knowledge of the true distribution, and demonstrate it both on meaningful constructed examples and on relevant physics datasets.
Speaker: Sascha Diefenbacher (Lawrence Berkeley National Lab. (US)) -
28
Generative Language Model for Simulating Particles Interacting with Matter
The simulation of particle interactions with detectors plays a critical role in understanding detector performance and optimizing physics analysis. Without guidance from first-principle theory, the current state-of-the-art simulation tool, Geant4, exploits phenomenology-inspired parametric models, which must be combined and carefully tuned to experimental observations. The tuning process, even with the help of semi-automatic tools like Professor, is laborious.
Generative language models have shown outstanding performance in predicting the next token for a given prompt. Their capability to learn complex language patterns can potentially be leveraged to learn particle interactions from experimental data.
We introduce a language-model-based framework for simulating particle detectors. In this framework, the particle information and detector hits are tokenized into discrete numbers, and a transformer is trained to learn the statistical correlations between the incoming particles and outgoing detector hits. Instead of directly predicting the detector hits, the transformer predicts the outgoing tokens, which can then be detokenized into detector hits. Our approach replaces the regression task with a multiclass classification task, at which Transformers perform much better.
In addition to the introduction of a simulation framework, our contribution includes the introduction of a point cloud data-oriented particle tokenizer and a pre-trained GPT-like model for simulating particles interacting with detector materials.
Speakers: Jay Chan (Lawrence Berkeley National Lab. (US)), Xiangyang Ju (Lawrence Berkeley National Lab. (US)) -
29
GNN based noise filtering algorithm for tracking on STCF experiment
Track reconstruction is one of the most important and challenging tasks in the offline data processing of collider experiments. The Super Tau-Charm Facility (STCF) is a next-generation electron-positron collider running in the tau-charm energy region proposed in China, where conventional track reconstruction methods face enormous challenges from the higher background environment introduced by the higher collision luminosity.
In this contribution, we demonstrate a novel hybrid tracking algorithm based on Graph Neural Network (GNN) method and traditional methods for the STCF drift chamber. In the GNN method, a hit pattern map representing the connectivity between drift cells is constructed considering the geometrical layout of the sense wires, based on which we design an optimal graph construction method, then an edge-classifying graph neural network is trained to distinguish the hit-on-track from noise hits. Finally, the result after the noise filtering is integrated into the traditional tracking software where a track-finding algorithm based on the Hough transform is performed and a track-fitting algorithm based on GENFIT is used to obtain the track parameters.
Preliminary results based on the STCF MC sample, considering different background conditions, show promising performance, with increased tracking efficiency, especially for tracks with low momentum (< 600 MeV/c), and a reduced track fake rate compared to the traditional tracking method. Furthermore, the GNN-based noise filtering algorithm can also potentially be applied to other collider experiments with similar drift-chamber-based trackers.
Speaker: Xiaoshuai Qin (Shandong University (CN)) -
30
GPU graphs in the CMS software and alpaka
The CMS experiment requires massive computational resources to efficiently process the large amounts of detector data. The computational demands are expected to grow as the LHC enters the high-luminosity era. Therefore, GPUs will play a crucial role in accelerating these workloads. The approach currently used in the CMS software (CMSSW) submits individual tasks to the GPU execution queues, accumulating overhead as more tasks are submitted and using up more CPU resources.
CUDA and HIP graphs are a different task submission approach in which a set of GPU operations is grouped together and connected by dependencies, creating a directed task graph that can later be executed as many times as required. This provides performance advantages over submitting individual tasks, since a graph execution submits all GPU operations at once, reducing the launch overhead and freeing up CPU resources for other tasks.
A set of realistic tests that simulate different aspects of the CMS software was developed to measure the impact of using graphs, and the results were evaluated on different NVIDIA and AMD GPUs. Based on these results, work is ongoing to implement support for task graphs in alpaka, a performance-portable parallel programming framework used in the CMS software, to ensure efficient task submission and scheduling across different hardware architectures.
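The graph-capture idea can be sketched in Python through CuPy's stream-capture interface (an assumption made for illustration; CMSSW and alpaka target the native CUDA/HIP APIs):

```python
import cupy as cp

x = cp.random.random(1_000_000, dtype=cp.float32)
y = cp.empty_like(x)

stream = cp.cuda.Stream(non_blocking=True)
with stream:
    stream.begin_capture()            # record the work, do not execute yet
    cp.sqrt(x, out=y)                 # kernel 1
    cp.multiply(y, 2.0, out=y)        # kernel 2
    cp.add(y, 1.0, out=y)             # kernel 3
    graph = stream.end_capture()      # the recorded work becomes a task graph

# Relaunching the whole graph is a single submission instead of one launch
# call per kernel, which is where the overhead saving comes from.
for _ in range(100):
    graph.launch(stream=stream)
stream.synchronize()
```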
Speakers: Abdulrahman Al Marzouqi (University of Bahrain (BH)), CMS Collaboration -
31
GPU unified memory in the CMS software and alpaka
The CMS experiment requires massive computational resources to efficiently process the large amounts of detector data. The computational demands are expected to grow as the LHC enters the high-luminosity era. Therefore, GPUs will play a crucial role in accelerating these workloads. The approach currently used in the CMS software (CMSSW) relies on explicit memory management techniques, where data must be manually copied between CPU and GPU memory, leading to increased complexity and requiring careful synchronization to avoid performance bottlenecks.
Unified memory addresses this limitation. It simplifies working with complex data structures by automatically handling memory transfers between CPU and GPU without requiring explicit pointer adjustments. In addition to page fault handling, hardware prefetching, and automatic data migration, the latest generations of NVIDIA and AMD GPUs provide hardware features that further optimise unified memory, like cache-coherent access to the host memory or even a single unified memory pool. As GPUs evolve, unified memory is becoming more significant for writing efficient heterogeneous software.
The impact of unified memory has been studied using the CLUE library employed in the CMS software. Based on these results, work is ongoing to implement support for unified memory in alpaka, a performance-portable parallel programming framework used in the CMS software, to ensure efficient task submission and scheduling across different hardware architectures.
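The unified-memory behaviour can likewise be illustrated from Python with CuPy's managed-memory allocator (again an illustrative assumption; the production code uses the native APIs through alpaka):

```python
import cupy as cp

# All subsequent CuPy allocations use CUDA managed (unified) memory: the
# driver migrates pages between host and device on demand instead of
# requiring explicit copies in user code.
cp.cuda.set_allocator(cp.cuda.malloc_managed)

a = cp.arange(10_000_000, dtype=cp.float32)   # lives in managed memory
a *= 2.0                                      # runs on the GPU

# cp.asnumpy materialises a host copy of the result for further processing.
host = cp.asnumpy(a)
print(host[:5], host.mean())
```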
Speakers: CMS Collaboration, Maria Michailidi (National and Kapodistrian University of Athens (GR)) -
32
Graph Neural Networks for event classification: A study of muon-electron scattering with silicon strip tracking
Precision measurements of particle properties, such as the leading hadronic contribution to the muon magnetic moment anomaly, offer critical tests of the Standard Model and probes for new physics. The MUonE experiment aims to achieve this through precise reconstruction of muon-electron elastic scattering events using silicon strip tracking stations and low-Z targets, while accounting for backgrounds like pair production. In this work, we present a Graph Neural Network (GNN) approach for event classification, where graph construction encodes spatial relationships among hits to capture underlying physics. For the first time, we test it on a simulated configuration featuring three tracking stations.
Speaker: Patrick Asenov (Universita & INFN Pisa (IT)) -
33
Hadron Identification based on DNN at BESIII
Machine learning (ML), a cornerstone of data science and statistical analysis, autonomously constructs hierarchical mathematical models—such as deep neural networks—to extract complex patterns and relationships from data without explicit programming. This capability enables accurate predictions and the extraction of critical insights, making ML a transformative tool across scientific disciplines.
Particle identification (PID) is a crucial aspect of most particle physics experiments. In the study of hadronic decays, efficient PID is always essential for improving signal-to-background ratios, refining physical analyses, and advancing scientific discovery. The BESIII experiment operates in the τ-charm energy region, where the final states of physical processes are frequently composed of hadronic particles. A persistent challenge in this experiment is PID performance, especially in distinguishing pions and kaons at high momenta. Conventional PID methods, which rely on measurements of ionization energy loss (dE/dx) and time-of-flight (TOF), often prove insufficient for the demands of precision physics analyses. To address this limitation, we leverage advanced ML algorithms and utilize the extensive high-dimensional measurements of the BESIII detector to optimize PID performance.
In this study, we present an advanced DNN-based PID framework, optimized through data preparation, feature engineering, architecture design, and hyperparameter tuning, to effectively integrate information from all four sub-detectors. Compared to conventional PID methods, the DNN-based algorithm demonstrates significant improvements in efficiency, particularly in enhancing pion/kaon discrimination in high-momentum regions. This advancement enables the BESIII experiment to achieve higher precision in physics measurements and provides valuable insights for similar studies in the field.
Speaker: hyuan hyuan (Institute of High Energy Physics, Chinese Academy of Sciences) -
34
High-Performance Computing Workflow for Distributed Hyperparameter Search in Medium-Sized Machine Learning Models
Machine Learning (ML) plays an important role in physics analysis in High Energy Physics. To achieve better physics performance, physicists are training larger and larger models on larger and larger datasets. Therefore, many workflow developments focus on distributed training of large ML models, inventing techniques like model pipeline parallelism. However, not all physics analyses need to train large models. On the contrary, some emerging analysis techniques like OmniFold and Neural Simulation-Based Inference (NSBI) need to train thousands of small models to quantify systematic uncertainties. At the same time, each model undergoes hyperparameter optimization under physics performance constraints. Similarly, ML-powered online hardware often favors small, performant models for data compression and intelligent data filtering, and performing extensive automated model searches is crucial for designing such intelligent hardware. These workloads present a unique challenge for developing HPC workflows.
We will present an HPC-friendly workflow that simultaneously tackles the aforementioned challenges. The workflow is applied to a realistic NSBI physics analysis using the Perlmutter platform at NERSC. The workflow design and its scaling will be presented in detail.
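A minimal sketch of the per-model hyperparameter search such a workflow has to orchestrate many times over, here using Optuna as an assumed stand-in for the actual tooling:

```python
import optuna

def objective(trial):
    # Hypothetical search space for one small model.
    n_layers = trial.suggest_int("n_layers", 1, 4)
    width    = trial.suggest_int("width", 8, 128, log=True)
    lr       = trial.suggest_float("lr", 1e-4, 1e-1, log=True)
    # Placeholder: train one small model here and return its validation loss.
    val_loss = (width - 64) ** 2 * 1e-4 + abs(lr - 1e-2) + 0.1 * n_layers
    return val_loss

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=100, n_jobs=8)   # trials run in parallel
print(study.best_params)

# On an HPC system, many such studies (one per small model) can themselves be
# launched as independent batch jobs, since each model is cheap to train.
```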
Speaker: Xiangyang Ju (Lawrence Berkeley National Lab. (US)) -
35
Improving the Automated Prompt Calibration at Belle II
The calibration of Belle II data involves two key processes: prompt calibration and reprocessing. Prompt calibration represents the initial step in continuously deriving calibration constants in a timely manner for the data collected over the previous couple of weeks. Currently, this process is managed by b2cal, a Python-based plugin built on Apache Airflow to handle calibration jobs. However, b2cal is a complex system with many interconnected components, introducing usability and maintenance challenges.
To address these limitations, a new prototype system called b2luca (b2LUigi CAlibration) is under development. Built on b2luigi, a helper package for Spotify's Luigi for scheduling large workflows on a batch system, b2luca centralizes all prompt calibration processes at the Belle II calibration center hosted at the Scientific Data and Computing Center of the Brookhaven National Laboratory (BNL). Here, all calibration tasks are executed either in parallel or sequentially, depending on their dependencies, and the results are stored in a centralized database. The system ensures robust validation.
Instead of relying on a custom web interface, b2luca leverages GitLab for managing workflows, collecting expert feedback, and tracking calibration tasks. This integration not only simplifies the workflow but also fosters collaboration through GitLab's version control and issue-tracking features.
By running all calibration tasks directly on one site and incorporating an efficient workflow scheduler, b2luca aims to provide a scalable, user-friendly, and reliable solution for managing calibration pipelines in the Belle II experiment.
Speaker: Merna Abumusabh (IPHC - Strasbourg) -
36
Internal Criteria for Goodness in Unfolding and Their Application to the Comparison of Richardson-Lucy Deconvolution and Data Unfolding with Mean Integrated Error Optimization Methods
Unfolding can be considered a procedure for estimating an unknown probability density function. Both external and internal quality assessment methods can be used for this purpose.
In some cases, external criteria exist that allow for the gauging of the quality of deconvolution. A typical example is the deconvolution of a blurred image, where the sharpness of the unblurred image can be used to assess the quality of the result. In experimental physics, it is sometimes difficult to define such external criteria, especially when a measurement has never been done before. Therefore, internal criteria for assessing the goodness of the result, which do not require any reference to external information, are needed.
In this context, internal criteria for goodness are proposed and discussed. Their application is demonstrated in the comparison of Richardson-Lucy deconvolution and a new data unfolding method based on Mean Integrated Error Optimization.
Speaker: Nikolay Gagunashvili -
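For reference, the Richardson-Lucy deconvolution referred to above iterates the standard update, with response matrix $R_{ij}$ (the probability that an event in truth bin $j$ is observed in bin $i$), measured spectrum $g_i$, and current estimate $\hat f^{(k)}_j$:

$$\hat f^{(k+1)}_j = \frac{\hat f^{(k)}_j}{\sum_i R_{ij}} \sum_i \frac{R_{ij}\, g_i}{\sum_l R_{il}\, \hat f^{(k)}_l}$$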
37
Machine Learning for $K^0$ Event Reconstruction in the LHCf Experiment
The LHCf experiment aims to study forward neutral particle production at the LHC, providing crucial data for improving hadronic interaction models used in cosmic ray physics. A key challenge in this context is the measurement of $K^0$ production, indirectly reconstructed from the four photons originating from its decay. The main difficulty in this measurement is the reconstruction of events with multiple calorimetric hits.
To address this, we developed a machine learning pipeline that employs multiple neural networks to classify and reconstruct such events. The pipeline consists of four stages: (1) event identification, determining whether an event contains four particles, (2) photon/neutron discrimination for each particle hit, (3) event tagging into four specific topologies based on the distribution of photons between the two calorimeter towers, and (4) position and energy regression for each detected photon.
The model takes as input the energy deposits in each channel of the Arm2 detector, composed of two calorimetric towers with 16 GSO scintillator layers per tower and 4 pairs of silicon microstrip detectors oriented along the x and y axes, placed at different depths in the calorimeter. The network architecture is designed to process these heterogeneous data sources, allowing for a precise reconstruction of the event topology. Preliminary results, obtained with a dataset of 10k simulated events, show that the classification networks reach over 80% accuracy in selecting relevant events and distinguishing photon/neutron interactions. These promising results highlight the potential of deep learning techniques in enhancing event reconstruction at LHCf and lay the groundwork for further improvements with larger datasets and refined models.
Speaker: Mr Andrea Paccagnella -
38
ML-Based Cluster Counting for Particle Identification
Cluster counting is a highly promising particle identification technique for drift chambers in particle physics experiments. In this paper, we trained neural network models, including a Long Short-Term Memory (LSTM) model for the peak-finding algorithm and a Convolutional Neural Network (CNN) model for the clusterization algorithm, using various hyperparameters such as loss functions, activation functions, numbers of neurons, batch sizes, and different numbers of epochs. These models were trained utilizing high performance computing (HPC) resources provided by the ReCas computing center. The best LSTM peak-finding model was selected based on the highest area under the curve (AUC) value, while the best CNN clusterization model was chosen based on the lowest mean square error (MSE) value among all configurations. The training was conducted on momentum ranges from 200 MeV to 20 GeV and 180 GeV.
The trained models (LSTM and CNN) were subsequently tested on samples with momenta of 2 GeV, 4 GeV, 6 GeV, 8 GeV, 10 GeV and 180 GeV. The simulation parameters included a gas mixture of 90% helium and 10% isobutane, a cell size of 0.8 cm, a sampling rate of 2 GHz, a time window of 400 ns, 10000 events, and a 45-degree angle between the muon track and the z-axis (sense wire) of the drift tube chamber. The testing aimed to evaluate the performance of the LSTM model for peak finding and the CNN model for clusterization.
Speaker: Muhammad Numan Anwar (Universita e INFN, Bari (IT)) -
39
Model agnostic optimisation of weakly supervised anomaly detection
Weakly supervised anomaly detection has been shown to find new physics with a high significance at low injected signal cross sections. If the right features and a robust classifier architecture are chosen, these methods are sensitive to a very broad class of signal models. However, choosing the right features and classification architecture in a model-agnostic way is a difficult task as the underlying signal versus background classification task is dominated by noise. In this work, we systematically study a number of optimisation metrics to understand which are most robust in realistic, noisy conditions. Our findings provide practical guidance for improving the stability and performance of weakly supervised anomaly detection, making it a more reliable tool for model-independent new physics searches.
Speaker: Marie Hein (RWTH Aachen University) -
40
mplhep 1.0 (mplhep & plothist)
Visualizing pre-binned histograms is a HEP domain-specific concern which is not adequately supported within the greater pythonic ecosystem. In recent years, mplhep has emerged as a leading package providing this basic functionality in a user-friendly interface. It also supplies styling templates for the four big LHC experiments - ATLAS, CMS, LHCb, and ALICE. At the same time, the plothist package has been developed to provide advanced methods for comparing data and model components. Both packages build on top of matplotlib and interface with the hist and boost-histogram libraries. Recognizing their overlap and complementary strengths, the packages are being merged into a single Python library in order to improve the user experience. This unified package, under the name mplhep and with the support of the Scikit-HEP community, will streamline histogram visualization in high-energy physics and enable users to produce publication-ready figures with minimal effort.
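A minimal example of the pre-binned plotting workflow the package targets (style and function names as in current mplhep releases; details may change in the merged 1.0 package):

```python
import numpy as np
import matplotlib.pyplot as plt
import mplhep as hep

# Apply one of the experiment style templates.
hep.style.use("ATLAS")

# Pre-binned input: counts and bin edges, as produced upstream of plotting.
counts, edges = np.histogram(np.random.normal(size=10_000), bins=40)

fig, ax = plt.subplots()
hep.histplot(counts, bins=edges, ax=ax, label="toy data")
ax.set_xlabel("observable")
ax.set_ylabel("events / bin")
ax.legend()
fig.savefig("hist.png")
```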
Speakers: Andrzej Novak (Massachusetts Inst. of Technology (US)), Tristan Fillinger (KEK / IPNS) -
41
Neural Fake Factor Estimation
In a physics data analysis, "fake" or non-prompt backgrounds refer to events that would not typically satisfy the selection criteria for a given signal region, but are nonetheless accepted due to misreconstructed particles. This can occur, for example, when particles from secondary decays are incorrectly identified as originating from the hard scatter interaction point (resulting in non-prompt leptons), or when other physics objects, such as hadronic jets, are mistakenly reconstructed as leptons (resulting in fake leptons). These fake particles are taken into account by calculating a scale factor (a fake factor) with a data-driven technique and applying it as an event weight. Traditionally, fake factors have been estimated by histogramming and computing the ratio of two distributions, typically as functions of a few relevant physics variables such as $p_{\mathrm{T}}$, $\eta$, and MET. In this work, we present a novel approach based on density ratio estimation using a transformer neural network trained directly on event data in a high-dimensional feature space. This enables the computation of a continuous, unbinned fake factor on a per-event basis, offering a more flexible, precise and higher-dimensional alternative to the conventional method.
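A generic sketch of per-event density-ratio estimation via the classifier trick (the analysis above uses a transformer network; this toy uses scikit-learn and made-up features):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(1)
# Toy stand-ins for two samples whose density ratio defines the fake factor,
# e.g. events passing vs. failing a lepton identification requirement.
numerator   = rng.normal([30.0, 0.0], [10.0, 1.0], size=(5000, 2))
denominator = rng.normal([25.0, 0.2], [12.0, 1.1], size=(5000, 2))

X = np.vstack([numerator, denominator])
y = np.concatenate([np.ones(len(numerator)), np.zeros(len(denominator))])

clf = GradientBoostingClassifier().fit(X, y)

# For any event x, p/(1-p) estimates the ratio of the two densities,
# i.e. a continuous, unbinned weight evaluated per event.
p = np.clip(clf.predict_proba(denominator)[:, 1], 1e-6, 1 - 1e-6)
fake_factor = p / (1.0 - p)
print(fake_factor[:5])
```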
Speakers: Jan Gavranovic (Jozef Stefan Institute (SI)), Jernej Debevc (Jozef Stefan Institute (SI)), Lara Calic (Lund University (SE)) -
42
Performance analysis of dynamically integrated HPC resources in the ATLAS workflow at the WLCG Tier-2 site in Freiburg
At many Worldwide LHC Computing Grid (WLCG) sites, HPC resources are already integrated, or will be integrated in the near future, into the experiment specific workflows. The integration can be done either in an opportunistic way to use otherwise unused resources for a limited period of time, or in a permanent way. The WLCG ATLAS Tier-2 cluster in Freiburg has been extended in both ways: opportunistic use of resources from the NEMO HPC cluster in Freiburg and permanent use of the HoreKa HPC cluster at KIT.
In order to integrate the computing resources into the Tier-2 cluster in Freiburg in a manner that is both transparent and efficient, a container-based approach was adopted, utilising the meta-scheduler COBalD/TARDIS. TARDIS launches so-called drones on the HPC cluster, which provide the Tier-2 cluster with additional resources. To differentiate these augmented resources from their counterparts installed in Freiburg, the accounting is handled by the AUDITOR accounting ecosystem.
The compute hardware of the local Tier-2 cluster and the NEMO HPC cluster is largely identical and was replaced simultaneously. This facilitated a comprehensive analysis of the impact of various factors. Firstly, on identical hardware, the bare-metal installation of a typical WLCG compute server was compared with drones on the HPC clusters. Furthermore, the influence of direct access from the Freiburg Tier-2 cluster and the Freiburg HPC cluster to the Freiburg-based storage, as well as remote access from HoreKa in Karlsruhe, was analysed. Finally, the impact of varying drone sizes was investigated. These results will have a significant impact on the German HEP community's computing strategy for the next 5-10 years.
Speaker: Michael Boehler (University of Freiburg (DE)) -
43
Performance of Lossless Data Compression Algorithms in the CMS Experiment
The High-Level Trigger (HLT) of the Compact Muon Solenoid (CMS) processes event data in real time, applying selection criteria to reduce the data rate from hundreds of kHz to around 5 kHz for raw data offline storage. Efficient lossless compression algorithms, such as LZMA and ZSTD, are essential in minimizing these storage requirements while maintaining easy access for subsequent analysis. Multiple compression techniques are currently employed in the experiment's trigger system. In this study, we benchmark the performance of existing lossless compression algorithms used in the HLT for RAW data storage, evaluating their efficiency in terms of compression ratio, processing time, and CPU/memory usage. In addition, we investigate the potential improvements offered by introducing RNTuple, the next-generation data storage format developed within the ROOT ecosystem, into the CMSSW software framework. We explore how the new format can enhance data handling efficiency and reduce storage footprints compared to traditional storage solutions. With the upcoming Phase-2 upgrade of the CMS experiment, efficient compression strategies will be essential to ensure sustainable data processing and storage capabilities. This work provides insights into the different compression algorithms and how the new RNTuple data format can contribute to addressing the future data challenges of the CMS experiment.
Speakers: CMS Collaboration, Simone Rossi Tisbeni (Universita Di Bologna (IT)) -
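The kind of comparison discussed above can be prototyped in a few lines; the sketch below benchmarks compression ratio and wall time for the standard-library LZMA/zlib codecs and the third-party zstandard binding on toy data, and is not tied to CMSSW or ROOT.

```python
# Toy benchmark of lossless compressors (compression ratio and wall time only);
# illustrative data, not actual CMS RAW events.
import lzma
import time
import zlib
import numpy as np

try:
    import zstandard as zstd  # pip install zstandard
except ImportError:
    zstd = None

payload = np.random.poisson(5, size=1_000_000).astype(np.uint16).tobytes()

def bench(name, compress):
    t0 = time.perf_counter()
    out = compress(payload)
    dt = time.perf_counter() - t0
    print(f"{name:5s} ratio={len(payload) / len(out):5.2f}  time={dt * 1e3:7.1f} ms")

bench("lzma", lambda b: lzma.compress(b, preset=6))
bench("zlib", lambda b: zlib.compress(b, 6))
if zstd is not None:
    bench("zstd", lambda b: zstd.ZstdCompressor(level=5).compress(b))
```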
44
Real-Time Stream Compaction for Sparse Machine Learning on FPGAs
Machine learning algorithms are being used more frequently in the first-level triggers of collider experiments, with Graph Neural Networks (GNNs) pushing the hardware requirements of FPGA-based triggers beyond the current state of the art. As the first online event processing stage, first-level trigger systems process $O(10\,\mathrm{M})$ events per second with a hard real-time latency requirement of $O(1\,\mu\mathrm{s})$.
To meet the stringent demands of high-throughput and low-latency environments, we propose a concept for latency-optimized pre-processing of sparse sensor data, enabling efficient GNN hardware acceleration by removing dynamic input sparsity. Our approach rearranges data coming from a large number of First-In-First-Out (FIFO) interfaces, typically sensor frontends, to a smaller number of FIFO interfaces connected to a GNN hardware accelerator.
In order to achieve high throughput while minimizing hardware utilization, we developed a hierarchical stream compaction pipeline optimized for FPGAs. We implemented our concept in the Chisel design language and integrated our open-source package as a parameterizable IP core with FuseSoC. For demonstration, we implemented one configuration of our IP core as a pre-processing stage in a GNN-based first-level trigger for the Electromagnetic Calorimeter (ECL) of the Belle II detector. Additionally, we evaluate latency, throughput, resource utilization, and scalability for a wide range of parameters to enable broader use in other large-scale scientific experiments.
Speaker: Marc Neu -
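As a purely conceptual, software-level sketch of the stream-compaction idea (not the Chisel/FPGA pipeline itself): hits from many sparse front-end FIFOs are compacted into a smaller number of dense output streams.

```python
# Conceptual sketch only: compact sparse front-end streams into fewer dense ones.
from collections import deque

def compact(frontend_fifos, n_out):
    """Drop empty slots and distribute remaining hits over n_out output FIFOs."""
    outputs = [deque() for _ in range(n_out)]
    i = 0
    for fifo in frontend_fifos:
        for hit in fifo:
            if hit is not None:              # None models an empty (sparse) slot
                outputs[i % n_out].append(hit)
                i += 1
    return outputs

frontends = [
    [None, ("ch0", 3.1)],
    [None, None],
    [("ch7", 1.4), ("ch8", 0.2)],
]
print(compact(frontends, n_out=2))
```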
45
Real-Time Unsupervised Anomaly Detection in the CMS Level-1 Trigger
The CMS experiment at the LHC has entered a new phase in real-time data analysis with the deployment of two complementary unsupervised anomaly detection algorithms during Run 3 data-taking. Both algorithms aim to enhance the discovery potential for new physics by enabling model-independent event selection directly at the hardware trigger level, operating at the 40 MHz LHC collision rate within nanosecond latency constraints. AXOL1TL, an autoencoder-based anomaly detection model, has been integrated into the Level-1 Global Trigger system while CICADA focuses on low-level calorimeter data, using a convolutional autoencoder architecture distilled into a compact supervised model for efficient inference on resource-constrained hardware. Both algorithms select anomalous events for further processing, primarily contributing to scouting data streams. In this talk, we present results from both algorithms with data recorded since 2024. We describe the architecture, deployment, and commissioning of each algorithm, as well as their integration into the CMS trigger system. We highlight the complementary nature of the two approaches and discuss prospects for improvements in real-time anomaly detection strategies at colliders.
Speaker: Maciej Mikolaj Glowacki (CERN) -
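To make the underlying principle concrete, here is a hedged, generic sketch of an autoencoder-style anomaly score (mean reconstruction error); it uses toy data and a scikit-learn model, not the actual AXOL1TL or CICADA architectures or their firmware implementations.

```python
# Minimal sketch of an autoencoder anomaly score on toy data (illustrative only).
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
x_train = rng.normal(size=(20_000, 16))                       # toy "standard" events
x_test = np.vstack([rng.normal(size=(1000, 16)),
                    rng.normal(loc=3.0, size=(50, 16))])      # plus a few anomalies

ae = MLPRegressor(hidden_layer_sizes=(4,), max_iter=50)       # narrow bottleneck
ae.fit(x_train, x_train)                                      # reconstruct the input

score = np.mean((ae.predict(x_test) - x_test) ** 2, axis=1)   # anomaly = large error
print(score[:5], score[-5:])
```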
46
Recent benchmarks in the Analysis Grand Challenge and integration with Combine (and HS3)
The Analysis Grand Challenge (AGC) showcases an example of HEP analysis. Its reference implementation uses modern Python packages to realize the main steps, from data access to statistical model building and fitting. The packages used for data handling and processing (coffea, uproot, awkward-array) have recently undergone a series of performance optimizations.
While not part of the HEP Python (PyHEP) ecosystem, the Combine tool is a pillar of CMS analyses, covering more than 90% of the analyses published in the last few years. As such, it is desirable to integrate Combine into the PyHEP ecosystem, using the AGC as an example.
This project also includes, in the long term, providing support and integration for the High Energy Physics Statistics Serialization Standard (HS3), as a way to have a language-independent representation of the likelihood and to use different frameworks interchangeably. In this talk, we will cover part of the recent work performed on the AGC and Combine, including:
- performance benchmarks, covering benefits introduced by the recent improvements in the data processing packages;
- examples of how Combine can be integrated and run in a dedicated infrastructure (coffea-casa);
- examples and plans to integrate HS3 in Combine.
Speaker: Massimiliano Galli (Princeton University (US)) -
47
Recent developments in the Awkward Array world
In recent years, Awkward Array, Uproot, and related packages have become the go-to solutions for performing High-Energy Physics (HEP) analyses. Their development is driven by user experience and feedback, with the community actively shaping their evolution. User requests for new features and functionality play a pivotal role in guiding these projects.
For example, the Awkward development team has been working on new features, performance improvements, and memory safety measures. Key achievements include:
- Named axes in Awkward Array
- Significant performance improvements in Awkward Array, with even larger gains for Vector in realistic analysis scenarios
- Non-growing memory consumption for consecutive reads of the same opened file in Uproot
In this contribution, we discuss how Awkward Array continuously adapts to meet the diverse and evolving needs of its users. We will detail these key achievements and their impact on end-user analyses.
Speaker: Manfred Peter Fackeldey (Princeton University (US)) -
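For readers unfamiliar with the package, a tiny generic example of Awkward Array's jagged-array operations is shown below; it does not use the new named-axis feature mentioned in the talk.

```python
# Generic Awkward Array usage on jagged (per-event, variable-length) data.
import awkward as ak

pt = ak.Array([[42.1, 11.3, 6.0], [], [27.5, 19.2]])  # per-event pT values
leading = ak.max(pt, axis=1)          # leading pT per event (None for empty events)
n_hard = ak.sum(pt > 20.0, axis=1)    # number of objects above threshold per event
print(leading.tolist(), n_hard.tolist())
```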
48
Refinement of calorimeter showers simulated with normalizing flow model
The simulation of calorimeter showers is computationally expensive, leading to the development of generative models as an alternative. Many of these models face challenges in balancing generation quality and speed. A key issue damaging the simulation quality is the inaccurate modeling of distribution tails. Normalizing flow (NF) models offer a trade-off between accuracy and speed, making them a promising approach. This work builds on the CaloINN NF model and introduces a set of post-processing modifications of analysis-level observables aimed at improving the accuracy of distribution tails. We used the CaloChallenge datasets as well as simulations produced with the Open Data Detector (ODD) to validate the method. The results show that the introduced refinements enhance overall performance, achieving accuracy comparable to the most precise calorimeter shower models while maintaining the simulation speed of NF models. The study is conducted as part of the interTwin project, which develops Digital Twins for applications in physics and earth observation, demonstrating the use of the interTwin platform for calorimeter simulation.
Speaker: Corentin Allaire (IJCLab, Université Paris-Saclay, CNRS/IN2P3) -
49
Running ATLAS and CMS distributed computing on HPCs with fapptainer
The HEP communities have developed an increasing interest in High Performance Computing (HPC) centres, as these hold the potential of providing significant computing resources to the current and future experiments. At the Large Hadron Collider (LHC), the ATLAS and CMS experiments are challenged with a scale-up of computing by several factors for the High Luminosity LHC (HL-LHC) Run 4, currently foreseen to begin by 2030. HPC platforms are not homogeneous; they expose a wide variety of environments, including proprietary software stacks, each with its own set of restrictions. The access barrier for the integration with the highly dynamic needs of the running HEP experiments remains high, and ad hoc solutions are typically needed in order to make use of such centres. The development of a common approach providing efficient utilisation of compute resources by abstracting the specifics of a particular machine is thus highly desirable. This work presents an integration technique developed for running ATLAS and CMS experiment computational workloads on the LUMI supercomputer, and designed to be HPC-centre agnostic. It leverages the capabilities of open source tools like the Advanced Resource Connector (ARC) middleware, CernVM-FS (CVMFS), SSH Filesystem (SSHFS) and common containerisation techniques, enhancing them with novel tools to overcome limitations of the container runtime provided by the HPC. The fapptainer [1] tool implements un-nesting of the containers, running them sideways instead, without any modification to the workflow of the jobs. The tools run unprivileged and as such do not require system modification by the local sysadmins. The proposed technique can be used to integrate any HPC system that has SSH inbound access and a standard container runtime available, by means of an ARC Computing Element node close to it. A wide range of current and future HPC machines meets the specified requirements, thus enabling wider adoption of such tools by the HEP community to integrate HPC resources.
[1] Fapptainer software - https://source.coderefinery.org/slu/fapptainer
Speakers: CMS Collaboration, Tomas Lindén (Helsinki Institute of Physics (FI)) -
51
subMIT: a CMS Analysis Facility at MIT
The High Luminosity Large Hadron Collider (HL-LHC) and future big science experiments will generate unprecedented volumes of data, necessitating new approaches to physics analysis infrastructure. We present the SubMIT Physics Analysis Facility, an implementation of the emerging Analysis Facilities (AF) concept at MIT. Our solution combines high-throughput computing capabilities with modern interactive analysis tools, bridging the gap between traditional grid computing and the need for interactive data exploration. The facility integrates local computing resources with external infrastructures such as the Open Science Grid and CMS Tier-2/3, while providing streamlined access to essential tools through JupyterHub, containerized environments, and CVMFS. Supporting users from undergraduate students to experienced researchers, the facility handles system maintenance and security transparently, allowing physicists to focus on scientific discovery. We present our experience with innovative user support strategies, including an experimental large language model application for interactive assistance, and discuss how this comprehensive approach enables efficient physics analysis workflows. Our findings provide valuable insights for the particle physics community as it adapts to the multi-exabyte scale of HL-LHC data processing and analysis requirements, demonstrating how modern Analysis Facilities can reduce technical barriers while maintaining the flexibility required for cutting-edge research.
Speakers: CMS Collaboration, David Walter (Massachusetts Inst. of Technology (US)) -
52
The Application of Multithreading in JUNO Offline Software
The JUNO offline software (JUNOSW) is built upon the SNiPER framework. Its multithreaded extension, MT-SNiPER, enables inter-event parallel processing and has successfully facilitated JUNOSW's parallelization. Over the past year, two rounds of the JUNO Data Challenge (DC) have been conducted to validate the complete data processing chain. During these DC tasks, the performance of MT-SNiPER was rigorously tested through numerous 4-thread reconstruction jobs. However, the limitation of inter-event multithreading was also revealed. A key bottleneck was identified in the global event buffer, which keeps the events in the right order but introduces synchronization overhead. This underscores the need for JUNO to implement fine-grained intra-event multithreading to complement the existing approach. We have accordingly developed a new architecture in SNiPER that supports both inter-event and intra-event multithreading. A prototype of the waveform reconstruction algorithm has yielded promising results, demonstrating the potential of the proposed multithreading architecture.
Speakers: Mr Jiaheng Zou (IHEP, Beijing), Wenxing Fang -
53
The Array Data Acquisition System of the ASTRI Mini-Array project: status and assessment
The ASTRI (Astrofisica con Specchi a Tecnologia Replicante Italiana) Project was born as a collaborative international effort led by the Italian National Institute for Astrophysics (INAF) to design and realize an end-to-end prototype of the Small-Sized Telescope (SST) of the Cherenkov Telescope Array (CTA) in a dual-mirror configuration (2M). The prototype, named ASTRI-Horn, has been operational since 2014 at the INAF observing station located on Mt. Etna (Italy). The ASTRI Project is now building the ASTRI Mini-Array consisting of nine ASTRI-Horn-like telescopes to be installed and operated at the Teide Observatory (Spain). The ASTRI software is aimed at supporting the Assembly, Integration and Verification (AIV) and the operations of the ASTRI Mini-Array. The Array Data Acquisition System (ADAS) includes all hardware, software and communication infrastructure required to gather the bulk data of the Cherenkov Cameras and the Intensity Interferometers installed on the telescopes, and make these data available to the Online Observation Quality System (OOQS) for the on-site quick look, and to the Data Processing System (DPS) for the off-site scientific pipeline. This contribution presents the current status of ADAS software development, testing, and deployment, along with an assessment of its functionalities and performance in relation to the requirements specified in the latest released version.
Speaker: Vito Conforti -
54
The fundamental limit of jet tagging
Jet tagging, i.e., determining the origin of high-energy hadronic jets, is a key challenge in particle physics. Jets are ubiquitous observables in collider experiments, made of complex collections of particles that need to be classified. Over the past decade, machine learning-based classifiers have greatly enhanced our jet tagging capabilities, with increasingly sophisticated models driving further improvements. This raises a fundamental question: how far are we from the theoretical limit of jet tagging performance? To explore this, we employ transformer-based generative models to produce realistic synthetic data with a known probability density function. By testing various state-of-the-art taggers on this dataset, we find a significant gap between their performance and the theoretical optimum, signalling significant room for improvement. Our dataset and software are made public to provide a benchmark task for future developments in jet tagging and other areas of particle physics.
Speaker: Dr Humberto Reyes-González (RWTH Aachen) -
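The notion of a performance limit can be illustrated with a one-dimensional toy: when the generating densities are known, the likelihood ratio gives the optimal discriminant (Neyman-Pearson), and any practical tagger can be benchmarked against it. The densities and the deliberately degraded tagger below are illustrative only.

```python
# Toy illustration of a known optimal classifier versus a suboptimal tagger.
import numpy as np
from scipy.stats import norm
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
sig = rng.normal(1.0, 1.0, 20_000)   # "signal" jets drawn from a known density
bkg = rng.normal(0.0, 1.0, 20_000)   # "background" jets drawn from a known density
x = np.concatenate([sig, bkg])
y = np.concatenate([np.ones_like(sig), np.zeros_like(bkg)])

# optimal discriminant: likelihood ratio of the true densities
lr = norm.pdf(x, 1.0, 1.0) / norm.pdf(x, 0.0, 1.0)
print("optimal AUC:", roc_auc_score(y, lr))

# a deliberately degraded tagger (smeared observable) falls short of the limit
tagger = x + rng.normal(0.0, 1.0, x.size)
print("tagger  AUC:", roc_auc_score(y, tagger))
```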
55
The online event classification software in the JUNO experiment
The Jiangmen Underground Neutrino Observatory (JUNO) aims to determine the neutrino mass ordering (NMO) with a 3-sigma confidence level within six years. The experiment is currently in the commissioning phase, focusing on filling the liquid scintillator and evaluating detector performance. During physics data taking, the expected data rate after the global trigger is approximately 40 GB/s, which will be reduced to ~60 MB/s using online event classification (OEC) software.
This contribution presents the software design of the online event classification (OEC) system, including the multithreaded low-level event classification (LEC) and single-threaded high-level event classification (HEC) modules. Additionally, a middleware has been developed to facilitate the integration of offline algorithms into the online system. Finally, we discuss the computing performance observed during data acquisition.
Speaker: Wenxing Fang -
56
Towards more precise data analysis with Machine-Learning-based particle identification with missing data
Identifying the products of ultrarelativistic collisions delivered by the LHC and RHIC colliders is one of the crucial objectives of experiments such as ALICE and STAR, which are specifically designed for this task. They allow for precise Particle Identification (PID) over a broad momentum range.
Traditionally, PID methods rely on hand-crafted selections, which compare the recorded signal of a given particle to the expected value for a given particle species (e.g., for the Time Projection Chamber detector, the number of standard deviations in the dE/dx distribution, the so-called "nσ" method). To improve the performance, novel approaches use Machine Learning models that learn the proper assignment in a classification task.
However, because of the various detection techniques used by different subdetectors (energy loss, time-of-flight, Cherenkov radiation, etc.), as well as the limited detector efficiency and acceptance, particles do not always yield signals in all subdetectors. This results in experimental data which include "missing values". Out-of-the-box ML solutions cannot be trained with such examples without either modifying the training dataset or re-designing the model architecture. Standard approaches to this problem, used e.g. in image processing, involve value imputation or deletion, which may alter the experimental data sample.
In the presented work, we propose a novel and advanced method for PID that addresses the problem of missing data and can be trained with all of the available data examples, including incomplete ones, without any assumptions about their values [1,2]. The solution is based on components used in Natural Language Processing tools and is inspired by AMI-Net, an ML approach proposed for medical diagnosis with missing data in patient records.
The ALICE experiment was used as an R&D and testing environment; however, the proposed solution is general enough for other experiments with good PID capabilities (such as STAR at RHIC and others). Our approach improves the F1 score, a balanced measure of the PID purity and efficiency of the selected sample, for all investigated particle species (pions, kaons, protons).
[1] M. Kasak, K. Deja, M. Karwowska, M. Jakubowska, Ł. Graczykowski & M. Janik, “Machine-learning-based particle identification with missing data”, Eur.Phys.J.C 84 (2024) 7, 691
[2] M. Karwowska, Ł. Graczykowski, K. Deja, M. Kasak, and M. Janik, “Particle identification with machine learning from incomplete data in the ALICE experiment”, JINST 19 (2024) 07, C07013
Speaker: Marek Mateusz Mytkowski (Warsaw University of Technology (PL)) -
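For context, the conventional "nσ"-style selection that the proposed method improves upon can be written in a few lines; the expected dE/dx values and resolutions below are purely illustrative.

```python
# Illustrative sketch of the traditional "n-sigma" dE/dx selection (toy numbers).
import numpy as np

def n_sigma(dedx_measured, dedx_expected, resolution):
    """Number of standard deviations between measured and expected dE/dx."""
    return (dedx_measured - dedx_expected) / resolution

dedx = np.array([52.0, 61.5, 49.8])    # toy measured dE/dx for three tracks
expected_pi, sigma_pi = 50.0, 3.0      # hypothetical pion expectation and resolution
is_pion = np.abs(n_sigma(dedx, expected_pi, sigma_pi)) < 3.0
print(is_pion)
```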
57
Track reconstruction in silicon strip detector of MuonE experiment with Graph Neural Networks
Patrick Asenov (Universita & INFN Pisa (IT)), Anna Driutti (Universita & INFN Pisa (IT)), Mateusz Jacek Goncerz (Polish Academy of Sciences (PL)), Emma Hess (Universita & INFN Pisa (IT)), Marcin Kucharczyk (Polish Academy of Sciences (PL)), Damian Mizera (Cracow University of Technology (PL)), Marcin Wolter (Polish Academy of Sciences (PL)), Milosz Zdybal (Polish Academy of Sciences (PL))
The MUonE experiment aims to measure the hadronic contribution to the muon magnetic moment. To achieve this, the reconstruction of muon-electron pairs in the silicon strip detector must be fast enough for the information to be included in the trigger system. In this work, we present a Graph Neural Network (GNN) approach to tracking using simulated MUonE events. For the first time, we test GNN pattern recognition in three dimensions. The tracking is tested on a simulated configuration featuring three tracking stations. In addition, particle identification is performed by the same GNN to identify the reconstructed tracks as originating from a muon or an electron.
Speaker: Marcin Wolter (Polish Academy of Sciences (PL)) -
58
Tracking for the next ATLAS event filter with GNNs on GPUs
Graph Neural Networks (GNNs) have been a focus of machine-learning-based track reconstruction for high-energy physics experiments in recent years. Within ATLAS, the GNN4ITk group has investigated this type of algorithm for track reconstruction at the High-Luminosity LHC (HL-LHC) using the future full-silicon Inner Tracker (ITk).
The Event Filter (EF) is part of the ATLAS Trigger and Data Acquisition (TDAQ) system and will consist of a computing farm that runs a limited set of event reconstruction algorithms to provide the accept/reject decision for offline storage. The decision on the system design choice for the EF farm is scheduled for late 2025.
We are exploring the use of the GNN4ITk track-finding approach alongside other tracking tools from the open-source ACTS (A Common Tracking Software) toolkit to develop a candidate pipeline targeting GPU hardware aimed at meeting the ATLAS Event Filter throughput and physics performance goals.
We will present the implementation strategy, optimizations and computing performance results, as well as the track reconstruction performance for the proposed candidate pipeline.
Speaker: Benjamin Huth (CERN) -
59
Triggering on Muon Detector Showers with CMS
Searches for long-lived particles (LLPs) have attracted much interest lately due to their high discovery potential in the LHC Run-3. Signatures featuring LLPs with long lifetimes and decaying inside the muon detectors of the CMS experiment at CERN are of particular interest. In this talk, we will describe a novel Level-1 trigger algorithm that significantly improves CMS's signal efficiency for these exotic signatures. The implementation has been done at multiple stages of the Level-1 trigger processing, considering limited FPGA logic resources and tight latency requirements. Events satisfying the Level-1 requirements are passed to a High-Level Trigger (HLT) algorithm that further reduces the rate to the target few Hz level. The developed trigger algorithms have been taking CMS data since 2022. Their performance, implementation, commissioning results, and potential future developments will be reported.
Speakers: Ayse Asu Guvenli (Hamburg University (DE)), CMS Collaboration -
60
UHI for ROOT: Interfacing With Python Statistical Analysis Libraries
The ROOT software package is a widely used data analysis framework in the High Energy Physics (HEP) community. Like many other Python packages, ROOT features a powerful, performance-oriented core that can be accessed from Python applications thanks to dynamic bindings. These facilitate the usage and integration of ROOT with the broader Python ecosystem.
Despite these capabilities, there is significant potential to enhance interoperability between ROOT and other Python libraries, particularly in the domains of statistical analysis and data manipulation.
The Unified Histogram Interface (UHI) introduces a standardized abstraction for histogram operations, aiming at enabling seamless interaction between diverse histogramming libraries that adopt this interface.
This work presents the implementation of the UHI specification for relevant statistical classes in ROOT and demonstrates its usability and compatibility with external tools. Furthermore, open questions and ongoing efforts to refine and expand this functionality are discussed.
Speaker: Silia Taider -
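For illustration, the snippet below shows the kind of UHI-style access this work makes available for ROOT histograms, demonstrated here with the Scikit-HEP hist package, which already implements the interface; the ROOT-side call pattern is not assumed.

```python
# UHI-style histogram access, shown with the `hist` package (already UHI-compliant).
import numpy as np
from hist import Hist

h = Hist.new.Reg(40, 0, 100, name="mass", label="m [GeV]").Weight()
h.fill(np.random.exponential(30, size=5_000))

values = h.values()        # bin contents
variances = h.variances()  # per-bin variances
edges = h.axes[0].edges    # bin edges of the first axis
print(values.sum(), variances.sum(), edges[0], edges[-1])
```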
61
User Centric Approaches to Sustainable Compute Operation
With the consequences of global warming becoming abundantly clear, physics research needs to do its part in becoming more sustainable, including its computing aspects. Many measures in this field expend great effort to keep the impact on users minimal. However, even greater savings can be gained when compromising on these expectations.
In any such approach affecting the user experience, the effects of suboptimal configurations must be minimized by proactive and rapid incorporation of user feedback. The intermediate size of our VISPA computing cluster offers the ideal playground to probe such measures in a highly interactive manner.
We present our approaches and insights thereon. Most important is the communication aspect: we have several ways of informing users about their environmental impact. But we also take active measures, such as shifting workloads in time based on user guidance. Additionally, we simulate policy changes using a digital twin to estimate their effects on user experience.
Speaker: Paul Gilles (RWTH Aachen University) -
62
Using Graph Neural Networks for hadronic clustering and to reduce beam background in the Belle II electromagnetic calorimeter
The Belle II electromagnetic calorimeter (ECL) is not only used for measuring electromagnetic particles but also for identifying and determining the position of hadrons, particularly neutral hadrons.
Recent data-taking periods have presented two challenges for the current clustering method:
Firstly, the record-breaking luminosities achieved by the SuperKEKB accelerator have increased background rates, leading to a higher number of crystals with energy depositions, and an overall increase in the total energy measured in the ECL.
This resulted in poorer photon energy resolution and the introduction of more fake photon clusters.
Secondly, challenges arise from the nature of hadronic interactions.
In contrast to $\gamma$ and $e^\pm$, hadrons interacting in the ECL result in irregular, sometimes even multiple clusters.
These clusters can be misinterpreted as photon clusters, thereby reducing the position resolution of neutral hadrons or causing a complete misidentification of the hadron.
Graph neural networks (GNNs) offer a promising solution to both challenges.
By representing crystals with an energy measurement as nodes, graphs capture the sparsity of the input.
Using message-passing layers that learn the graph edges also helps to address the asymmetry of Belle II's ECL.
In this talk, I will present a novel approach to clustering in Belle II's ECL that relies on GNNs to reduce the number of fake photons and to cluster both electromagnetic and hadronic interactions.
I will show that the approach reduces the number of fake photons, enhances the identification of hadrons, and improves the position resolution for neutral hadrons.
Speaker: Jonas Eppelt (Karlsruhe Institute of Technology (KIT))
-
5
-
Plenary ESA A
ESA A
Conveners: chair: David Britton, co-chair: Fons Rademakers (CERN)-
63
The Next Generation Trigger Project at CERN
Speaker: Cristina Botta (CERN)
-
64
Foundation models for physics
Speaker: Sascha Caron (Nikhef National institute for subatomic physics (NL))
-
65
Fast Machine Learning for Science
Speaker: Shih-Chieh Hsu (University of Washington Seattle (US))
-
63
-
13:00
Lunch break ESA O "East Wing"
ESA O "East Wing"
Today's menu:
Starter:
- Tomato and mozzarella with fresh basil
- Mixed salad with feta cheese, olives and peperoni
Served with different dressings
Main courses:
- Turkey strips and fresh mushroom in cream sauce with rice and vegetables
- Vegetable curry Indian style with Basmati rice (vegan, gluten-free, lactose-free) -
Track 1: Computing Technology for Physics Research ESA M
ESA M
Conveners: chair: Sioni Summers, co-chair: Fazhi Qi-
66
MLOps Pipeline for Continuous Deployment of Machine Learning Algorithms in the CMS Level-1 Trigger
We present an MLOps approach for managing the end-to-end lifecycle of machine learning algorithms deployed on FPGAs in the CMS Level-1 Trigger (L1T). The primary goal of the pipeline is to respond to evolving detector conditions by automatically acquiring up-to-date training data, retraining and re-optimising the model, validating performance, synthesising firmware, and deploying validated firmware into both online and offline environments.
In addition, we use the CMS Level-1 Scouting stream—which bypasses L1T selection—to detect drifts in the model output distribution. This enables us to quantify the operational lifetime of ML models deployed at the L1T and support continual learning strategies, such as triggering retraining or adjusting thresholds to maintain optimal performance.
The pipeline is built with CERN’s computing resources and integrates with the Kubeflow platform, GitLab CI/CD and the WLCG, offering a scalable solution for real-time ML deployment. This infrastructure lays the groundwork for rapid iteration and long-term sustainability of ML-based trigger algorithms—capabilities that will become increasingly important as ML continues to be adopted in changing, low-latency environments.
Speaker: Maciej Mikolaj Glowacki (CERN) -
67
Real-Time GNN-Based Hit Filtering on FPGAs for the Belle II Level-1 Trigger
The high-luminosity environment at Belle II leads to growing beam-induced background, posing major challenges for the Belle II Level-1 (L1) trigger system. To maintain trigger rates within hardware constraints, effective background suppression is essential. Hit filtering algorithms based on Graph Neural Networks (GNNs), including the Interaction Network (IN), have demonstrated successful applications in offline filtering scenarios.
To facilitate GNN-based hit filtering at the L1 trigger level, we adapt existing offline algorithms using state-of-the-art model compression and hardware-aware design techniques. This work presents an end-to-end hardware acceleration pipeline for the IN, considering not only the network itself but also preprocessing steps such as graph building. We optimize for O(1 µs) latency and high throughput of 32 million events per second while minimising resource utilisation of our neural network through operator fusion, combining graph building with static GNN message-passing operators. To meet Belle II’s real-time demands, we exploit spatial parallelism by partitioning the detector’s 14336 wires into independent FPGA-processing regions.
We validate our approach with a working prototype implemented on the AMD XCVU160 FPGA used in the Belle II Universal Trigger Board 4 (UT4).
Speaker: Greta Sophie Heine -
68
Low-latency Jet Tagging for HL-LHC Using Transformer Architectures
Transformers are state-of-the-art model architectures, widely used across application areas of machine learning. However, the performance of such architectures is less well explored in ultra-low-latency domains where deployment on FPGAs or ASICs is required. Such domains include the trigger and data acquisition systems of the LHC experiments.
We present a transformer-based algorithm for jet tagging built with the HGQ2 framework, which is able to produce a model with heterogeneous bitwidths for fast inference on FPGAs, as required in the trigger systems at the LHC experiments. The bitwidths are acquired during training by minimizing the total bit operations as an additional training objective. By allowing a bitwidth of zero, the model is pruned in situ during training. Using this quantization-aware approach, our algorithm achieves state-of-the-art performance while also retaining permutation invariance, which is a key property for particle physics applications.
Due to the strength of transformers in representation learning, our work serves also as a stepping stone for the development of a larger foundation model for trigger applications.
Speaker: Lauri Antti Olavi Laatu (Imperial College (GB)) -
69
CMS L1 Data Scouting for HL-LHC
The CMS Experiment at the CERN Large Hadron Collider (LHC) relies on a Level-1 Trigger system (L1T) to process in real time all potential collisions, happening at a rate of 40 MHz, and select the most promising ones for data acquisition and further processing. The CMS upgrades for the upcoming high-luminosity LHC run will vastly improve the quality of the L1T event reconstruction, providing opportunities for a complementary Data Scouting approach where physics analysis is performed on a data stream containing all collisions but limited to L1T reconstruction. This talk will describe the future Data Scouting system, some first estimates of its physics capabilities, and the demonstration setups used to assess its technical feasibility.
Speakers: CMS Collaboration, Efe Yigitbasi (Rice University (US))
-
66
-
Track 2: Data Analysis - Algorithms and Tools ESA B
ESA B
Conveners: chair: Frank Gaede, co-chair: Daniel Murnane-
70
Developments of GNN Track Reconstruction for the ATLAS ITk Detector
Track reconstruction is a cornerstone of modern collider experiments, and the HL-LHC ITk upgrade for ATLAS poses new challenges with its increased silicon hit clusters and strict throughput requirements. Deep learning approaches compare favorably with traditional combinatorial ones — as shown by the GNN4ITk project, a geometric learning tracking pipeline that achieves competitive physics performance at sub-second inference times. In this contribution, we evaluate a range of pipeline configurations and machine learning inference strategies that further improve track reconstruction at lower latencies. We present benchmarks for latency, throughput, memory usage, and power consumption across these pipelines. New developments include improved GPU-based module map performance and memory optimizations; model enhancements through pruning, quantization and advanced compilation techniques used in industry; and a custom graph segmentation approach. These upgrades allow the pipeline to target trigger-level track reconstruction in certain conditions. We also discuss improvements in track fitting, integrations into traditional-learned hybrid pipelines, GNN-based seeding, triplet-wise processing of cluster features, and production readiness with inference-as-a-service.
Speaker: Jay Chan (Lawrence Berkeley National Lab. (US)) -
71
CyberPFA: Particle Flow Algorithm for Crystal Bar ECAL
Precision measurements of Higgs, W, and Z bosons at future lepton colliders demand jet energy reconstruction with unprecedented accuracy. The particle flow approach has proven to be an effective method for achieving the required jet energy resolution. We present CyberPFA, a particle flow algorithm specifically optimized for the particle-flow-oriented crystal bar electromagnetic calorimeter (ECAL) in the CEPC reference detector. This innovative calorimeter design combines excellent intrinsic energy resolution with cost efficiency, but introduces two major reconstruction challenges: (1) increased shower overlaps due to the material's large $R_M$ and $X_0/\lambda_I$, and (2) ambiguity problem caused by the perpendicular arrangement of crystal bars.
The issue of shower overlap has been solved by an energy-core-based pattern recognition method followed by an energy splitting process. The ambiguity problem has been addressed through the implementation of multiple optimized pattern recognition approaches. Integrated with the full detector simulation, CyberPFA achieves a 3.8% boson mass resolution for hadronic decays, better than the critical 4% threshold required for W/Z separation.
These results not only validate the long crystal bar ECAL as a viable design choice for future colliders but also highlight the exceptional performance of CyberPFA and the advanced nature of its energy-core-based reconstruction paradigm. The algorithm’s innovative approach to shower recognition is not only effective for the current design but can also be extended to other imaging calorimeter reconstruction algorithms, significantly improving their performance.
Speaker: Yang Zhang (Institute of High Energy Physics, Chinese Academy of Science) -
72
Attention-Enhanced Lightweight GNNs for LHCb Next-generation Particle reconstruction and Identification
We present lightweight, attention-enhanced Graph Neural Networks (GNNs) tailored for real-time particle reconstruction and identification in LHCb’s next-generation calorimeter. Our architecture builds on node-centric GarNet layers, which eliminate costly edge message passing and are optimized for FPGA deployment, achieving sub-microsecond inference latency. By integrating attention mechanisms and encoder-decoder structures, our models achieve up to 8× faster inference than traditional message-passing GNNs, while maintaining superior performance over conventional algorithms in terms of energy resolution. Through model compression and firmware-level integration, we enable real-time data filtering in the LHCb trigger system. This work highlights the synergy between efficient AI accelerators and high-energy physics, offering scalable solutions for future particle detection pipelines.
Speaker: Cilicia Uzziel Perez (La Salle, Ramon Llull University (ES)) -
73
GNN-based E2E reconstruction in different highly granular calorimeters
We present a versatile GNN-based end-to-end reconstruction algorithm for highly granular calorimeters that can include track and timing information to aid the reconstruction of particles. The algorithm starts directly from calorimeter hits and possibly reconstructed tracks, and outputs a coordinate transformation in which all shower objects are well separated from each other and assigned properties such as particle momenta and ID in one go.
We showcase the versatility of this approach by presenting its application to a detector that matches the complexity of the CMS high-granularity calorimeter, an ATLAS-like setting, as well as future collider detectors.
Speakers: Katharina Sophia Schaeuble, Ulrich Einhaus (KIT - Karlsruhe Institute of Technology (DE)) -
74
TMVA SOFIE: Enhancements in ML Inference through graph optimizations and heterogeneous architectures
With the upcoming High-Luminosity upgrades at the LHC, data generation rates are expected to increase significantly. This calls for highly efficient architectures for machine learning inference in experimental workflows like event reconstruction, simulation, and data analysis.
In the ML4EP team at CERN, we have developed SOFIE, a tool within the ROOT/TMVA package that translates externally trained deep learning models—such as those in ONNX format or trained in Keras or PyTorch—into an intermediate representation (IR). This IR is then used to generate optimized C++ code for fast and lightweight inference, with BLAS as the only external dependency. The generated code can be embedded in any project, allowing inference functions to be called on event data and also allowing user-defined modifications. This makes SOFIE both efficient and flexible for integration into high-energy physics workflows.
SOFIE supports a broad range of ML operators, primarily based on the ONNX standard, as well as additional operations common in other frameworks and custom user-defined functions. It also supports inference for in-memory graph neural network models trained with DeepMind’s Graph Nets library. The tool has been successfully validated on experiment models such as ParticleNet, ATLAS GNNs, and Smart Pixels.
Recent developments in SOFIE include performance gains through Structure-of-Arrays-based memory allocation, enabling memory reuse, extensibility to GPU memory, and support for user-provided memory handlers. Together with operator fusion and kernel-level optimizations, these enhancements significantly reduce data movement and improve inference latency.
SOFIE now also supports portable GPU inference via integrations with SYCL and ALPAKA, using backends such as cuBLAS (for NVIDIA) and rocBLAS (for AMD). This gives users the flexibility to select GPU stacks based on platform preference. We present recent optimizations and heterogeneous computing support in SOFIE, benchmarking its performance against other inference frameworks to demonstrate its efficiency and portability.
Speaker: Enrico Lupi (CERN, INFN Padova (IT))
-
70
-
Track 3: Computations in Theoretical Physics: Techniques and Methods ESA C
ESA C
Conveners: chair: Chiara Signorile, co-chair: Aishik Ghosh-
75
Amplitude Surrogates for Multi-Jet Processes
Accurate and efficient predictions of scattering amplitudes are essential for precision studies in high-energy physics, particularly for multi-jet processes at collider experiments. In this work, we introduce a novel neural network architecture designed to predict amplitudes for multi-jet events. The model leverages the Catani–Seymour factorization scheme and uses MadGraph to compute amplitudes for related processes with fewer jets. By exploiting this factorization structure, the network learns to predict a correction factor that transforms the reduced-jet amplitude into the full multi-jet amplitude. This hybrid approach combines the strengths of theoretical factorization and data-driven learning, offering a promising direction for fast and scalable amplitude predictions.
Speaker: Luca Beccatini -
76
Machine-Learned Leading-Color Amplitude Reweighting for MadGraph
Direct simulation of multi-parton QCD processes at full-color accuracy is computationally expensive, making it often impractical for large-scale LHC studies. A two-step approach has recently been proposed to address this: events are first generated using a fast leading-color approximation and reweighted to full-color accuracy. We build upon this strategy by introducing a machine-learning algorithm that learns the reweighting function directly. This enables us to retain the fast generation of approximate events while accelerating the costly reweighting step.
Speaker: Javier Mariño Villadamigo (Institut für Theoretische Physik - University of Heidelberg) -
77
Speeding up amplitude analysis with a CAS and array-oriented computing
One of the central tools in hadron spectroscopy is amplitude analysis (partial-wave analysis) to interpret the experimental data. Amplitude models are fitted to data with large statistics to extract information about resonances and branching fractions. In amplitude analysis, we require flexibility to implement models with different decay hypotheses, spin formalisms, and resonance parametrisations, but also require computational performance to quickly fit the models to large datasets.
Computational performance can nowadays easily be achieved with the use of array-oriented libraries like JAX, TensorFlow, and Numba, which allow users to write backend-agnostic code for different types of accelerators like GPUs and multithreaded CPUs. The ComPWA project provides an additional layer of flexibility by formulating amplitude models with a Computer Algebra System (CAS) and using the expression trees to generate array-oriented code for multiple libraries. In addition, the use of a CAS results in a transparent, self-documenting workflow that further bridges the gap between theory and code.
Speaker: Remco De Boer -
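The CAS-to-array-code idea can be sketched in a few lines: a symbolic amplitude is written with SymPy and an array-oriented (NumPy) function is generated from its expression tree. This is only a toy relativistic Breit-Wigner, not a full ComPWA model, and the backend choice is illustrative.

```python
# Toy CAS workflow: symbolic Breit-Wigner intensity turned into vectorised NumPy code.
import numpy as np
import sympy as sp

s, m0, Gamma = sp.symbols("s m0 Gamma", positive=True)
amplitude = 1 / (m0**2 - s - sp.I * m0 * Gamma)   # relativistic Breit-Wigner
intensity = sp.Abs(amplitude) ** 2

intensity_np = sp.lambdify((s, m0, Gamma), intensity, "numpy")  # array-oriented code

s_values = np.linspace(0.5, 2.0, 5) ** 2
print(intensity_np(s_values, 1.0, 0.1))
```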
78
Tropical sampling from Feynman measures
I will present work on Tropical sampling from Feynman measures:
We introduce an algorithm that samples a set of loop momenta distributed as a given Feynman integrand. The algorithm uses the tropical sampling method and can be applied to evaluate phase-space-type integrals efficiently. We provide an implementation, momtrop, and apply it to a series of relevant integrals from the loop-tree duality framework. Compared to naive sampling methods, we observe convergence speedups by factors of more than $10^6$. This is joint work with Michael Borinsky.
Speaker: Mathijs Fraaije
-
75
-
Poster session with coffee break: Group 1 ESA W 'West Wing'
ESA W 'West Wing'
-
Track 1: Computing Technology for Physics Research ESA M
ESA M
Conveners: chair: Fazhi Qi, co-chair: Sioni Summers-
79
ACTS Integration for ATLAS Phase-II Track Reconstruction
The ATLAS experiment has engaged in a modernization of the reconstruction software to cope with the challenging running conditions expected for HL-LHC operations. The use of the experiment-independent ACTS toolkit for track reconstruction is a major component of this effort, involving the complete redesign of several elements of the ATLAS reconstruction software. This contribution will describe the ACTS integration effort and showcase the expected track reconstruction performance, highlighting the advantages and improvements achieved by switching to the ACTS toolkit.
Speaker: Andreas Stefl (CERN) -
80
CLUEstering: a novel high-performance clustering library for scientific computing
CLUEstering is a versatile clustering library based on CLUE, a density-based weighted clustering algorithm optimized for high-performance computing that supports clustering in an arbitrary number of dimensions. The library offers a user-friendly Python interface and a C++ backend to maximize performance. CLUE’s parallel design is tailored to exploit modern hardware accelerators, enabling it to process large-scale datasets with strong scalability and speed.
To ensure performance portability across diverse architectures, the backend is implemented using alpaka, a C++ performance portability library that enables near-native performance on a wide range of accelerators with minimal code duplication. CLUEstering's combination of density-based and weighted clustering makes it unique among popular clustering algorithms, many of which lack built-in support for such a combination.
This work will show comprehensive clustering results and performance benchmarks against other state-of-the-art algorithms.
Speaker: Simone Balducci (Universita e INFN, Bologna (IT)) -
81
Static compilation of Julia packages for integration with existing HEP codebases: a case study with JetReconstruction.jl
The Julia programming language is considered a strong contender as a future language for high-energy physics (HEP) computing. However, transitioning to the Julia ecosystem will be a long process, and interoperability between Julia and C++ is required. So far, several successful attempts have been made to wrap HEP C++ packages for use in Julia. It is also important to explore the reverse direction, allowing Julia code to be called from existing HEP codebases, written primarily in C++ and Python, which would significantly improve the potential adoption of Julia code. With recent developments in Julia enabling it to produce statically compiled code, this approach is becoming increasingly feasible, and investigating this potential for the benefit of the HEP community is the focus of this work.
This work presents a case study of statically compiling the JetReconstruction.jl package - a highly performant, native Julia implementation of sequential jet reconstruction algorithms. Two different backends for Julia code compilation are compared: the existing PackageCompiler.jl and the new static compilation feature of Julia language, which is one of the major improvements in the upcoming Julia 1.12 release. The performance of the statically compiled JetReconstruction.jl is compared with both native Julia code and C++ FastJet.
Speaker: Mateusz Jakub Fila (CERN) -
82
Quantum & quantum-inspired optimization at high energy colliders
Future colliders such as the High Luminosity Large Hadron Collider and the Circular Electron Positron Collider will face an enormous increase in dataset size in the coming decades. Quantum and quantum-inspired algorithms may allow us to overcome some of these challenges. There is an important class of problems, the so-called combinatorial optimization problems. They are non-deterministic polynomial-time (NP) complete problems: no efficient algorithm is known to find the exact solution. They can be mapped to Ising problems, for which Ising machines can provide quasi-optimal answers in a reasonable amount of time. Quantum computers and algorithms could play important roles in solving these problems. One of the active application areas of quantum optimization in high energy physics is reconstruction. I will present recent progress in formulating HEP tasks (e.g. track and jet finding, etc.) as optimization problems to be solved by quantum or quantum-inspired algorithms.
Speaker: Hideki Okawa (Chinese Academy of Sciences (CN))
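To make the mapping concrete, a tiny combinatorial problem can be written as a QUBO energy and, at this size, solved by brute force; real applications hand the same matrix to a quantum annealer or an Ising machine. The matrix values below are illustrative.

```python
# Toy QUBO/Ising mapping solved by brute force (illustrative values only).
import itertools
import numpy as np

Q = np.array([[-1.0,  2.0,  0.0],
              [ 0.0, -1.0,  2.0],
              [ 0.0,  0.0, -1.0]])

def energy(bits):
    """QUBO energy x^T Q x for a binary assignment."""
    x = np.array(bits)
    return float(x @ Q @ x)

best = min(itertools.product([0, 1], repeat=Q.shape[0]), key=energy)
print("optimal assignment:", best, "energy:", energy(best))
```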
-
79
-
Track 2: Data Analysis - Algorithms and Tools ESA B
ESA B
Conveners: chair: Davide Valsecchi, co-chair: Daniel Murnane-
83
On focusing statistical power for searches and measurements in particle physics
Particle physics experiments rely on the (generalised) likelihood ratio test (LRT) for searches and measurements. This is not guaranteed to be optimal for composite hypothesis tests, as the Neyman-Pearson lemma pertains only to simple hypothesis tests. An improvement in the core statistical testing methodology would have widespread ramifications across experiments. We discuss an alternative test statistic that gives the data analyst the ability to focus the power of the test in physics-motivated regions of the parameter space. We demonstrate the improvement from this technique compared to the LRT on the $H \to \tau\tau$ HiggsML dataset simulated by the ATLAS experiment and a dark matter (WIMPs) dataset inspired by the LZ experiment. This technique can be coupled with neural simulation-based inference techniques to maximally leverage the information available in complex particle physics data. This technique also employs machine learning to perform the Neyman construction that is essential to ensure valid confidence intervals.
Speaker: Aishik Ghosh (University of California Irvine (US)) -
84
Parameter Estimation with Neural Simulation-Based Inference in ATLAS
Neural Simulation-Based Inference (NSBI) is a powerful class of machine learning (ML)-based methods for statistical inference that naturally handle high dimensional parameter estimation without the need to bin data into low-dimensional summary histograms. Such methods are promising for a range of measurements at the Large Hadron Collider, where no single observable may be optimal to scan over the entire theoretical phase space under consideration, or where binning data into histograms could result in a loss of sensitivity. This work develops an NSBI framework that, for the first time, allows NSBI to be applied to a full-scale LHC analysis, by successfully incorporating a large number of systematic uncertainties, quantifying the uncertainty coming from finite training statistics, developing a method to construct confidence intervals, and demonstrating a series of intermediate diagnostic checks that can be performed to validate the robustness of the method. As an example, the power and feasibility of the method are demonstrated for an off-shell Higgs boson couplings measurement in the four lepton decay channel, using ATLAS experiment simulated samples. The proposed method is a generalisation of the standard statistical framework at the LHC, and can benefit a large number of physics analyses. This work serves as a blueprint for measurements at the LHC using NSBI.
Speaker: R D Schaffer (Université Paris-Saclay (FR)) -
85
Modular Data-Driven Calibration and Analysis Correction in ALICE
We present a modular, data-driven framework for calibration and performance correction in the ALICE experiment. The method addresses time- and parameter-dependent effects in high-occupancy heavy-ion environments, where evolving detector conditions (e.g., occupancy and cluster overlaps, gain drift, space charge, dynamic distortions, and reconstruction or calibration deficiencies) require calibration techniques that go beyond static models and are difficult to reproduce accurately in Monte Carlo simulations of fundamental processes.
In contrast to traditional machine learning (ML) approaches that rely on large monolithic neural networks, our strategy is based on small, composable models — each representing a well-defined correction level. These models are interpretable, testable, and validated individually, allowing robust assembly into a global calibration pipeline. Many of these components are already in active use in ALICE reconstruction and calibration workflows.
Post-calibration corrections and MC/data mappings are performed using representative sampling, reweighting/remapping, and interactive multidimensional diagnostics. We use Python-based statistical libraries, RootInteractive for interactive QA and visualization of group-by statistics and parametric biases, and fastMCKalman to compute numerical derivatives of performance observables with respect to calibration parameters — enabling the propagation of imperfections and uncertainties, and optimization in high-dimensional spaces.
By combining modular ML techniques, effective modeling, and robust performance diagnostics, our framework offers a scalable and physically grounded alternative to end-to-end black-box models, enabling resilient calibration and analysis in evolving detector environments.
Speaker: Marian I Ivanov (GSI - Helmholtzzentrum fur Schwerionenforschung GmbH (DE)) -
86
Machine learning applications in the JUNO experiment
Jiangmen Underground Neutrino Observatory (JUNO) is a next generation 20-kton liquid scintillator detector under construction in southern China. It is designed to determine neutrino mass ordering via the measurement of reactor neutrino oscillation, and also to study other physics topics including atmospheric neutrinos, supernova neutrinos and more. The detector's large mass and high photosensor coverage provide an excellent scenario for the application of machine learning techniques. In this contribution, I present the recent progress of machine learning applications in JUNO, including event reconstruction and particle identification etc., which show great potential on enhancing the detector's performance for various physics topics.
Speaker: Hongyue Duyang (Shandong University)
-
Track 3: Computations in Theoretical Physics: Techniques and Methods ESA C
ESA C
Conveners: chair: Nina Elmer, co-chair: Chiara Signorile-
87
Version 5 of the FORM computer algebra system
For several decades, the FORM computer algebra system has been a crucial software package for the large-scale symbolic manipulations required by computations in theoretical high-energy physics. In this talk I will present version 5, which includes an updated built-in diagram generator, greatly improved polynomial arithmetic performance through an interface to FLINT, and enhanced capabilities for working with arbitrary-precision floating-point coefficients and evaluation of special functions.
Speaker: Joshua Davies (University of Liverpool) -
88
ggxy: NLO QCD corrections to loop induced $gg\to XY$ processes
We present the program package ${\tt ggxy}$, which in its first version can be used to calculate partonic and hadronic cross sections for Higgs boson pair production at NLO QCD. The 2-loop virtual amplitudes are implemented using analytical approximations in different kinematic regions, while all other parts of the calculation are exact. This implementation allows one to freely modify the masses of the top quark and the Higgs boson, as well as the renormalization scheme of the top-quark mass. Finally, we discuss the status of including other processes in our framework, such as $gg\to ZH$ or $gg\to ZZ$.
Speaker: Daniel Stremmer (KIT) -
89
Portable Parton-Level Event Generation for the High-Luminosity LHC
Significant computing resources are used for parton-level event generation for the Large Hadron Collider (LHC). The resource requirements of this part of the simulation toolchain are expected to grow further in the High-Luminosity (HL-LHC) era. At the same time, the rapid deployment of computing hardware different from the traditional CPU+RAM model in data centers around the world mandates a change in event generator design to provide sustainable simulations for the HL-LHC and future colliders.
We present the parton-level event generators Pepper and MadGraph4GPU, and discuss their performance and HPC scaling for providing expensive background samples at the LHC. We further showcase current developments, such as ML-based phase-space generation optimised with normalizing-flow models, higher-order calculations, and physics-driven improvements of the numerical stability in infrared limits.
Speaker: Enrico Bothmann (CERN)
-
18:30
Reception Ruderclub Favorite Hammonia
Ruderclub Favorite Hammonia
Alsterufer 9, 20354 Hamburg
-
08:00
-
-
Plenary ESA A
ESA A
Conveners: chair: Gang Chen, co-chair: Mikael Kuusela (Carnegie Mellon University (US))-
90
Updates from organizers
Speaker: Gregor Kasieczka (Hamburg University (DE))
- 91
-
92
DUNE: algorithmic and computing challenges
Speaker: Dr Michael Hudson Kirby (Brookhaven National Laboratory (US))
-
Poster session with coffee break: Group 1 ESA W 'West Wing'
ESA W 'West Wing'
-
Plenary ESA A
ESA A
Convener: chair: Doris Kim-
93
SONIC: A Portable framework for as-a-service ML serving
Speaker: Yuan-Tang Chou (University of Washington (US))
- 94
-
12:45
Conference Photo In front of the historic main building
In front of the historic main building
-
13:00
Lunch break ESA O "East Wing"
ESA O "East Wing"
Today's menu:
Starter:
- Antipasti - mushroom, eggplant, courgettes, carrots etc.
- Rocket salad with shaved Grana Padano cheese
served with different dressings
Main course:
- Chili con carne with minced beef
- Chili sin carne (vegan, gluten-free, lactose-free)
served with yoghurt / crème fraîche and party rolls from the bread basket -
Track 1: Computing Technology for Physics Research ESA M
ESA M
Conveners: chair: Nicholas Smith, co-chair: Sioni Summers-
95
Accelerating Deployment of FPGA-based AI in hls4ml with Parallel Synthesis through Model Partitioning
The increasing reliance on deep learning for high-energy physics applications demands efficient FPGA-based implementations. However, deploying complex neural networks on FPGAs is often constrained by limited hardware resources and prolonged synthesis times. Conventional monolithic implementations suffer from scalability bottlenecks, necessitating the adoption of modular and resource-aware design paradigms. hls4ml, an open-source tool developed to translate machine learning models into FPGA-compatible architectures, has been instrumental in this effort but still faces synthesis bottlenecks for large networks. To address this challenge, we introduce a novel partitioning methodology that integrates seamlessly with hls4ml, allowing users to segment neural networks at predefined layers. This approach facilitates parallel synthesis and enables stepwise optimization, thus improving both scalability and resource efficiency. The partitioned components are systematically reassembled into a unified architecture through an automated workflow leveraging AMD Vivado, ensuring functional correctness while minimizing manual intervention. An automated RTL-level testbench verifies system-wide correctness, eliminating manual validation steps and accelerating deployment. Experimental evaluations on convolutional neural networks, including ResNet20, demonstrate up to a 3.5× reduction in synthesis time, alongside enhanced debugging flexibility, thereby improving FPGA prototyping and deployment.
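As a rough illustration of the conversion step involved (a hedged sketch only: the two sub-models, FPGA part number and output directories are hypothetical, and the partitioning interface presented in this contribution is not shown here), converting separately defined partitions with hls4ml could look like:

import tensorflow as tf
import hls4ml

# Two hypothetical partitions of a larger network, split at a predefined layer boundary.
part_a = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32, 32, 3)),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
])
part_b = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(15, 15, 16)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Convert each partition independently; the partitions are then candidates for
# parallel synthesis, with reassembly handled by an automated Vivado workflow.
for name, sub_model in [("part_a", part_a), ("part_b", part_b)]:
    config = hls4ml.utils.config_from_keras_model(sub_model, granularity="name")
    hls_model = hls4ml.converters.convert_from_keras_model(
        sub_model,
        hls_config=config,
        output_dir=f"hls_{name}",
        part="xcvu13p-flga2577-2-e",  # example FPGA part, purely illustrative
    )
    hls_model.compile()  # hls_model.build(synth=True) would launch HLS synthesis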
Speaker: Dimitrios Danopoulos (CERN) -
96
End-to-end hardware-aware model compression and deployment with PQuant and hls4ml
Machine learning model compression techniques—such as pruning and quantization—are becoming increasingly important to optimize model execution, especially for resource-constrained devices. However, these techniques are developed independently of each other, and while there exist libraries that aim to unify these methods under a single interface, none of them offer integration with hardware deployment libraries such as hls4ml. To address this, we introduce PQuant, a Python library that simplifies the training and compression of machine learning models by providing an interface for applying a variety of pruning and quantization methods. PQuant is designed to be accessible to users without specialized knowledge of compression algorithms, while still offering deep configurability. It integrates with hls4ml, allowing compressed models to be directly utilized by FPGA-based accelerators. This makes it a valuable tool for both researchers comparing compression strategies and practitioners targeting efficient deployment on edge devices and custom hardware.
Speaker: Roope Oskari Niemi -
97
Track reconstruction on FPGA networks at high speed for high density conditions
With plans for upgrading the detector in order to collect data at a luminosity up to 1.5×10³⁴ cm⁻²s⁻¹ being ironed out (Upgrade II - LHC Run5), the LHCb Collaboration has sought to implement new data taking solutions already starting from the upcoming LHC Run4 (2030-2033).
The first stage of the LHCb High Level Trigger (HLT1), currently implemented on GPUs and aiming at reducing the event rate from 30MHz to 1MHz, relies on track reconstruction at the input rate. With the target of ensuring the continued ability of reconstructing tracks in real-time for all collision events at increased luminosity, a tracking system based on a cluster of interconnected FPGAs has been approved to be built for the upcoming Run4 as an R&D effort: the DoWnstream Tracker (DWT) [1, 2, 3].
By reconstructing track stubs in the SciFi subdetector, downstream to the magnet, the DWT will both provide immediate benefits, by speeding up HLT1 reconstruction, and also give the opportunity for building knowledge and experience on such a system, as an R&D effort itself, in view of the Upgrade II.
The DWT relies on the Artificial Retina architecture, which can be seen as a multidimensional Hough Transform, computed numerically starting from a set of reference tracks, instead of seeking the analytical solutions. Because of this, the computation time of this tracking architecture scales linearly with the number of hits in the detector, instead of their possible combination [4].
Hits are delivered only to the elemental processing units inside the FPGAs which reconstruct tracks compatible with coordinates of the hit in question. This is achieved through a network which implements a programmable switching system, trained for optimal performance [5].
In this contribution we will show how such a switch is implemented on FPGAs and how the per-event computation time can be kept constant at increasing luminosity by linearly enlarging the system itself, optimising the switch and introducing the concept of input matrices, thus making the architecture desirable for the future Upgrade II.
Speaker: Federico Lazzari (Universita di Pisa & INFN Pisa (IT)) -
98
Rapid ML inference in HEP using logic gate neural nets
Fast machine learning (ML) inference is of great interest in the HEP community, especially in low-latency environments like triggering. Faster inference often unlocks the use of more complex ML models that improve physics performance, while also enhancing productivity and sustainability. Logic gate networks (LGNs) currently achieve some of the fastest inference times for standard image classification tasks. In contrast to traditional neural networks, each node in an LGN consists of a learnable logic gate. While this makes training generally slow, at inference time the network is implicitly pruned and discretized. LGNs are excellent candidates for FPGA implementations as they consist of logic gates, but they are also suitable for GPUs. In this work, we present our implementation of logic gate convolutional neural nets. We apply them to open data for anomaly detection, similar to that used at the CMS Level-1 Trigger by the CICADA collaboration. We demonstrate that LGNs offer comparable physics performance to existing methods while promising a much faster inference speed. This opens the door to broader applications of LGNs in fast ML workflows across HEP.
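As a generic illustration of the idea (not the implementation used in this work), a single learnable two-input gate can be written as a softmax mixture over the 16 Boolean operations, relaxed to real-valued inputs:

import torch
import torch.nn as nn

class SoftLogicGate(nn.Module):
    """One learnable two-input logic gate: a softmax mixture over the 16 Boolean
    operations, evaluated on relaxed (probabilistic) inputs a, b in [0, 1]."""
    def __init__(self):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(16))  # one weight per Boolean operation

    def forward(self, a, b):
        # Real-valued relaxations of the 16 two-input Boolean functions.
        ops = torch.stack([
            torch.zeros_like(a), a * b, a - a * b, a,
            b - a * b, b, a + b - 2 * a * b, a + b - a * b,
            1 - (a + b - a * b), 1 - (a + b - 2 * a * b), 1 - b, 1 - b + a * b,
            1 - a, 1 - a + a * b, 1 - a * b, torch.ones_like(a),
        ], dim=-1)
        w = torch.softmax(self.logits, dim=0)  # training: soft, differentiable mixture
        # At inference one would keep only w.argmax(), yielding a pruned, discretized gate.
        return (ops * w).sum(dim=-1)

gate = SoftLogicGate()
out = gate(torch.rand(4), torch.rand(4))  # toy usage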
Speaker: Liv Helen Vage (Princeton University (US)) -
99
SuperSONIC: Cloud-Native Infrastructure for ML Inferencing
The rising computational demands of growing data rates and complex machine learning (ML) algorithms in large-scale scientific experiments have driven the adoption of the Services for Optimized Network Inference on Coprocessors (SONIC) framework. SONIC accelerates ML inference by offloading tasks to local or remote coprocessors, optimizing resource utilization. Its portability across diverse coprocessors enhances data processing and model deployment efficiency for advanced research in high-energy physics (HEP) and multi-messenger astrophysics (MMA). We developed SuperSONIC, a scalable server infrastructure for SONIC, enabling the deployment of computationally intensive tasks on Kubernetes clusters equipped with graphics processing units (GPUs). Leveraging NVIDIA’s Triton Inference Server, SuperSONIC decouples client workflows from server infrastructure, standardizing communication, improving throughput, and enabling robust load balancing and monitoring. Successfully deployed for the CMS and ATLAS experiments at CERN’s Large Hadron Collider, the IceCube Neutrino Observatory, and the LIGO gravitational-wave observatory, SuperSONIC has been tested on Kubernetes clusters at Purdue University, the National Research Platform, and the University of Chicago. SuperSONIC provides a reusable, configurable framework that addresses Cloud-native challenges, enhancing accelerator-based inference efficiency across diverse scientific and industrial applications.
Speaker: Yuan-Tang Chou (University of Washington (US))
-
Track 2: Data Analysis - Algorithms and Tools ESA B
ESA B
Conveners: chair: Luisa Lucie-Smith, co-chair: Louis Moureaux-
100
Reconstructing tau leptons with a cross-task, cross-detector foundation model
The application of foundation models in high-energy physics has recently been proposed as a way to use large unlabeled datasets to efficiently train powerful task-specific models. The aim is to train a task-agnostic model on an existing large dataset such that the learned representation can later be utilized for subsequent downstream physics tasks.
The pretrained model can reduce the training dataset size needed in the fine-tuning phase to reach the same performance as the models trained from scratch. We present the first results of out-of-context and out-of-domain foundation model training for hadronically decaying tau lepton reconstruction and show that the representation learned during pretraining can successfully be utilized for this multi-task reconstruction problem.
Speaker: Laurits Tani (National Institute of Chemical Physics and Biophysics (EE)) -
101
OmniJet-alpha: foundation model updates
OmniJet-alpha, the first cross-task foundation model for particle physics, was first presented at ACAT 2024. In its base configuration, OmniJet-alpha is capable of transfer learning between an unsupervised problem (jet generation) and a classic supervised task (jet tagging). Since its release, we have also shown that it can successfully transfer from CMS Open Data to simulation, and even generate calorimeter showers. This talk will give an overview of the model, and present the latest developments and results.
Speaker: Anna Hallin (University of Hamburg) -
102
Multi-Modal track reconstruction using Graph Neural Networks at Belle II
Large backgrounds and detector aging impact the track finding in the Belle II central drift chamber, reducing both purity and efficiency in events. This necessitates the development of new track algorithms to mitigate detector performance degradation. Building on our previous success with an end-to-end multi-track reconstruction algorithm for the Belle II experiment at the SuperKEKB collider (arXiv:2411.13596), we have extended the algorithm to incorporate inputs from both the drift chamber and the silicon vertex tracking detector, creating a multi-modal network. We employ graph neural networks to handle the irregular detector structure and object condensation to address the unknown, varying number of particles in each event. This approach simultaneously identifies all tracks in an event and determines their respective parameters.
We have fully integrated this algorithm into the Belle II analysis software framework. Utilizing a realistic full detector simulation, which includes beam-induced backgrounds and detector noise derived from actual collision data, we report the performance of our track-finding algorithm across various event topologies compared to the existing baseline algorithm used in Belle II.
Speaker: Lea Reuter (Karlsruhe Institute of Technology) -
103
CMS FlashSim: end-to-end simulation with ML
Detailed event simulation at the LHC takes a large fraction of the computing budget. CMS has developed an end-to-end ML-based simulation that can speed up the production of analysis samples by several orders of magnitude with a limited loss of accuracy. As the CMS experiment is adopting a common analysis-level format, the NANOAOD, for a larger number of analyses, such an event representation is used as the target of this ultra-fast simulation that we call FlashSim. Generator-level events, from PYTHIA or other generators, are directly translated into NANOAOD events at several hundred Hz with FlashSim. We show how training FlashSim on a limited number of full-simulation events is sufficient to achieve very good accuracy on larger datasets for processes not seen at training time. Comparisons with full-simulation samples in some simplified benchmark analyses are also shown. With this work, we aim at establishing a new paradigm for LHC collision simulation workflows in view of the HL-LHC.
Speakers: CMS Collaboration, Filippo Cattafesta (Scuola Normale Superiore & INFN Pisa (IT)) -
104
Computing the Matrix Element Method with generative machine learning
The Matrix Element Method (MEM) offers optimal statistical power for hypothesis testing in particle physics, but its application is hindered by the computationally intensive multi-dimensional integrals required to model detector effects. We present a novel approach that addresses this challenge by employing Transformers and generative machine learning (ML) models. Specifically, we utilize ML surrogates to efficiently sample parton-level events for numerical integration and to accurately encode the complex transfer functions describing detector reconstruction. We demonstrate this technique on the challenging ttH(bb) process in the semileptonic channel using the full CMS detector simulation. Furthermore, we extend the method to multiple processes relevant for double Higgs searches in the bbWW channel, constructing a comprehensive ML-based reconstruction surrogate across the entire analysis phase space. This advancement enables fully unbinned likelihood estimations of double Higgs Effective Field Theory (EFT) couplings directly from experimental data, with the potential of significantly enhancing sensitivity to new physics.
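For context, the MEM weight that such surrogates help to evaluate can be written schematically as (notation illustrative, not taken from the contribution)
\[
  P(x \mid \alpha) \;=\; \frac{1}{\sigma_{\alpha}} \int \mathrm{d}\Phi(y)\; \bigl|\mathcal{M}_{\alpha}(y)\bigr|^{2}\; W(x \mid y),
\]
where $y$ runs over parton-level configurations, $W(x\mid y)$ is the transfer function describing detector reconstruction, and $\sigma_{\alpha}$ normalizes the probability; here the generative models sample $y$ for the numerical integration and encode $W$.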
Speakers: CMS Collaboration, Dr Florian Bury (University of Bristol)
-
Track 3: Computations in Theoretical Physics: Techniques and Methods ESA C
ESA C
Conveners: chair: Tianji Cai, co-chair: Theo Heimel-
105
Towards differentiable Jet Clustering
Physics programs at future colliders cover a wide range of diverse topics and set high demands for precise event reconstruction. Recent analyses have stressed the importance of accurate jet clustering in events with low boost and high jet multiplicity. This contribution presents how machine learning can be applied to jet clustering while taking desired properties such as infrared and collinear safety into account. In our contribution, we benchmark a score-based ML model on ZHH ($HH \rightarrow b\bar{b}b\bar{b}$) events, which are important for studies of the Higgs self-interaction. The results are compared with conventional algorithms such as Durham jet clustering in terms of efficiency/purity and key physics metrics such as the dijet mass resolution, showing competitive performance. Furthermore, we present a possible extension for a fully differentiable model based on the Gumbel softmax.
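The Gumbel-softmax relaxation referred to above is available directly in PyTorch; a generic usage sketch (not the model of this contribution) is:

import torch
import torch.nn.functional as F

logits = torch.randn(8, 4, requires_grad=True)  # e.g. scores for assigning 8 particles to 4 jets
# Soft, differentiable samples during training; hard=True returns one-hot assignments
# in the forward pass while keeping gradients via the straight-through estimator.
assignments = F.gumbel_softmax(logits, tau=0.5, hard=True, dim=-1)
loss = assignments.sum()  # placeholder for a clustering objective
loss.backward()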
Speaker: Bryan Bliewert (Deutsches Elektronen-Synchrotron (DE)) -
106
Towards AI-assisted particle theory design
The process of neutrino model building using flavor symmetries requires a physicist to select a group, determine field content, assign representations, construct the Lagrangian, calculate the mass matrices, and perform statistical fits of the resulting free parameters. This process is constrained by the physicist's time and their intuition regarding mathematically complex groups, developed over years of experience. We develop an Autonomous Model Builder (AMBer), capable of performing all of these steps and finding elegant neutrino models using Reinforcement Learning (RL). AMBer is able to minimize the free parameters in the theory while maximizing compatibility with experimental observations. With AMBer, we can explore new groups for neutrino model building that have not been previously considered, and for which neutrino physicists have yet to build intuition, in a fraction of the time. To make this computationally scalable, we re-designed a physics software package and trained AMBer on a supercomputer. We also design visualization tools to understand how AMBer explores the theory space. This serves as a blueprint for scaling AMBer to even more complex theory spaces and using more sophisticated physics software in the future.
Speaker: Aishik Ghosh (University of California Irvine (US)) -
107
Scaling laws for amplitude surrogates
Fast and precise evaluation of scattering amplitudes, even in the case of precision calculations, is essential for event generation tools at the HL-LHC. We explore the scaling behavior of the achievable precision of neural networks in this regression problem for multiple architectures, including a Lorentz-symmetry-aware multilayer perceptron and the L-GATr architecture. L-GATr is equivariant with respect to the Lorentz group through its internal embedding living in the geometric algebra defined by the flat space-time metric. This study addresses in particular the scaling behavior of uncertainty estimations using state-of-the-art methods.
Speaker: Joaquin Iturriza Ramirez (Centre National de la Recherche Scientifique (FR)) -
108
Understandable ML-taggers
Modern ML-based taggers have become the gold standard at the LHC, outperforming classical algorithms. Beyond pure efficiency, we also seek controllable and interpretable algorithms. We explore how we can move beyond black-box performance and toward physically meaningful understanding of modern taggers. Using explainable AI methods, we can connect tagger outputs with well-known physics observables. This allows us to describe the full discriminative power of a modern ML-based tagger with only three learned observables.
Speaker: Sophia Vent
-
Poster session with coffee break: Group 1 ESA W 'West Wing'
ESA W 'West Wing'
-
Track 1: Computing Technology for Physics Research ESA M
ESA M
Conveners: chair: Sioni Summers, co-chair: Nicholas Smith-
109
Dynamic Control of Detectors, Polarized Beams, and Polarized Targets Using AI/ML
Stable operation of detectors, beams, and targets is crucial for reducing systematic errors and achieving high-precision measurements in accelerator-based experiments. Historically, this stability was achieved through extensive post-acquisition calibration and systematic data studies, as not all operational parameters could be precisely controlled in real time. However, recent advances in AI/ML are transforming this landscape by enabling dynamic, in-situ adjustments of equipment parameters during data acquisition.
At Jefferson Lab, a two-phase AI/ML program has been initiated to address these challenges. The first phase successfully deployed a system that dynamically adjusts the high voltage of a gaseous drift chamber detector to stabilize its gain—a solution that has been in production use for over a year. Building on this success, the second phase is now underway. This effort focuses on automating the continuous operation of a linearly polarized photon beam and the extraction of the polarization of a cryotarget used in fixed-target nuclear physics experiments.
A key innovation across both phases is the integration of uncertainty quantification within the AI/ML models, which provides not only accurate predictions but also confidence estimates that are critical for near real-time decision making and control. This presentation will highlight achievements from the drift chamber production system and discuss the current progress on automating beam and target operations, outlining the methodologies, challenges, and safety protocols that ensure robust performance in a dynamic experimental environment.
Speaker: David Lawrence -
110
Optimisation of the Accelerator Control by Reinforcement Learning: A Simulation-Based Approach
Optimizing control systems in particle accelerators presents significant challenges, often requiring extensive manual effort and expert knowledge. Traditional tuning methods are time-consuming and may struggle to navigate the complexity of modern beamline architectures. To address these challenges, we introduce a simulation-based framework that leverages Reinforcement Learning (RL) to enhance the control and optimization of beam transport systems. Built on top of the Elegant simulation engine, our Python-based platform automates the generation of simulations and transforms accelerator tuning tasks into RL environments with minimal user intervention. The framework features a modified Soft Actor-Critic agent enhanced with curriculum learning, enabling robust performance across a variety of beamline configurations. Designed with accessibility and flexibility in mind, the system can be deployed by non-experts and adapted to optimize virtually any beamline. Early results demonstrate successful application across multiple simulated beamlines, validating the approach and offering promising potential for broader adoption. We continue to refine the framework toward a general-purpose solution—one that can serve both as an intelligent co-pilot for physicists and a testbed for RL researchers developing new algorithms. This work highlights the growing synergy between AI and accelerator physics, and the critical role of computational innovation in advancing experimental capabilities.
Speakers: Anwar Ibrahim, Prof. Fedor Ratnikov -
111
Towards a Natural Language User Experience of the ATLAS Technical Coordination Expert System Utilizing Large Language Models
The ATLAS detector at CERN and its supporting infrastructure form a highly complex system. It covers numerous interdependent sub-systems and requires collaboration across a team of multi-disciplinary experts. The ATLAS Technical Coordination Expert System provides an interactive description of the technical infrastructure and enhances its understanding. It features tools to assess the impact of interventions and document expert knowledge. However, the detector’s complexity is inevitably reflected in the expert system. This can complicate information retrieval and can diminish the user experience.
This submission presents a proposal to improve the Expert System's user experience with natural language-based interfaces. A large language model is used to interpret user queries and return the responses as written text. This paper discusses the key aspects of the system's architecture and model selection, the process for building a domain-specific knowledge base, and the evaluation of the improved user experience from an initial pilot. The pilot is based on the results of a survey conducted across a range of users of the tool, which defined the expected interactions with the new system. The pilot interface is evaluated based on factual correctness, task success and task execution time. It is expected that this approach will improve the tool's efficiency for regular users and ultimately increase its usage and impact within the ATLAS collaboration.
Speaker: Gustavo Uribe (Universidad Antonio Narino (CO)) -
112
Using Large Language Models to Accelerate Access to Physics
In high energy physics, most AI/ML efforts focus on improving the scientific process itself — modeling, classification, reconstruction, and simulation. In contrast, we explore how Large Language Models (LLMs) can accelerate access to the physics by assisting with the broader ecosystem of work that surrounds and enables scientific discovery. This includes understanding complex documentation, navigating experimental frameworks, writing and debugging code, generating visualizations, and interacting with APIs and databases.
We present an evolving suite of tools that apply LLMs in support roles across this ecosystem. Building on prior work using Retrieval-Augmented Generation (RAG) to query the large particle physics scientific collections, we now incorporate advances such as the Model Context Protocol for better grounding, agent-based orchestration for multi-step reasoning and task execution, entity scanning for rapid information extraction, and live code generation tailored to experimental workflows. Our LLM-driven systems no longer operate as isolated tools but as participants in an integrated physics infrastructure.
We report on system design, model performance, and real-world use cases, and we outline open challenges around evaluation and reliability. Many pieces and communication APIs are being standardized, but a great deal of work remains to make these useful in particle physics. By focusing on how LLMs can streamline the path to doing physics, this work explores the space of AI assistance.
Speaker: Gordon Watts (University of Washington (US))
-
Track 2: Data Analysis - Algorithms and Tools ESA B
ESA B
Conveners: chair: Frank Gaede, co-chair: Luisa Lucie-Smith-
113
Interaction-Aware and Domain-Invariant Representation Learning for Inclusive Flavour Tagging
Measurements of neutral, oscillating mesons are a gateway to quantum mechanics and give access to the fundamental interactions of elementary particles. For example, precise measurements of $CP$ violation in neutral $B$ mesons can be taken in order to test the Standard Model of particle physics. These measurements require knowledge of the $B$-meson flavour at the time of its production, which cannot be inferred from its observed decay products. Therefore, multiple collider experiments employ machine learning-based algorithms, so-called flavour taggers, to exploit particles that are produced in the proton-proton interaction and are associated with the signal $B$ meson to predict the initial $B$ flavour. A state-of-the-art approach to flavour tagging is the inclusive evaluation of all reconstructed tracks from the proton-proton interaction using a Deep Set neural network.
Flavour taggers are desired to achieve optimal performance for data recorded from proton-proton interactions while being trained with a labelled data sample, i.e., with Monte Carlo simulations. However, the limited knowledge of QCD processes introduces inherent differences between simulation and recorded data, especially in the quark-fragmentation processes that are relevant for flavour tagging. Existing flavour taggers neither model these differences nor model interactions between tracks explicitly, putting them at risk of overfitting to simulations, of not providing optimal performance for physics analyses, and of requiring a careful calibration on data.
We present an inclusive flavour tagger that builds on set transformers (to model particle interactions via set attention) and on domain-adversarial training (to mitigate differences between data sources). These foundations allow the tagger to learn intermediate data representations that are both interaction-aware and domain-invariant, i.e., they capture the interactions between tracks and do not allow for an overfitting to simulations. In our benchmark, we increase the statistical power of flavour-tagged samples by 10% with respect to the usage of deep sets, thus demonstrating the value of interaction-aware and domain-invariant representation learning.
Speaker: Quentin Führing (Technische Universitaet Dortmund (DE), University of Cambridge (UK)) -
114
How to make any Network Lorentz-Equivariant
We construct Lorentz-equivariant transformer and graph networks using the concept of local canonicalization. While many Lorentz-equivariant architectures use specialized layers, this approach allows one to take any existing non-equivariant architecture and make it Lorentz-equivariant using transformations with equivariantly predicted local frames. In addition, data augmentation emerges as a special case of this approach, allowing us to directly compare data augmentation with equivariant models. We use the task-specific non-equivariant architectures in amplitude regression and jet tagging to benchmark local canonicalization.
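Schematically, local canonicalization wraps a non-equivariant backbone $\phi$ as (illustrative notation, not the exact construction of the talk)
\[
  f(x) \;=\; \rho\bigl(g(x)\bigr)\,\phi\bigl(g(x)^{-1}\cdot x\bigr),
  \qquad g(\Lambda x) = \Lambda\,g(x),
\]
where $g(x)$ is an equivariantly predicted local frame, so that $f(\Lambda x) = \rho(\Lambda)\, f(x)$ for any Lorentz transformation $\Lambda$.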
Speaker: Sebastian Pitz (ITP, Heidelberg University) -
115
Hyperparameter Transfer for Graph Transformers
Modern machine learning (ML) algorithms are sensitive to the specification of non-trainable parameters called hyperparameters (e.g., learning rate or weight decay). Without guiding principles, hyperparameter optimization is the computationally expensive process of sweeping over various model sizes and, at each, re-training the model over a grid of hyperparameter settings. However, recent progress from the ML theory community has given a prescription for scaling hyperparameters with respect to model size such that (1) the optimal hyperparameters identified for small models of a fixed architecture are the same for their larger counterparts (hyperparameter transfer) and (2) larger models perform better than their smaller counterparts (limiting behavior). When satisfied, these desiderata yield large computational savings and stable performance useful for computing, for example, neural scaling laws. In this talk, we will present a recipe for achieving hyperparameter transfer and limiting behavior in graph transformers, transformer variants combining simple message passing with sparse attention computed over the edges of each input graph. Though relatively new, graph transformers have been shown to outperform simple GNNs and transformers on a variety of benchmark tasks, and have particular relevance to scientific datasets where edges may encode known physical interactions and measurements. We will demonstrate the promise of these principled graph transformers on benchmark datasets and encourage discussion about how these results may be extended to tackle more challenging scenarios in particle physics.
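One widely used realization of such a prescription is the maximal-update parametrization (µP); very roughly, for Adam-trained hidden layers of width $n$ tuned at a proxy width $n_{\text{base}}$ (this is background only, details differ for input layers, attention logits and SGD, and it is not necessarily the recipe presented in the talk),
\[
  \eta_{\text{hidden}}(n) \;=\; \eta_{\text{base}}\,\frac{n_{\text{base}}}{n},
  \qquad
  W_{\text{out}} \;\mapsto\; \frac{n_{\text{base}}}{n}\,W_{\text{out}},
\]
so that hyperparameters optimized for the small proxy model remain near-optimal as the width grows.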
Speaker: Gage DeZoort (Princeton University (US)) -
116
Evaluating Two-Sample Tests for Validating Generators in Precision Sciences
Deep generative models have become powerful tools for alleviating the computational burden of traditional Monte Carlo generators in producing high-dimensional synthetic data. However, validating these models remains challenging, especially in scientific domains requiring high precision, such as particle physics. Two-sample hypothesis testing offers a principled framework to address this task. We propose a robust methodology to assess the performance and computational efficiency of various metrics for two-sample testing, with a focus on high-dimensional datasets. Our study examines tests based on univariate integral probability measures, namely the sliced Wasserstein distance, the mean of the Kolmogorov-Smirnov statistics, and the sliced Kolmogorov-Smirnov statistic. Additionally, we consider the unbiased Fréchet Gaussian Distance and the Maximum Mean Discrepancy. Finally, we include the New Physics Learning Machine, an efficient classifier-based test leveraging kernel methods. Experiments on both synthetic and realistic data show that one-dimensional projection-based tests demonstrate good sensitivity with a low computational cost. In contrast, the classifier-based test offers higher sensitivity at the expense of greater computational demands.
This analysis provides valuable guidance for selecting the appropriate approach—whether prioritizing efficiency or accuracy. More broadly, our methodology provides a standardized and efficient framework for model comparison and serves as a benchmark for evaluating other two-sample tests.
Speaker: Samuele Grossi (Università degli studi di Genova & INFN sezione di Genova)
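As a minimal illustration of the projection-based tests discussed in the contribution above (sliced Kolmogorov-Smirnov over random one-dimensional projections; a generic sketch, not the authors' code):

import numpy as np
from scipy.stats import ks_2samp

def sliced_ks(x, y, n_proj=128, seed=0):
    """Average two-sample KS statistic over random 1D projections of x and y."""
    rng = np.random.default_rng(seed)
    dim = x.shape[1]
    stats = []
    for _ in range(n_proj):
        v = rng.normal(size=dim)
        v /= np.linalg.norm(v)
        stats.append(ks_2samp(x @ v, y @ v).statistic)
    return float(np.mean(stats))

# Toy usage: two 10-dimensional samples with a small shift.
x = np.random.default_rng(1).normal(size=(5000, 10))
y = np.random.default_rng(2).normal(loc=0.05, size=(5000, 10))
print(sliced_ks(x, y))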
-
Track 3: Computations in Theoretical Physics: Techniques and Methods ESA C
ESA C
Conveners: chair: Chiara Signorile, co-chair: Ramon Winterhalder-
117
Scattering Amplitudes: A New Playground for Machine Learning
AI for fundamental physics is now a burgeoning field, with numerous efforts pushing the boundaries of experimental and theoretical physics, as well as machine learning research itself. In this talk, I will introduce a recent innovative application of Natural Language Processing to state-of-the-art precision calculations in high energy particle physics. Specifically, we use Transformers to predict symbolic mathematical expressions that represent scattering amplitudes in planar N=4 Super Yang-Mills theory—a quantum field theory closely related to real-world QCD at the Large Hadron Collider. Our first results have demonstrated great promise of Transformers for amplitude calculations, while the major challenges are being addressed by ongoing work. This study opens the door to an exciting new scientific paradigm where discoveries and human insights are inspired and aided by an AI agent.
Speaker: Tianji Cai -
118
Calibrating ATLAS calorimeter signals using an uncertainty-aware precision network
ATLAS explores modern neural networks for a multi-dimensional calibration of its calorimeter signal defined by clusters of topologically connected cells (topo-clusters). The Bayesian neural network (BNN) approach yields a continuous and smooth calibration function, including uncertainties on the calibrated energy per topo-cluster. In this talk the performance of this BNN-derived calibration is compared to an earlier calibration network and standard table-lookup-based calibrations. The BNN uncertainties are confirmed using repulsive ensembles and validated through the pull distributions. First results indicate that unexpectedly large learned uncertainties can be linked to particular detector regions.
Speaker: Lorenz Vogel (ITP, Heidelberg University) -
119
Learning Reliable Uncertainties - The Return of the Ensemble
Neural networks for LHC physics must be accurate, reliable, and well-controlled. This requires them to provide both precise predictions and reliable quantification of uncertainties - including those arising from the network itself or the training data. Bayesian networks or (repulsive) ensembles provide frameworks that enable learning systematic and statistical uncertainties. We investigate different aspects of repulsive ensembles: the dependence on the repulsive kernel, biases for small training datasets, and systematic uncertainty estimation for the overall ensemble.
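For context, a repulsive ensemble updates each member $\theta_i$ with a kernelized repulsion term, schematically (notation illustrative)
\[
  \theta_i \;\leftarrow\; \theta_i + \epsilon\left[\nabla_{\theta_i}\log p\bigl(\theta_i\mid\mathcal{D}\bigr) \;-\; \frac{\sum_{j}\nabla_{\theta_i} k\bigl(\theta_i,\theta_j\bigr)}{\sum_{j} k\bigl(\theta_i,\theta_j\bigr)}\right],
\]
where the kernel $k$ keeps the members from collapsing onto a single mode, so the ensemble spread can be interpreted as an uncertainty estimate.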
Speaker: Nina Elmer (Heidelberg University) -
120
BAyesian Neural Network: an Atomic and Nuclear Emulator (BANNANE)
One of the main goals of theoretical nuclear physics is to provide a first-principles description of the atomic nucleus, starting from interactions between nucleons (protons and neutrons). Although exciting progress has been made in recent years thanks to the development of many-body methods and nucleon-nucleon interactions derived from chiral effective field theory, performing accurate many-body calculations with quantifiable uncertainties remains a major challenge.
To address these problems, we use ab initio many-body calculations in combination with a hierarchical Bayesian neural network to develop emulators that accurately predict nuclear properties, vastly reducing computational time and enabling robust uncertainty quantification.
As a benchmark for our developments, we present results on the ground-state properties of complex nuclei, surpassing traditional surrogates where data is available and enabling predictions of nuclear properties and nuclear matter at extreme proton-to-neutron ratios, where experiments are expected in the coming years.
Speaker: Jose Miguel Munoz Arias (MIT)
-
-
-
Plenary ESA A
ESA A
Convener: chair: Daniel Maitre-
121
Updates from organizers
Speaker: Gregor Kasieczka (Hamburg University (DE))
- 122
-
123
Optimising the CMS GPU Reconstruction: scheduling, efficiency, stability
Speaker: Dr Andrea Bocci (CERN)
-
124
The FAIR Universe Project
Speaker: Sascha Diefenbacher (Lawrence Berkeley National Lab. (US))
-
Poster session with coffee break: Group 2 ESA W 'West Wing'
ESA W 'West Wing'
-
125
A Library for ML-based Fast Calorimeter Shower Simulation at Future Collider Experiments and Beyond
Simulation plays an essential role in modern high energy physics experiments. However, the simulation of particle showers in the calorimeter systems of detectors with traditional Monte Carlo procedures represents a major computational bottleneck, and this subdetector system has long been the focus of fast simulation efforts. More recently, approaches based on deep generative models have shown promise in providing accurate surrogate simulators while delivering significant reductions in compute times.
While a broad range of generative models have been applied to this task in the literature, significantly less attention has been given to incorporating them into the existing software ecosystems. While this is essential for a model to eventually be deployed in a production environment, it also provides a means of evaluating the physics performance of a model after reconstruction. Such a development therefore provides access to a new suite of metrics, which ultimately determine a model’s suitability as a fast simulation tool.
In this contribution we describe DDFastShowerML, a library now available in Key4hep. This generic library provides a means of combining inference of generative models trained to simulate calorimeter showers with the DD4hep toolkit, using the fast simulation hooks that exist in Geant4. This allows a seamless combination of full and fast simulation, making it possible to run fast ML inference in the full simulation of experiments with detector geometries featuring realistic levels of detail, followed by standard reconstruction algorithms. This makes it possible to benchmark generative models with realistic physics analyses and is a prerequisite for eventually using them in an experiment’s Monte Carlo production chain. The flexibility of the library will be demonstrated through examples of different models that have been integrated, and different detector geometries that have been studied. A summary of future development directions will also be given.
Speaker: Peter McKeown (CERN) -
126
Abstracting heterogeneous resources in the ALICE Grid
With the emergence of increasingly complex workflows and data rates, accelerators have gained importance within ALICE and the Worldwide LHC Computing Grid (WLCG). Consequently, support for GPUs was added to JAliEn, the ALICE Grid middleware, in a transparent manner to automatically use these resources when available -- without breaking existing mechanisms for payload isolation and compatibility.
The above support has up to now been limited to the ALICE Event Processing Nodes (EPNs), as driver restrictions and hardware variations may prevent the pilot from enabling GPU support when execution environments stray too far from the current norm. Furthermore, even when enabled, each Grid payload is ultimately tailored to a specific GPU model, and necessitates additional optimizations when deployed on a different one. With the ever increasing amounts of data, and HL-LHC on the horizon, being able to offload GPU workflows to additional clusters in the ALICE Grid becomes a priority.
This contribution examines how GPU support can be extended to other computing sites in the ALICE Grid, such as the Perlmutter HPC at NERSC, in the context of being able to run ALICE reconstruction workflows -- expanding support beyond the existing ALICE EPN cluster.
Speaker: Maksim Melnik Storetvedt (Western Norway University of Applied Sciences (NO)) -
127
AI-assisted analysis to enhance discovery potential in High-Energy Physics
Unsupervised anomaly detection has become a pivotal technique for model-independent searches for new physics at the LHC. In high-energy physics (HEP), anomaly detection is employed to identify rare, outlier events in collision data that deviate significantly from expected distributions. A promising approach is the application of generative machine learning models, which can efficiently detect such deviations without requiring labeled data.
In this study, we develop a Transformer-based reconstruction model, trained exclusively on Standard Model (SM) background data, to identify events that exhibit significant deviations. The method is applied to ATLAS Open Data from Run 2 (2015–2016), focusing on the identification of rare and potential Beyond the Standard Model (BSM) processes. Our architecture utilizes a modified Transformer, optimized to handle high-dimensional tabular input, comprising low-level physics observables, such as jet kinematics, lepton and photon energy, MET, electromagnetic and hadronic calorimeter energy deposits, as well as event topology variables.
The Transformer model is trained to learn the inherent patterns in SM background data, effectively modeling the normal event distributions. We use a Tab-Transformer with weighted loss, which captures the intricate relationships within the background data. When the trained model is tested on rare and BSM Monte Carlo (MC) samples (e.g., SUSY, Exotic), it exhibits excellent reconstruction performance for background events while generating large reconstruction losses for anomalous events. This ability to identify outliers is crucial for anomaly detection in HEP.
Compared to conventional Variational Autoencoders (VAEs), our Transformer-based architecture demonstrates superior background modeling, with enhanced sensitivity to anomalies. The method operates directly on low-level physics observables, making it highly interpretable and scalable. Additionally, it allows for searches in pre-selection regions without introducing biases from selection cuts, offering a more flexible approach to identifying new physics. We are also planning to extend the analysis using more detector-level observables to further improve the sensitivity and scalability of the method.
Speaker: Asrith Krishna Radhakrishnan (Universita e INFN, Bologna (IT)) -
128
An nginx-based Content Distribution Network for HEP
With the move to HTTP/WebDAV and JSON Web Tokens as a standard protocol for transfers within the WLCG distributed storage network, a large amount of off-the-shelf technologies become viable for meeting the requirements of a Storage Element (SE). In this work, we explore the capabilities and performance of the OpenResty framework, which extends the nginx server with the LuaJIT scripting language, to recreate the feature set of a SE. We demonstrate token-authenticated HTTP read, write, and WebDAV third-party copy features, as well as a storage federation with HTTP redirect, proxy, and caching capabilities. We further explore the performance scaling in terms of throughput and requests per second.
Speaker: Nick Smith (Fermi National Accelerator Lab. (US)) -
129
AUDITOR - the accounting ecosystem for HL-LHC and other accounting challenges
In the field of High Throughput Computing (HTC), the management and processing of large volumes of accounting data across different environments and use cases is a significant challenge. AUDITOR addresses this issue by providing a flexible framework for building accounting pipelines that can be adapted to a wide range of needs.
At its core, AUDITOR serves as a centralised storage solution for accounting records, facilitating data exchange through a REST interface. This enables seamless interaction with the other parts of the AUDITOR ecosystem: the collectors, which gather accounting data from various sources and push it to AUDITOR, and the plugins, which pull data from AUDITOR for subsequent processing. The modular nature of AUDITOR allows for the customisation of collectors and plugins to suit specific use cases and environments, ensuring a tailored approach to accounting data management. Future use cases that could be realised with AUDITOR include the accounting of GPU resources, the accounting of variable core power of compute nodes due to dynamic adjustments of the CPU clock frequency, and the evaluation of the resulting CO2 footprint.
This presentation will outline the structure of the AUDITOR accounting ecosystem, demonstrate existing accounting pipelines, and show how AUDITOR could be extended to account for environmentally sustainable computing resources.
Speaker: Raghuvar Vijayakumar (University of Freiburg (DE)) -
130
Building a High-Availability and User-Friendly Neutron Scattering Data Computing Infrastructure
To address the urgent need for efficient data analysis platforms in the neutron scattering field, this report presents a cloud-based computing infrastructure built on OpenStack and WebRTC. On top of this infrastructure, a deeply integrated system for data management and storage is constructed to provide researchers with a one-stop analysis platform that integrates data processing, tool invocation, and resource scheduling.
To address the pain points in traditional data analysis workflows of the China Spallation Neutron Source (CSNS), such as decentralized data management, complex analysis workflows, and inadequate system scalability, the platform builds a high-availability computing cluster using OpenStack. Through standardized images pre-installed with professional tools (e.g., Mantid, SasView), it enables dynamic creation and intelligent scheduling of computing resources. The platform integrates unified authentication and establishes a user permission management mechanism to support data mounting strategies and secure access control based on user roles.
At the interaction layer, to solve issues such as high-latency interaction and the lack of functional components (e.g., file transfer, clipboard sharing) in traditional NoVNC remote desktops, the real-time communication technology WebRTC is introduced. This technology allows users to access interactive analysis environments directly via browsers without installing clients. Relying on its native low-latency data transmission protocols and adaptive encoding technology, it significantly improves the smoothness of remote desktop operations while enabling enhanced functions such as high-resolution display and cross-platform copy-paste, creating an immersive user experience similar to a local environment.
The platform deeply integrates core functional modules including elastic scheduling of computing resources, data security isolation, and seamless browser-based access, effectively resolving the efficiency bottlenecks of traditional analysis workflows. It provides high-availability and highly interactive cloud-based analysis services for neutron scattering research, promoting the intelligent and collaborative development of experimental data processing in this field.
Speaker: Mr Binbin Lee (Institute of High Energy Physics) -
131
BuSca: A new software project for LLP searches at 30 MHz at LHCb
The new fully software-based trigger of the LHCb experiment operates at a 30 MHz data rate, opening a search window into previously unexplored regions of physics phase space. The BuSca (Buffer Scanner) project at LHCb acquires, reconstructs and analyzes data in real time, extending sensitivity to new lifetimes and mass ranges through the recently deployed Downstream tracking algorithm. BuSca identifies hotspots indicative of potential new particle candidates in a model-independent manner, providing strategic guidance for developing new trigger requirements. To control the background, regions with minimal detector material interactions are selected, and pairs of same-sign tracks are used to suppress combinatorial background. This talk presents the results from the analysis of the first data.
Speaker: Valerii Kholoimov (Instituto de Física Corpuscular (Univ. of Valencia)) -
132
CaloClouds 3: Diffusion and normalising flows
This contribution presents the final iteration of the CaloClouds series. Simulation of photon showers at the granularities expected at a future Higgs factory is computationally challenging. A viable simulation must capture the fine details exposed by such a detector, while also being fast enough to keep pace with the expected rate of observations. The CaloClouds model utilises point-cloud diffusion and normalising flows to replicate Monte Carlo simulation with exceptional accuracy. First we give a lightning overview of the model's objectives and constraints. To describe the upgrades in the latest version, we detail the studies on the flow model and the optimisations made, and then summarise the steps taken to generalise CaloClouds 3 for use in the whole detector. Considering some of the underlying principles of model design, we look at the significance of the data format choice on model outcomes. Finally, we present the results of reconstructions performed on CaloClouds 3 output against the results from Geant4 simulation, thus demonstrating that this model provides reliable physics reproductions.
Speaker: Henry Day-Hall (DESY) -
133
CNN-Based PID Algorithms for STCF Cherenkov Detectors
The Super Tau Charm Facility (STCF) is a next-generation electron-positron collider proposed in China, operating at a center-of-mass energy of 2–7 GeV with a peak luminosity of 0.5×10³⁵ cm⁻²s⁻¹. In STCF experiments, the identification of high-momentum charged hadrons is critical for physics studies, driving the implementation of a dedicated particle identification (PID) system that combines two Cherenkov detection technologies: a time-of-flight detector based on internally reflected Cherenkov light (DTOF) and a ring imaging Cherenkov detector (RICH), with BTOF serving as a backup for RICH.
Recent advancements in deep learning allow end-to-end learning directly from raw detector responses. This study develops convolutional neural network (CNN)-based PID algorithms for all three PID detectors (DTOF, RICH, and BTOF) that transform the hit patterns of Cherenkov photons on photomultiplier tubes into 2D images as the primary input features while incorporating kinematic information from the tracking system for learning, thereby enabling direct prediction of different particle probabilities. Preliminary results demonstrate that this CNN model, integrating image patterns with kinematic information, achieves excellent PID capability, providing a promising solution for high-precision PID at STCF.
Speaker: Zhipeng Yao -
134
Design and implementation of JUNO Keep-Up Production pipeline
The Jiangmen Underground Neutrino Observatory (JUNO) is a multipurpose neutrino experiment with the primary goals of determining the neutrino mass ordering and precisely measuring oscillation parameters. The JUNO detector construction was completed at the end of 2024. It generates about 3 petabytes of data annually, requiring extensive offline processing. This processing, which is called Keep-Up Production, typically involves multiple steps, such as preprocessing, calibration, reconstruction and data analysis. Automating this pipeline significantly enhances the efficiency of data production.
This contribution presents the design, implementation and application of the JUNO Keep-Up Production pipeline, which leverages Apache Kafka for inter-step communication via messaging. This approach decouples the various steps, with each message containing metadata for a file, such as its name and path. When a task at one step is completed, a message is sent to a topic, notifying the subsequent step to initiate a new task.
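A minimal sketch of this messaging pattern, using the kafka-python client (broker address, topic name, metadata fields and the downstream helper are hypothetical, not JUNO's actual configuration):

import json
from kafka import KafkaProducer, KafkaConsumer

# Step N finishes a file and notifies the next step via a topic message.
producer = KafkaProducer(bootstrap_servers="broker:9092",
                         value_serializer=lambda m: json.dumps(m).encode())
producer.send("calibration-done", {"name": "run001_file042.root",
                                   "path": "/some/output/dir"})
producer.flush()

# Step N+1 listens on that topic and launches its own task for each file.
consumer = KafkaConsumer("calibration-done",
                         bootstrap_servers="broker:9092",
                         value_deserializer=lambda m: json.loads(m.decode()))
for message in consumer:
    meta = message.value
    submit_reconstruction_job(meta["name"], meta["path"])  # hypothetical helper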
To handle the distinct commands required at each step, a YAML-based job management tool has been developed. The tool comprises four microservices: an API server, a job creator, a job submitter and a job monitor. The job creator can be configured to generate jobs for individual files or batches. For processing a batch of files, the job creator caches files until the file list meets the specified requirements, allowing for flexible run-by-run data processing. Once a job is created, its information is registered with the API server, from which the job submitter retrieves and submits tasks. The job monitor tracks job statuses and, upon completion, generates a new message to trigger the next processing topic. Docker Compose is employed to create instances for these steps.
Finally, this contribution demonstrates the successful application of the Keep-Up Production in the JUNO experiment.
Speaker: Tao Lin (Chinese Academy of Sciences (CN)) -
135
Differentiable Optimization of Muon Scattering Tomography Detector Design for Border Control Applications
Recent years have seen growing interest in leveraging secondary cosmic ray muons for tomographic imaging of large and unknown volumes. A key area of application is cargo scanning for border security, where muon tomography is used to detect concealed hazardous or illicit materials in trucks and shipping containers. We present recent developments in TomOpt, a Python-based, end-to-end software framework for optimizing muon scattering tomography systems. Current work on TomOpt is specifically focused on advancing its capabilities for cargo scanning detector applications.
Speaker: Zahraa Zaher -
136
Efficient data movement for Machine Learning inference in heterogeneous CMS software
Efficient data processing using machine learning relies on heterogeneous computing approaches, but optimizing input and output data movements remains a challenge. In GPU-based workflows data already resides in GPU memory, but machine learning models require the input and output data to be provided in a specific tensor format, often leading to unnecessary copies outside of the GPU device and conversion steps. To address this, we present an interface that allows seamless conversion of Structure of Arrays (SoA) data into lists of PyTorch tensors without explicit data movement. Our approach computes the necessary strides for various data types, including scalars and rows of vectors or matrices, allowing PyTorch tensors to directly access the data in GPU memory. The introduced metadata structure provides a flexible mechanism for defining the columns to be used and specifying the order of the resulting tensor list. This user-friendly interface minimizes the amount of code required, allowing direct integration with machine learning models. Implemented within the CMS computing framework and using the Alpaka library for heterogeneous applications, this solution significantly improves GPU efficiency. By avoiding unnecessary CPU-GPU transfers, it accelerates model execution while maintaining flexibility and ease of use.
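The idea of stride-based, zero-copy views can be pictured with a small PyTorch example (an illustrative sketch with an invented SoA layout, not the CMS/Alpaka interface itself):

import torch

# Imagine an SoA buffer already resident on the device: n elements, each with a
# scalar 'pt' column followed by a 3-vector 'p' column, stored column-wise.
device = "cuda" if torch.cuda.is_available() else "cpu"
n = 1024
buf = torch.arange(n * 4, dtype=torch.float32, device=device).contiguous()

# Zero-copy views into the same memory via explicit sizes, strides and offsets.
pt = buf.as_strided(size=(n,),   stride=(1,),   storage_offset=0)
p  = buf.as_strided(size=(n, 3), stride=(3, 1), storage_offset=n)
tensors = [pt, p]  # list of tensors handed directly to the ML model, no copies made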
Speakers: CMS Collaboration, Christine Zeh (Vienna University of Technology (AT)) -
137
Efficient Point Transformer for Charged Particle Track Reconstruction
Charged particle track reconstruction is the foundation of collider experiments, yet it is also the most computationally expensive part of particle reconstruction. Innovations in track reconstruction with graph neural networks (GNNs) have shown promising capability to cope with the computing challenges posed by the High-Luminosity LHC (HL-LHC) using machine learning. However, GNNs face limitations involving irregular computations and random memory access, slowing down their speed. In this talk, we introduce a Locality-Sensitive Hashing-based Efficient Point Transformer (HEPT) with advanced attention methods as a superior alternative with near-linear complexity, achieving millisecond-scale latency and low memory consumption. We present a comprehensive evaluation of HEPT's computational efficiency and physics performance compared to other algorithms, such as GNN-based pipelines, highlighting its potential to revolutionize full track reconstruction.
Speaker: Yuan-Tang Chou (University of Washington (US)) -
138
Efficient TrackML Data Access Using HDF5 for Scalable Particle Tracking
The TrackML dataset, a benchmark for particle tracking algorithms in High-Energy Physics (HEP), presents challenges in data handling due to its large size and complex structure. In this study, we explore using a heterogeneous graph structure combined with the Hierarchical Data Format version 5 (HDF5) not only to efficiently store and retrieve TrackML data but also to speed up the training and inference of the Graph Neural Network (GNN) models used for tracking.
We reorganize the TrackML dataset into a heterogeneous graph structure using PyTorch Geometric (PyG) to better represent the complex relationships in tracking detector data. In this representation, hit and track entities are modeled as distinct node types, with multiple edge types capturing interactions such as hit-hit spatial connections and hit-track associations. This heterogeneous structure enables more expressive GNN architectures that can leverage semantic information across node and edge types, leading to improved modeling of tracking behavior and enhanced flexibility for multi-relational learning tasks.
The conversion of TrackML CSV files to HDF5 enables rapid, scalable access to event-based particle tracking information while maintaining data integrity and structure. The HDF5 format significantly improves read speed, storage efficiency, and ease of data manipulation. The implementation supports fast indexing, event filtering, and compatibility with parallel processing workflows, which are critical for machine learning applications in particle physics. Benchmark results show compression gains and faster read performance than standard CSV and PyG parsing. This approach facilitates more efficient experimentation and prototyping in TrackML-based research and can be extended to other large-scale physics datasets.
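A sketch of the storage and loading pattern described above; the dataset names, features and edge construction are illustrative and do not reproduce the exact TrackML conversion:

```python
# Sketch: store per-event hits in compressed HDF5 and load them into a
# heterogeneous PyG graph with hit and track node types.
import h5py
import numpy as np
import torch
from torch_geometric.data import HeteroData

n_hits, n_tracks = 100, 20
hit_xyz = np.random.rand(n_hits, 3).astype("f4")
hit_to_track = np.stack([np.arange(n_hits),
                         np.random.randint(0, n_tracks, n_hits)])

# --- write: one compressed group per event -------------------------------
with h5py.File("trackml_events.h5", "w") as f:
    grp = f.create_group("event_000001000")
    grp.create_dataset("hit_xyz", data=hit_xyz, compression="gzip")
    grp.create_dataset("hit_track_edges", data=hit_to_track, compression="gzip")

# --- read: build a heterogeneous graph for GNN training ------------------
with h5py.File("trackml_events.h5", "r") as f:
    grp = f["event_000001000"]
    data = HeteroData()
    data["hit"].x = torch.from_numpy(grp["hit_xyz"][:])
    data["track"].num_nodes = n_tracks
    data["hit", "belongs_to", "track"].edge_index = \
        torch.from_numpy(grp["hit_track_edges"][:]).long()
print(data)
```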
Speaker: Alina Lazar (Youngstown State University (US)) -
139
Efficient Transformers for Jet Tagging
We present a suite of optimizations to the Particle Transformer (ParT), a state-of-the-art model for jet tagging, targeting the stringent latency and memory constraints of real-time environments such as HL-LHC triggers. To address the quadratic scaling and compute bottlenecks of standard attention, we integrate FlashAttention for exact, fused-kernel attention with reduced memory I/O, and Linformer to lower attention complexity from O(n²) to O(n) via low-dimensional projections—substantially improving scalability for longer sequences. We further apply INT8 dynamic quantization to compress matrix multiplications, reducing latency and GPU memory usage without retraining. Evaluations on JetClass and HLS4ML datasets show that these techniques—individually and in combination—deliver significant inference speedups, FLOP reductions, and memory savings while maintaining near-baseline accuracy. Additional experiments explore sequence ordering strategies, including physics-motivated projection matrices, and employ interpretability analyses of attention maps and embeddings to better understand model behavior. The combined approach enables efficient, accurate transformer-based jet classification suitable for high-rate trigger systems.
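Two of the listed optimizations, fused scaled-dot-product attention and post-training INT8 dynamic quantization, can be sketched on a toy attention block; this is not ParT itself:

```python
# Sketch: fused (FlashAttention-style) attention plus INT8 dynamic
# quantisation of the linear layers, applied to a toy transformer block.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyAttentionBlock(nn.Module):
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.heads, self.dim = heads, dim
        self.qkv = nn.Linear(dim, 3 * dim)
        self.out = nn.Linear(dim, dim)

    def forward(self, x):                      # x: (batch, particles, dim)
        b, n, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        shape = (b, n, self.heads, d // self.heads)
        q, k, v = (t.view(*shape).transpose(1, 2) for t in (q, k, v))
        # Dispatches to fused kernels (e.g. FlashAttention) when available.
        y = F.scaled_dot_product_attention(q, k, v)
        return self.out(y.transpose(1, 2).reshape(b, n, d))

model = TinyAttentionBlock()
# Post-training dynamic quantisation of matrix multiplications to INT8.
qmodel = torch.ao.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)
print(qmodel(torch.randn(2, 16, 64)).shape)
```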
Speaker: Vivekanand Gyanchand Sahu (University of California San Diego) -
140
Energy Flow Polynomials for More Model-Agnostic Anomaly Detection
Weakly supervised anomaly detection has been shown to be a sensitive and robust tool for Large Hadron Collider (LHC) analysis. The effectiveness of these methods relies heavily on the input features of the classifier, influencing both model coverage and the detection of low signal cross sections. In this talk, we demonstrate that improvements in both areas can be achieved by using energy flow polynomials. To further highlight this, we introduce new benchmark signals for the LHCO RnD dataset, which is a widely used benchmark dataset in this field.
Speaker: Lukas Lang (RWTH Aachen University) -
141
Error Analysis of PanDA metadata
The Production and Distributed Analysis (PanDA) workload management system was designed with flexibility to adapt to emerging computing technologies in processing, storage, networking, and distributed computing middleware for the global data distribution. PanDA can coordinate processing over heterogeneous computing resources, including dozens of geographically separated high-performance computers. Error occurrence is a common phenomenon in the whole operation that arises through different sources, e.g. human intervention, hardware faults, data transfer issues. Ensuring resilience of the voluminous data is critical during the data life cycle and throughout the execution of scalable workflows to assure that the pathway to viable scientific results is not impeded.
One of the primary goals is to understand, analyze and mitigate error occurrence to ensure the resiliency of workflow management. We analyzed five months of PanDA metadata related to tasks and jobs. This detailed analysis gives us a deeper understanding of error occurrence and patterns, and hence supports the development of mitigation strategies. We categorized all types of error occurrence and studied the impact of such failures on overall system resources, for example wasted computing hours and allocated memory. A similar analysis was performed on the task metadata. From this analysis we built a time series of the errors and developed a predictive model for future error occurrence. A primary challenge is understanding the field definitions, which are currently sparse and not well defined. Even where the definitions are clear, disentangling the interplay of the fields and reducing redundancy remains challenging, especially considering the data volume; similar challenges exist for the task data.
Speakers: Raees Ahmad Khan (University of Pittsburgh (US)), Ms Tania Korchuganova -
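As an illustration of the kind of aggregation used in such a study, the following sketch groups job metadata by error category with pandas; the column names are hypothetical placeholders, not the real PanDA schema:

```python
# Sketch: categorise failed jobs and estimate wasted resources per category.
# Column names below are hypothetical placeholders for exported PanDA metadata.
import pandas as pd

jobs = pd.read_csv("panda_jobs.csv")
failed = jobs[jobs["jobstatus"] == "failed"].copy()

summary = (failed.groupby("error_category")
                 .agg(n_jobs=("pandaid", "count"),
                      wasted_core_hours=("cpu_hours", "sum"),
                      avg_memory_gb=("max_rss_gb", "mean"))
                 .sort_values("wasted_core_hours", ascending=False))
print(summary.head(10))

# Daily error counts as the starting point for a time-series / forecasting model.
failed["day"] = pd.to_datetime(failed["creationtime"]).dt.floor("D")
daily = failed.groupby(["day", "error_category"]).size().unstack(fill_value=0)
print(daily.tail())
```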
142
Evaluating HEP Workflow Portability and Performance Using Liquid Argon TPC (LAr TPC) Detector Simulations Across HPC Systems
This study evaluates the portability, performance, and adaptability of Liquid Argon TPC (LAr TPC) detector simulations on different HPC platforms, specifically Polaris, Frontier, and Perlmutter. The LAr TPC workflow is computationally complex: it mimics neutrino interactions and the resultant detector responses in a modular liquid argon TPC, integrating various subsystems to validate design choices and refine analysis tools. The computational complexity comes from integrating multiple high-fidelity simulation modules, reconstruction algorithms, and real-time calibration processes across heterogeneous and massive data streams using parallel processing techniques. We explore the diverse challenges of deploying the simulation workflow, noting that the issues encountered vary significantly across platforms. For example, we investigate how constrained network conditions on Polaris impact software distribution technologies such as CVMFS and container solutions; on Frontier, we study unique complications arising from CUDA-specific libraries that are incompatible with its AMD GPUs. We also describe effective mitigation strategies: to address the issues on Polaris we used portable solutions such as CVMFSExec, proxies (e.g., Squid), and container technologies (e.g., Singularity); on Frontier we investigate CPU-only execution strategies and explore alternative libraries such as HIP and CuPy. Conversely, deploying on Perlmutter using the Superfacility API proved more straightforward, highlighting the potential of standardized HPC APIs to simplify workflow management.
Our experiences further highlight the value of closely coordinating the development of facility-specific APIs, such as NERSC's Superfacility API, Globus Compute and the Integrated Research Infrastructure (IRI) initiative, alongside scientific workflows. We discuss the benefits of evolving these APIs based on the real-world challenges and practical demands encountered in complex workflows. By generalizing these issues beyond the specific (LAr TPC) context, we emphasize adaptable strategies such as container overlays, environment bridging scripts, and comprehensive dependency documentation to achieve sustainable and facility-independent scientific workflows.
Speaker: Doug Benjamin (Brookhaven National Laboratory (US)) -
143
Evolution of data structures for heterogeneous reconstruction in CMSSW
The Next Generation Trigger project aims to improve the computational efficiency of the CMS reconstruction software (CMSSW) to increase the data processing throughput at the High-Luminosity Large Hadron Collider. As part of this project, this work focuses on improving the common Structure of Arrays (SoA) used in CMSSW for running both on CPUs and GPUs. We introduce a new SoA feature that allows users to selectively prune and combine columns across one or more existing SoAs into a new view, while preserving a user-friendly interface. It is also possible to consolidate these columns into a new SoA object, performing heterogeneous memory copies as needed. This process uses the Alpaka library for optimizing data transfer across different computing architectures, reducing overhead, and improving efficiency. Another new feature introduces the possibility of generating custom methods for SoA elements, enhancing flexibility and expressiveness in data manipulation. The design prioritizes ease of use, allowing users to interact with the data intuitively while benefiting from an efficient underlying implementation. The impact of these optimizations, along with performance measurements, will be presented.
Speaker: Leonardo Beltrame (Politecnico di Milano (IT)) -
144
Exploring new directions in enhancing the ACTS parameter optimization suite
Particle tracking is among the most sophisticated and complex parts of the full event reconstruction chain. Various reconstruction algorithms work in sequence to build trajectories from detector hits. Each of these algorithms requires numerous configuration parameters that need fine-tuning to properly account for the detector/experimental setup, the available CPU budget, and the desired physics performance. To automate and optimize the tuning of these parameters, automatic parameter optimization techniques were implemented in the open-source track reconstruction framework "A Common Tracking Software" (ACTS). These techniques allow users to flexibly choose tunable parameters and define a cost/benefit function for optimizing the full reconstruction chain. Since their implementation, these techniques have been greatly beneficial for researchers across various experiments.
The current study discusses ongoing advancements in these optimization techniques, including novel approaches that enable a more systematic and gradual refinement of parameter tuning. Specifically, I will explore the integration of Bayesian Optimization techniques for tracking algorithm tuning, highlighting their potential to improve efficiency and precision beyond existing methods.
Speakers: Chance Alan Lavoie (Carnegie Mellon University), Chance Lavoie, Rocky Bala Garg (Stanford University (US)) -
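A minimal sketch of Gaussian-process Bayesian optimization of tracking parameters with scikit-optimize; the parameters and the objective below are stand-ins for the real ACTS reconstruction chain and its cost function:

```python
# Sketch: Bayesian optimisation with a Gaussian-process surrogate.
# The objective is a placeholder; a real study would run the reconstruction
# chain and combine efficiency, fake rate and CPU time into a single cost.
from skopt import gp_minimize
from skopt.space import Integer, Real

space = [
    Real(0.1, 2.0, name="max_chi2"),        # illustrative tunable parameters
    Integer(3, 10, name="min_hits"),
]

def objective(params):
    max_chi2, min_hits = params
    # placeholder cost with a known minimum near (1.2, 7)
    return (max_chi2 - 1.2) ** 2 + 0.01 * (min_hits - 7) ** 2

result = gp_minimize(objective, space, n_calls=30, random_state=0)
print("best parameters:", result.x, "best cost:", result.fun)
```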
145
Extrapolating Jet Radiation with Autoregressive Transformers
Generative networks are an exciting tool for fast LHC event generation. Usually, they are used to generate configurations with a fixed number of particles. Autoregressive transformers allow us to generate events with variable numbers of particles, very much in line with the physics of QCD jet radiation. We show how they can learn a factorized likelihood for jet radiation and extrapolate in terms of the number of generated jets. For this extrapolation, bootstrapping training data and training with modifications of the likelihood loss can be used. Beyond particle physics applications, our studies show that autoregressive transformers can extrapolate.
Speaker: Jonas Spinner -
146
Fair Universe HiggsML Uncertainty Challenge
Measurements and observations in Particle Physics fundamentally depend on one's ability to quantify their uncertainty and, thereby, their significance. Therefore, as Machine Learning methods become more prevalent in HEP, being able to determine the uncertainties of an ML method becomes more important. A wide range of possible approaches has been proposed; however, there has not been a comprehensive comparison of individual methods.
To address this, the Fair Universe project organized the HiggsML Uncertainty Challenge, which took place from September 2024 to 14 March 2025, and the dataset and performance metrics of the challenge will serve as a permanent benchmark for further developments. Additionally, the Challenge was accepted as an official NeurIPS 2024 competition. The goal of the challenge was to measure the Higgs to tau+ tau- cross-section, using a dataset of particle 4-momenta. Participants were evaluated on both their ability to precisely determine the correct cross-section, as well as on their ability to report correct and well-calibrated uncertainty intervals.
In this talk, we present an overview of the competition itself and of the infrastructure that underpins it. Further, we present the winners of the competition and discuss the performance of their winning uncertainty quantification approaches.
The challenge itself can be found at https://www.codabench.org/competitions/2977/ and more details are available at https://arxiv.org/abs/2410.02867
Speaker: David Rousseau (IJCLab-Orsay) -
147
Fast FARICH Simulation Using Generative Adversarial Networks
In the end-cap region of the SPD detector complex, particle identification will be provided by a Focusing Aerogel RICH detector (FARICH). FARICH will primarily aid with pion / kaon separation in final open charmonia states (momenta below 5 GeV/c). A free-running (triggerless) data acquisition pipeline to be employed in the SPD results in a high data rate necessitating new approaches to event generation and simulation of detector responses. Several machine learning based approaches are described here, generating high-level reconstruction observables as well as full Cherenkov rings using a generative neural network. The fast simulation is trained using Monte-Carlo simulated data samples. We compare different approaches and demonstrate that they produce high-fidelity samples.
Speakers: Fedor Ratnikov, Foma Shipilov -
148
Feynman integrals at large loop order and the $\log-\Gamma$ distribution
I will present joint work on the behavior of Feynman integrals and perturbative expansions at large loop orders. Using the tropical sampling algorithm for evaluating Feynman integrals, along with a dedicated graph-sampling algorithm to generate representative sets of Feynman diagrams, we computed approximately $10^7$ integrals with up to 17 loops in four-dimensional $\phi^4$ theory. Through maximum likelihood fits, we find that the values of these integrals at large loop order are distributed according to a log-gamma distribution. This empirical observation opens up a new avenue towards the large-order behavior in perturbative quantum field theory. Guided by instanton considerations, we extrapolate the primitive contribution to the $\phi^4$ beta function to all loop orders.
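A minimal sketch of the distributional fit with SciPy, using a synthetic sample in place of the evaluated Feynman integrals:

```python
# Sketch: maximum-likelihood fit of a log-gamma distribution with SciPy.
# The synthetic sample stands in for the ~10^7 evaluated integral values.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
toy_values = stats.loggamma.rvs(c=2.5, loc=1.0, scale=0.8,
                                size=100_000, random_state=rng)

c, loc, scale = stats.loggamma.fit(toy_values)
print(f"fitted shape c = {c:.3f}, loc = {loc:.3f}, scale = {scale:.3f}")

# Goodness of fit via a Kolmogorov-Smirnov test against the fitted model.
print(stats.kstest(toy_values, "loggamma", args=(c, loc, scale)))
```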
Speaker: Andrea Favorito -
149
GAN-based Particle Identification over Large Hadron Collider beauty Run III Data
The performance of Particle Identification (PID) in the LHCb experiment is critical for numerous physics analyses. Classifiers, derived from detector likelihoods under various particle mass hypotheses, are trained to tag particles using calibration samples that involve information from the Ring Imaging Cherenkov (RICH) detectors, calorimeters, and muon identification chambers. However, these control channels often differ significantly in feature distributions from the physics channels under study. This mismatch limits the precision with which PID response can be predicted in analyses, particularly in statistically limited datasets like the beam-gas ones collected at LHCb with the System for Measuring Overlap with Gas (SMOG).
In this work, we propose a novel deep generative strategy to learn multidimensional PID distributions from real calibration data using a GAN-based architecture (PIDGAN). A GAN-based architecture enables the generalization across multiple calibration channels, effectively learning high-dimensional PID responses conditioned on experimental features. Our method opens a path towards improved PID calibration with scalable, data-driven models that capture correlations and non-linear effects in PID variables more comprehensively, offers an alternative approach to PID studies for physics analyses, and illustrates a broader strategy for generative modeling of real-world, high-dimensional sensor data.
Speakers: Josef Ruzicka Gonzalez (Costa Rica Center for High Technology), Saverio Mariani (CERN), Sergio Arguedas Cuendis (Consejo Nacional de Rectores (CONARE) (CR)) -
150
GPT-like transformer model for silicon tracking detector simulation
Simulating physics processes and detector responses is essential in high energy physics but accounts for significant computing costs. Generative machine learning has been demonstrated to be potentially powerful in accelerating simulations, outperforming traditional fast simulation methods. While efforts have focused primarily on calorimeters, initial studies have also been performed on silicon detectors.
This work employs a GPT-like transformer architecture in a fully generative way, ensuring full correlations between individual hits. Drawing parallels with text generation, hits are represented as a flat sequence of feature values. The resulting tracking performance, evaluated on the Open Data Detector, is comparable with the full simulation.
Speaker: Tadej Novak (Jozef Stefan Institute (SI)) -
151
Improvements on QAOA for Particle Trajectories at LHCb
Reconstructing the trajectories of charged particles as they traverse several detector layers is a key ingredient for event reconstruction at the LHC and virtually any particle physics experiment. The limited bandwidth available, together with the high rate of tracks per second, $O(10^{10})$ - where each track consists of a variable number of measurements - makes this problem exceptionally challenging from the computational perspective. With this in mind, Quantum Computing is being explored as a new technology for future detectors, where larger datasets will further complicate this task [1].
Several quantum algorithms have been explored in this regard - e.g., Variational algorithms and HHL [2][3] - offering a heterogeneous set of advantages and disadvantages. In this talk, an extensive study using the Quantum Approximate Optimization Algorithm (QAOA) for track reconstruction at LHC will be presented. This algorithm is focused on finding the ground state for combinatorial problems, thus making it a natural choice. Furthermore, the robustness of QAOA to hardware noise when compared to other algorithms makes it a good candidate for the near-term utility era in Quantum Computing. In this talk, implementations with simplified simulations will be presented, both for QAOA and a modified version of the algorithm that could improve performance in comparison with Quantum annealers as per recent Q-CTRL results [4]. Finally, a complete study of hardware requirements, prospects on improving scalability, and energy consumption for different technologies will also be discussed.
[1] QC4HEP Working Group, A. Di Meglio, K. Jansen, I. Tavernelli, J. Zhang et al., Quantum Computing for High-Energy Physics: State of the Art and Challenges. Summary of the QC4HEP Working Group, PRX Quantum 5 (2024) 3, 037001, arXiv:2307.03236 (2023).
[2] A. Crippa, L. Funcke, T. Hartung, B. Heinemann, K. Jansen, A. Kropf, S. Kühn, F. Meloni, D. Spataro, C. Tüysüz, Y. C. Yap, Quantum Algorithms for Charged Particle Track Reconstruction in the LUXE Experiment, Comput Softw Big Sci 7, 14, arXiv:2304.01690 (2023).
[3] D. Nicotra, M. Lucio Martinez, J. A. De Vries, M. Merk, K. Driessens, R. L. Westra, D. Dibenedetto, D. H. Campora Perez, A quantum algorithm for track reconstruction in the LHCb vertex detector, JINST 18 P11028 (2023).
[4] N. Sachdeva, G. S. Hartnett, S. Maity et al., Quantum optimization using a 127-qubit gate-model IBM quantum computer can outperform quantum annealers for nontrivial binary optimization problems, arXiv:2406.01743v4 (2024)
Speaker: Miriam Lucio Martinez (Univ. of Valencia and CSIC (ES)) -
152
Integrating PanDA Harvester with Globus Compute for Portable HPC Execution of ATLAS Workflows
We present a novel integration of the PanDA workload management system (PanDA WMS) and Harvester with Globus Compute to enable secure, portable, and remote execution of ATLAS workflows on high-performance computing (HPC) systems. In our approach, Harvester, which runs on an external server, is used to orchestrate job submissions via Globus Compute’s multi-user endpoint (MEP). This MEP provides a function-as-a-service interface to the HPC resources, eliminating the need for custom, site-specific gateways while ensuring dynamic runtime configuration and secure access, without direct shell logins.
Using NERSC’s Perlmutter as our initial testbed, we have successfully deployed PanDA pilot jobs requiring shared services such as CVMFS for ATLAS software distribution and configuration management. The integration efficiently addresses key challenges including dynamic resource provisioning, runtime environment setup, and secure multi-user operation on HPC edge systems. We discuss our design decisions, the benefits of using a Globus Compute MEP, and strategies for mitigating configuration and dependency issues. Our results on Perlmutter and other HPC clusters demonstrate that our integration effectively supports the ATLAS simulation workloads with minimal overhead and performance portability. Our experiences further highlight the value of integrating PanDA Harvester with Globus Compute for portable HPC execution, demonstrating how this approach provides a scalable, adaptable, and secure remote execution solution that can be broadly applied across diverse scientific workflows beyond ATLAS.
Speaker: Doug Benjamin (Brookhaven National Laboratory (US)) -
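A sketch of the function-as-a-service submission pattern via a Globus Compute endpoint; the endpoint UUID and payload function are placeholders, not the Harvester integration itself:

```python
# Sketch: submit a function to a remote Globus Compute endpoint and wait
# for its result.  The endpoint UUID and payload are illustrative only.
from globus_compute_sdk import Executor

def run_pilot(job_spec):
    """Placeholder payload; a real submission would launch a PanDA pilot."""
    import socket
    return f"ran {job_spec} on {socket.gethostname()}"

ENDPOINT_ID = "00000000-0000-0000-0000-000000000000"   # multi-user endpoint UUID

with Executor(endpoint_id=ENDPOINT_ID) as ex:
    future = ex.submit(run_pilot, {"task": "atlas-sim", "nevents": 100})
    print(future.result())
```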
153
Latest improvements to CATHODE
Despite compelling evidence for the incompleteness of the Standard Model and an extensive search programme, no hints of new physics have so far been observed at the LHC. Anomaly detection was proposed as a way to enhance the sensitivity of generic searches not targeting any specific signal model. One of the leading methods in this field, CATHODE (Classifying Anomalies THrough Outer Density Estimation), has recently been applied to the data collected by the CMS experiment. CATHODE starts by obtaining an in-situ estimate of the background and subsequently isolates signal events with a classifier.
We present the most recent developments to CATHODE, enhancing its sensitivity beyond dijet resonances and introducing uncertainties in the generative model.
Speaker: Louis Moureaux (Hamburg University (DE)) -
154
LHCb Tracking Reconstruction and Ghost Rejection at 30 MHz
The new fully software-based trigger of the LHCb experiment operates at a 30 MHz data rate and imposes tight constraints on GPU execution time. Tracking reconstruction algorithms in this first-level trigger must efficiently select detector hits, group them, build tracklets, account for the LHCb magnetic field, extrapolate and fit trajectories, and select the best track candidates in order to filter events, reducing the 4 TB/s data rate by a factor of 30. Optimized algorithms have been developed with this aim. One of the main challenges is the reduction of "ghost" tracks, i.e. fake combinations arising from detector noise or reconstruction ambiguities. A dedicated neural network architecture, designed to operate at the high LHC data rate, has been developed, achieving ghost rates below 20%. The techniques used in this work can be adapted for the reconstruction of other detector objects or for track reconstruction in other LHC experiments.
Speaker: Jiahui Zhuo (Univ. of Valencia and CSIC (ES)) -
155
LLM-based Code Documentation, Generation, and Optimization AI Assistant
Recent advancements in large language models (LLMs) have paved the way for tools that can enhance the software development process for scientists. In this context, LLMs excel at two tasks -- code documentation in natural language and code generation in a given programming language. The commercially available tools are often restricted by the available context window size, encounter usage limits, or sometimes incur a substantial cost for large scale development/modernization of gigantic scientific codebases. Moreover, there are data privacy/security concerns. Thus, a programmatic framework that can be used from the Linux terminal on a local server is needed. This framework should be less laborious to use by batching code documentation of large codebases and must be able to use the latest open models offline for code generation.
We present a retrieval-augmented generation (RAG)-based AI assistant for code documentation and generation which is entirely local and scalable. The advantage of this setup is the availability of a large context window free of any external recurring costs while ensuring transparency. The code documentation assistant has three components -- (a) Doxygen style comment generation for all functions and classes by retrieving relevant information from RAG sources (papers, posters, presentations), (b) file-level summary generation, and (c) an interactive chatbot. This will help improve code comprehension for new members in a research group and enhance the understanding of sparsely documented codebases. The code generation assistant splits the code into self-contained chunks before embedding; this strategy improves code retrieval in large codebases. We compare different text and code embedding models for code retrieval. This is followed by AI generated suggestions for performance optimization and accurate refactoring while employing call graph knowledge to maintain comprehensiveness in large codebases. Additionally, we discuss the guardrails required to ensure code maintainability and correctness when using LLMs for code generation.
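The retrieval step of such a local RAG pipeline can be sketched as follows; the embedding model, chunking and scoring are illustrative choices, not the assistant's actual configuration:

```python
# Sketch: embed code chunks locally and rank them against a natural-language
# query by cosine similarity.  The retrieved chunks would then be passed as
# context to a local LLM.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")   # runs locally once downloaded

chunks = [
    "double trackChi2(const Track& t) { ... }   // computes the track fit chi2",
    "void calibrate(Event& e) { ... }           // applies the final calibration",
    "class HitCollection { ... };               // container for detector hits",
]
chunk_emb = model.encode(chunks, normalize_embeddings=True)

query = "where is the chi2 of the track fit computed?"
query_emb = model.encode([query], normalize_embeddings=True)[0]

scores = chunk_emb @ query_emb                    # cosine similarity (unit vectors)
best = int(np.argmax(scores))
print(f"best match (score {scores[best]:.2f}):\n{chunks[best]}")
```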
Speakers: Mohammad Atif (Brookhaven National Laboratory), Doug Benjamin (Brookhaven National Laboratory (US)) -
156
LLM-based physics analysis assistant at BESIII and exploration of future AI scientist
Data processing and analysis are among the main challenges at HEP experiments. To accelerate physics analysis and drive new physics discoveries, the rapidly developing Large Language Models (LLMs) are a most promising approach: they have demonstrated astonishing capabilities in recognizing and generating text, from which most parts of a physics analysis can benefit. In this talk we will discuss the construction of a dedicated intelligent agent, an LLM-based AI assistant named Dr. Sai at BESIII, its potential usage to boost hadron spectroscopy studies, and the future plan towards an AI scientist.
Speakers: Beijiang Liu, Changzheng YUAN, Ke Li (Chinese Academy of Sciences (CN)), Zhengde Zhang (Institute of High Energy Physics, Chinese Academy of Sciences) -
157
Lossy compression in ATLAS offline analysis formats
In view of reducing the disk size of the future analysis formats for the High Luminosity LHC in the ATLAS experiment, we have explored the use of lossy compression in the newly developed analysis format known as PHYSLITE. Improvements in disk size are already being obtained by migrating from the 'traditional' ROOT TTree format to the newly developed RNTuple format; lossy compression can bring improvements on top of the backend technology. A system has been developed which allows the desired level of lossy float compression to be specified independently for the different sets of physics objects being written out. We will discuss some observed implications for physics measurements as well as the level of size reduction one observes with the TTree and RNTuple backend technologies.
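A generic illustration of lossy float compression by mantissa truncation, showing how precision can be traded for compressibility; this is a sketch of the general idea, not the PHYSLITE implementation:

```python
# Sketch: zero low-order mantissa bits of float32 values so that they
# compress better downstream, at a controlled loss of precision.
import numpy as np

def truncate_mantissa(values, keep_bits):
    """Keep only `keep_bits` of the 23 float32 mantissa bits."""
    drop = 23 - keep_bits
    mask = np.uint32(0xFFFFFFFF & ~((1 << drop) - 1))
    bits = values.astype(np.float32).view(np.uint32)
    return (bits & mask).view(np.float32)

pt = np.random.default_rng(0).exponential(50.0, 1000).astype(np.float32)
pt_lossy = truncate_mantissa(pt, keep_bits=10)

rel_err = np.abs(pt_lossy - pt) / pt
print(f"max relative error with 10 mantissa bits: {rel_err.max():.2e}")
```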
Speaker: R D Schaffer (Université Paris-Saclay (FR)) -
158
Machine Learning-Driven Anomaly Detection in Dijet Events with ATLAS
This contribution discusses an anomaly detection search for narrow-width resonances beyond the Standard Model that decay into a pair of jets. Using 139 fb$^{-1}$ of proton-proton collision data at $\sqrt{s} = 13$ TeV, recorded from 2015 to 2018 with the ATLAS detector at the Large Hadron Collider, we aim to identify new physics without relying on a specific signal model. The analysis employs two machine learning strategies to estimate the background in different signal regions, with weakly supervised classifiers trained to differentiate this background estimate from actual data. We focus on high transverse momentum jets reconstructed as large-radius jets, using their mass and substructure as classifier inputs. After a classifier-based selection, we analyze the invariant mass distribution of the jet pairs for potential local excesses. Our model-independent results indicate no significant local excesses and we inject a representative set of signal models into the data to evaluate the sensitivity of our methods. This contribution discusses the used methods and latest results and highlights the potential of machine learning in enhancing the search for new physics in fundamental particle interactions.
Speaker: Dennis Daniel Nick Noll (Lawrence Berkeley National Lab (US)) -
159
New Transformation Capabilities and Workflow Integration for ServiceX, a Delivery System for Distributed Data
The ServiceX project aims to provide a data extraction and delivery service for HEP analysis data, accessing files from distributed stores and applying user-configured transformations on them. ServiceX aims to support many existing analysis workflows and tools in as transparent a manner as possible, while enabling new technologies. We will discuss the most recent backends added to ServiceX, including RDataFrame support and the ability to read and write RNTuples. We will also discuss usability improvements in the user client libraries, in particular the ability to store and propagate metadata to downstream tools.
Speaker: Artur Cordeiro Oudot Choi (University of Washington (US)) -
160
OmniFold-HI: an Advanced ML Unfolding for Heavy-Ion Data
To compare collider experiments, measured data must be corrected for detector distortions through a process known as unfolding. As measurements become more sophisticated, the need for higher-dimensional unfolding increases, but traditional techniques have limitations. To address this, machine learning-based unfolding methods were recently introduced. In this work, we introduce OmniFold-HI, an extension of OmniFold [1] to incorporate detector fakes, inefficiencies, and statistical uncertainties, enabling its application in heavy-ion collisions. By introducing auxiliary observables, we show that high-dimensional unfolding—up to 18 dimensions—significantly improves performance and reduces systematic uncertainties. We also propose a novel strategy for unfolding in the presence of large backgrounds, avoiding traditional background subtraction, and instead unifying calibration and unfolding into a single, consistent framework. Our results establish a foundation for robust, high-dimensional ML-based unfolding in complex collider environments.
[1] Andreassen et al., Phys. Rev. Lett. 124, 182001 (2020)
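A single OmniFold-style reweighting step can be sketched with a classifier-based likelihood ratio; the toy one-dimensional inputs stand in for the high-dimensional observables used in OmniFold-HI:

```python
# Sketch: one detector-level reweighting step.  A classifier separating data
# from simulation yields per-event weights via the likelihood ratio p/(1-p).
import numpy as np
from sklearn.ensemble import HistGradientBoostingClassifier

rng = np.random.default_rng(0)
sim = rng.normal(0.0, 1.0, size=(50_000, 1))      # detector-level simulation
data = rng.normal(0.3, 1.1, size=(50_000, 1))     # detector-level "data"

X = np.vstack([sim, data])
y = np.concatenate([np.zeros(len(sim)), np.ones(len(data))])

clf = HistGradientBoostingClassifier(max_iter=200).fit(X, y)
p = clf.predict_proba(sim)[:, 1]
weights = p / (1.0 - p)                           # reweight simulation towards data

print("mean reweighted sim value:", np.average(sim[:, 0], weights=weights))
```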
Speakers: Adam Takacs (Heidelberg University), Alexandre Falcão (University of Bergen) -
161
Parallel reconstruction profiling on multiple Hygon GPUs for ptychography in HEPS
When porting the software 'Hepsptycho', a ptychography reconstruction program originally based on multiple NVIDIA GPUs and MPI, to the Hygon DCU architecture, we found that the reconstructed object and probe contained errors, while the results obtained on NVIDIA GPUs are correct. We profiled the ePIE algorithm using NVIDIA Nsight Systems and Hygon's HIP-compatible profiler (Hipprof). The GPUs communicate and share the object and probe information after each batch or iteration completes, with the slave GPUs sending the reconstructed results back to GPU 0 using the Reduce or AllReduce function. The NVIDIA CUDA toolkit executes this communication successfully, whereas Hygon DCU 0 encounters a memory corruption error during synchronization, likely due to race conditions when updating the object/probe buffers. We show the profiling results and how we repaired this bug. We also show the computational speedup obtained with other HPC techniques to achieve better reconstruction performance on multiple GPUs. This work is implemented within the Institute of High Energy Physics (IHEP) DAISY framework.
Speaker: Lei Wang (Institute of High Energy Physics)
162
Parametrizing workflows with ParaO and Luigi
Workflow tools provide the means to codify complex multi-step processes, thus enabling reproducibility, preservation, and reinterpretation efforts. Their powerful bookkeeping also directly supports the research process, especially where intermediate results are produced, inspected, and iterated upon frequently.
In Luigi, such a complex workflow graph is composed of individual tasks that depend on one another, where every part can be customized at runtime through parametrization. However, Luigi falls short with regards to the steering of parameters, accounting for the consequences thereof, and the modification or reuse of task graphs.
This is where the parameter handling of ParaO shines: it vastly extends the key mechanics and value coercion of parameters while automatically propagating their effects throughout the task graph. Since the dependencies are described through parameters too, the same principles can be used to freely alter or transplant (parts of) the task graph, thereby empowering reuse. At the same time, ParaO remains largely compatible with Luigi and packages building upon it, such as Law.
Speaker: Benjamin Fischer (RWTH Aachen University (DE)) -
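For reference, the parametrised-task pattern that ParaO builds on looks as follows in plain Luigi; the tasks below are generic placeholders, not part of ParaO or Law:

```python
# Sketch: Luigi tasks whose parameters steer both the work and the
# dependency graph.
import luigi

class Selection(luigi.Task):
    dataset = luigi.Parameter()
    pt_cut = luigi.FloatParameter(default=25.0)

    def output(self):
        return luigi.LocalTarget(f"selected_{self.dataset}_pt{self.pt_cut}.txt")

    def run(self):
        with self.output().open("w") as f:
            f.write(f"events from {self.dataset} with pt > {self.pt_cut}\n")

class Histogram(luigi.Task):
    dataset = luigi.Parameter()
    pt_cut = luigi.FloatParameter(default=25.0)

    def requires(self):
        # the dependency is itself steered by the task's parameters
        return Selection(dataset=self.dataset, pt_cut=self.pt_cut)

    def output(self):
        return luigi.LocalTarget(f"hist_{self.dataset}_pt{self.pt_cut}.txt")

    def run(self):
        with self.input().open() as fin, self.output().open("w") as fout:
            fout.write("histogram of: " + fin.read())

if __name__ == "__main__":
    luigi.build([Histogram(dataset="zmumu", pt_cut=30.0)], local_scheduler=True)
```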
163
Parton-shower-matched predictions for top-quark pair production and decay
In this presentation, I will discuss recent advancements in NNLO+PS predictions for top-quark pair production and decay within the MiNNLO framework. MiNNLO provides a robust method for incorporating next-to-next-to-leading order (NNLO) QCD corrections directly into fully differential predictions, offering unprecedented accuracy. This approach enables a consistent treatment of both production and decay processes, ensuring realistic event simulation compatible with experimental analyses. I will highlight the theoretical developments, key challenges, and the impact of these improvements on phenomenological studies, with a focus on their relevance to the increasing precision demands of LHC experiments.
Speaker: Chiara Signorile -
164
Point-clouds based generative models on hadronic showers
Simulating showers of particles in highly-granular detectors is a key frontier in the application of machine learning to particle physics. Achieving high accuracy and speed with generative machine learning models can enable them to augment traditional simulations and alleviate a major computing constraint.
Recent developments have shown that diffusion-based generative shower simulation approaches which do not rely on a fixed structure, but instead generate geometry-independent point clouds, are very efficient. We present two novel transformer-based architectures, a diffusion model and a conditional flow matching model, which were previously used for simulating only electromagnetic showers in the highly granular electromagnetic calorimeter of ILD. The attention mechanism allows the generation of complex hadronic showers from pions, with more pronounced substructure, in the electromagnetic and hadronic calorimeters together. This is the first time that ML methods are used to generate hadronic showers in highly granular imaging calorimeters.
Speaker: Thorsten Lars Henrik Buss (Universität Hamburg) -
165
Progress on AI-Assisted Detector Design for the EIC (AID(2)E)
Artificial Intelligence is set to play a transformative role in designing large and complex detectors, such as the ePIC detector at the upcoming Electron-Ion Collider (EIC). The ePIC setup features a central detector and additional systems positioned in the far forward and far backward regions. Designing this system involves balancing many factors—performance, physics goals, and cost—while also meeting strict mechanical and geometric constraints.
This project introduces a scalable, distributed AI-assisted framework for detector design, known as AID(2)E. It uses advanced multi-objective optimization methods to tackle the complexities of the detector configuration. The framework is built on the ePIC software stack and relies on GEANT4 simulations. It supports clear parameter definitions and integrates AI techniques to improve and speed up the design process.
The workflow is powered by the PanDA and iDDS systems, which are widely used in major experiments like ATLAS at CERN, the Rubin Observatory, and sPHENIX at RHIC. These systems help manage the heavy computing needs of ePIC simulations. Modifications made to PanDA in this project aim to improve usability, scalability, automation, and monitoring.
The main goal is to build a strong design capability for the ePIC detector using a distributed AI-assisted workflow. This approach will also be extended to support the development of the EIC’s second detector (Detector-2), as well as tasks like calibration and alignment. In parallel, we are creating new data science tools to better understand and manage the complex trade-offs revealed during the optimization.
We will present recent updates to the AID(2)E framework, with a focus on new features that support scalable and efficient detector design. Specifically, we will show how distributed optimization is integrated into the ePIC software stack and demonstrate its use in large-scale simulation and design optimization efforts.
Speaker: Karthik Suresh (College of William and Mary) -
166
Quantum Generative Modeling for Calorimeter Simulations in Noisy Quantum Device
Quantum generative modeling provides an alternative framework for simulating complex processes in high-energy physics. Calorimeter shower simulations, in particular, involve high-dimensional, stochastic data and are essential for particle identification and energy reconstruction at experiments such as those at the LHC. As these simulations increase in complexity—especially in large-scale analyses—classical methods become increasingly demanding, making them a natural candidate for quantum approaches. This work investigates the use of parameterized quantum circuits on Noisy Intermediate-Scale Quantum (NISQ) devices to generate calorimeter-like images, assessing their viability for future applications.
The Quantum Angle Generator (QAG) is introduced as a variational quantum model designed for image generation. An extensive hyperparameter study is conducted, along with a comprehensive comparison against classical generative models, to evaluate relative performance and limitations within this domain.
A key component of the study is the evaluation of robustness to hardware noise. The QAG’s ability to adapt to realistic noise conditions is tested through both simulation and deployment on actual quantum devices. Results show that models trained directly on hardware can internalize device-specific noise characteristics, maintaining stable performance even under substantial noise and calibration drift. These findings support the feasibility of near-term quantum generative models for practical use in high-energy physics simulations.
Speaker: Saverio Monaco -
167
Re-discovery of $Z_c(3900)$ at BESIII Based on Quantum Machine Learning
The $Z_c(3900)$ was first discovered by the Beijing Spectrometer (BESIII) detector in 2013. As one of the most attractive discoveries of the BESIII experiment, the $Z_c(3900)$ has inspired extensive theoretical and experimental research on its properties. In recent years, the rapid growth of massive experimental data at high energy physics (HEP) experiments has driven many novel data-intensive analysis techniques, with Quantum Machine Learning (QML) being one of them. However, for data analysis at HEP experiments, the practical viability of QML still remains a topic of debate, requiring more examples of real data analysis with quantum hardware for further verification.
Based on this idea, this research focuses on the application of QML in the re-discovery of the $Z_c(3900)$ using the same data sample at $\sqrt{s} = 4.26 \ \mathrm{GeV}$. We developed a quantum support vector machine to distinguish the $Z_c(3900)$ signals from backgrounds, with classical cut-based and ML-based analysis strategies as references. To evaluate the impact of a realistic hardware environment, the analysis is also conducted on the Origin Quantum system based on superconducting qubit technology. We carefully studied the impacts of different input features, encoding circuit structures, and hardware noise for a better understanding of the application of QML to realistic data analysis at HEP experiments.
Speaker: Siyang Wu (Shandong University) -
168
Redefining the target for full detector reconstruction algorithms
One of the main aspects of object reconstruction is the definition of the reconstruction targets. In this talk we present recent developments on the topic, focusing on how we can embed detector constraints, mainly calorimeter granularity, in our truth information and how this can impact the performance of the reconstruction, in particular for Machine Learning based approaches. We will first discuss how we use a merging algorithm to check for overlapping showers, defining a merged ground truth for the calorimeters. We also present results from using this redefined truth information during model training, testing different configurations of the merging algorithm in order to assess their effects on physics performance.
Speakers: Alessandro Brusamolino (KIT - Karlsruhe Institute of Technology (DE)), Katharina Schäuble -
169
Research on Benchmark Testing Method Based on JUNO Offline Software
To standardize the evaluation of computational capabilities across various hardware architectures in data centers, we developed a CPU performance benchmarking tool within the HEP-Score framework. The tool uses the JUNO offline software as a realistic workload and produces standardized outputs aligned with HEP-Score requirements. Our tests demonstrate strong linear performance characteristics under full CPU utilization, along with high reliability and reproducibility.
Furthermore, this study presents a comparative analysis of different CPU architectures, revealing performance characteristics and workload-specific bottlenecks in the context of HEP applications. Power consumption metrics are also incorporated to evaluate performance-per-watt efficiency, offering a valuable perspective on energy-aware computing. Based on this combined performance and power efficiency evaluation, we provide practical hardware procurement recommendations tailored to the needs of large-scale high-energy physics computing environments. This benchmarking method delivers a robust reference for evaluating, selecting, and optimizing computing resources under both performance and sustainability constraints.
Speaker: Xiaofei Yan (Institute of High Energy Physics(IHEP)) -
170
Research on key techniques for performance optimization of astronomical satellite data processing
Astronomical satellites serve as critical infrastructure in the field of astrophysics, and data processing is one of the most essential processes for conducting scientific research on cosmic evolution, celestial activities, and dark matter. Recent advancements in satellite sensor resolution and sensitivity have led to petabyte (PB)-scale data volumes, characterized by unprecedented scale and complexity, posing significant challenges to data processing. Traditional data processing methods face issues including intricate interdependencies among multi-level data products (e.g., Level 0 to Level 2), limited memory resources, and high memory occupancy rates, which collectively affect data processing efficiency. To address these issues, this study proposes a performance optimization framework for astronomical data processing. Firstly, an adaptive data chunking model is established, which partitions data dynamically based on real-time memory availability and computational load. Secondly, a multi-level memory management method is presented, which optimizes memory utilization by caching frequently accessed data in memory and building a priority-based queuing mechanism. Finally, a parallel data processing interface is introduced, transforming the algorithms from single-threaded serial execution to parallel processing. Experiments verify the availability and practicality of the proposed framework and show that data processing efficiency is improved by 20%, effectively addressing the deficiencies of traditional methods. The research outcomes will be applied to the data processing tasks of the enhanced X-ray Timing and Polarimetry (eXTP) satellite, while also providing guidance for the data processing workflows of other astronomical satellites.
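A sketch of the adaptive chunking idea, where the next batch of files is sized from the currently available memory; the numbers and the psutil-based heuristic are illustrative, not the eXTP implementation:

```python
# Sketch: yield batches of input files whose size is chosen from the memory
# that is currently free, so the pipeline adapts to the load on the node.
import psutil

def adaptive_chunks(files, mem_per_file_gb=2.0, safety_fraction=0.5):
    """Yield lists of files sized to fit in the currently free memory."""
    remaining = list(files)
    while remaining:
        free_gb = psutil.virtual_memory().available / 1024**3
        n = max(1, int(free_gb * safety_fraction / mem_per_file_gb))
        yield remaining[:n]
        remaining = remaining[n:]

level0_files = [f"obs_{i:04d}.fits" for i in range(100)]
for chunk in adaptive_chunks(level0_files):
    print(f"processing {len(chunk)} files in parallel")
    # process_chunk(chunk)   # placeholder for the parallel processing interface
```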
Speaker: Shuang Wang (IHEP) -
171
Running Experience with $\texttt{Optuna}$ for the Extraction of a HEP Signal by $\texttt{XGBoost}$
Hyperparameter optimization plays a crucial role in achieving high performance and robustness for machine learning models, such as those used in complex classification tasks in High Energy Physics (HEP).
In this study, we investigate the usage of $\texttt{Optuna}$, a rather new, modern and scalable optimization tool, in the framework of a realistic signal-versus-background classification scenario carried out by applying $\texttt{XGBoost}$ to CMS Open Data. The chosen classification task consists of extracting the signal associated with the decay mode $\mathrm{B}_s \rightarrow \mathrm{J}/\psi(\mu^+\mu^-)~\phi(K^+K^-)$ by means of a gradient boosted tree ($\texttt{XGBoost}$) trained on a Monte Carlo simulated signal sample and a background sample taken from the data as invariant mass sidebands in the $\phi(\to K^+K^-)$ spectrum. The optimization process of $\texttt{XGBoost}$ is guided by $\texttt{Optuna}$ with the aim of maximizing the area under the ROC curve (AUC) while applying an overfitting control mechanism, whereas the Punzi Figure of Merit is used for a performant extraction of the signal with $\texttt{XGBoost}$.
This work demonstrates how $\texttt{Optuna}$ is a suitable tool that enables efficient and effective exploration of the hyperparameter space in commonly used HEP workflows, while providing valuable diagnostics on the automated model optimization.
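A minimal sketch of the Optuna-driven XGBoost optimisation, maximising the ROC AUC on a held-out sample; a toy dataset replaces the CMS Open Data samples and the search space is simplified:

```python
# Sketch: Optuna study that tunes XGBoost hyperparameters to maximise AUC.
import optuna
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=20_000, n_features=10, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

def objective(trial):
    params = {
        "max_depth": trial.suggest_int("max_depth", 3, 10),
        "learning_rate": trial.suggest_float("learning_rate", 1e-3, 0.3, log=True),
        "n_estimators": trial.suggest_int("n_estimators", 100, 600),
        "subsample": trial.suggest_float("subsample", 0.5, 1.0),
    }
    model = xgb.XGBClassifier(**params, eval_metric="logloss")
    model.fit(X_train, y_train)
    return roc_auc_score(y_val, model.predict_proba(X_val)[:, 1])

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
print("best AUC:", study.best_value, "with", study.best_params)
```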
Speaker: Umit Sozbilir (Universita e INFN, Bari (IT)) -
172
Simulating the ATLAS Distributed Computing Infrastructure to Optimize Workload Allocation Strategies
In large-scale distributed computing systems, workload dispatching and the associated data management are critical factors that determine key metrics such as resource utilization of distributed computing and resilience of scientific workflows. As the Large Hadron Collider (LHC) advances into its high luminosity era, the ATLAS distributed computing infrastructure must improve these metrics to manage exponentially larger data volumes (exceeding ExaBytes) and support the demanding needs of high-energy physics research.
To improve the distributed computing operation, the existing workload allocation strategies can be optimized, or novel strategies can be designed. However, in practice, it is not viable to test new workload allocation strategies on the actual ATLAS distributed computing. To address this, we have developed an agile simulation framework of the ATLAS distributed computing system using the SimGrid toolkit to evaluate and refine workload dispatching strategies for heterogeneous computing infrastructure. Moreover, it is crucial to also address the inherent overhead and potential bottlenecks associated with the management of the large data volumes required by these workloads. Therefore, we extensively analyze the historical remote transfers to understand the root causes of slowdowns, vulnerabilities, and inefficient resource utilization linked to data movement.
To ensure the accuracy and reliability of our framework, we calibrate and validate the ATLAS distributed computing implementation in the simulation framework by testing real workloads from historical ATLAS data. This calibrated simulation framework will serve as the testbed for evaluating custom allocation algorithms and also generate the datasets required to train ML surrogates to enable fast and scalable simulations. In addition, an interactive monitoring interface is being developed to visualize the workload dispatching and the resource utilization. Apart from serving as a platform for testing and executing new strategies that can improve the resilience of the ATLAS distributed computing, our framework is ultimately experiment-agnostic and open source, providing an example case that can enable users to configure large-scale distributed computing grids and implement custom workload allocation algorithms through dynamic plugins.
Speaker: Raees Ahmad Khan (University of Pittsburgh (US)) -
173
Sustainability studies of big data processing in real time for HEP
The LHCb collaboration is currently using a pioneering data-filtering system in the trigger, based on real-time particle reconstruction using Graphics Processing Units (GPUs). This corresponds to processing 5 TB/s of data and has required a huge amount of hardware and software development. Among these aspects, the corresponding power consumption and sustainability are an imperative matter in view of the next high luminosity era of the LHC collider, which will largely increase the output data rate. In the context of the High-Low project at IFIC in Valencia, several studies have been performed to understand how to optimize the energy usage in terms of the computing architectures and the efficiency of the algorithms running on them. In addition, a strategy is designed to evaluate the potential impact of quantum computing as it begins to enter the field.
Speaker: Volodymyr Svintozelskyi (Univ. of Valencia and CSIC (ES)) -
174
The High-throughput Data I/O framework for HEPS
The High Energy Photon Source produces vast amounts of diverse, multi-modal data annually, with I/O bottlenecks increasingly limiting scientific computational efficiency. To overcome this challenge, our approach introduces a threefold solution. First, we develop daisy-io, a unified I/O interface designed for cross-disciplinary applications, which integrates accelerated data retrieval techniques such as parallel processing and prefetching to optimize access speeds across heterogeneous datasets. Second, we construct a data streaming platform that eliminates disk read/write bottlenecks through real-time data handling. This platform incorporates three core components: a stream ingestion module for dynamic data reception, a stream parsing module for on-the-fly structural processing, and a stream buffering module for temporary data staging. Finally, to further enhance data transmission efficiency, we implement a lightweight serialization protocol and domain-specific compression algorithms, minimizing latency and bandwidth demands. Collectively, these innovations not only accelerate data read/write operations but also abstract the complexities arising from disparate data sources and formats, enabling seamless integration into scientific workflows while maintaining adaptability across experimental scenarios.
Speaker: FU Shiyuan fusy -
175
Toward robust Deep Learning
The increasing reliance on machine learning (ML), and particularly deep learning (DL), in scientific and industrial applications requires models that are not only accurate but also reliable under varying conditions. This is especially important for automated machine learning and fault-tolerant systems where there is limited or no human control. In this contribution, we present a novel, task-independent approach for assessing the robustness of machine learning models. Our methodology quantifies model robustness across different training samples and weight initialisations, using statistical measures of test loss variability. In addition, we propose a meta-algorithm for selecting reliable models from a set of candidates, balancing performance and robustness. We apply our approach to deep learning architectures with a small number of convolutional and fully connected layers, allowing us to efficiently explore thousands of configurations. Applying this method, we have identified robust models for two regression problems by investigating the effects of training sample size, weight initialisation, and inductive bias. Our results show that model robustness depends on whether raw or high-level features are used, and we show that incorporating inductive bias can reduce training time and prediction performance without degrading model robustness. The proposed model robustness measurement and selection strategies can be integrated into existing AutoML systems, providing a novel approach to automated and robust model development for high-risk environments such as scientific instrument design and complex data-driven workflows.
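The robustness measure based on test-loss variability can be sketched as follows, with a small MLP on synthetic data standing in for the studied architectures:

```python
# Sketch: quantify robustness as the spread of the test loss over repeated
# trainings with different data splits and weight initialisations.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

X, y = make_regression(n_samples=5_000, n_features=8, noise=5.0, random_state=0)

losses = []
for seed in range(10):                      # vary split and initialisation
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=seed)
    model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=300, random_state=seed)
    model.fit(X_tr, y_tr)
    losses.append(mean_squared_error(y_te, model.predict(X_te)))

losses = np.array(losses)
print(f"test loss: mean = {losses.mean():.2f}, std = {losses.std():.2f}")
print(f"relative spread (a simple robustness proxy): {losses.std() / losses.mean():.3f}")
```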
Speakers: Andrey Shevelev, Fedor Ratnikov -
176
Transformer-based Track Fitting for HL-LHC
As the High-Luminosity LHC (HL-LHC) era approaches, significant improvements in reconstruction software are required to keep pace with the increased data rates and detector complexity. A persistent challenge for high-throughput event reconstruction is the estimation of track parameters, which is traditionally performed using iterative Kalman Filter-based algorithms. While GPU-based track finding is progressing rapidly, the fitting stage remains a bottleneck. The main slowdown comes from data movement between CPU and GPU, which reduces the benefits of acceleration.
This work investigates a deep learning-based alternative using Transformer architectures for the prediction of the track parameters. We evaluate the approach in a realistic setting using the ACTS software framework with the Open Data Detector (ODD) geometry on full simulation, with a Kalman Filter as baseline comparison, observing promising results.
Speaker: Jeremy Couthures (Laboratoire d'Annecy de physique des particules, CNRS / Univ. Savoie Mont Blanc (FR)) -
177
Using ServiceX to prepare training data for an ATLAS Long-Lived Particle Search
ServiceX, a data extraction and delivery service for HEP experiments, is being used in ATLAS to prepare training data for a long-lived particle search. The training data contains low-level features not available in the ATLAS experiment’s PHYSLITE format - making ServiceX’s ability to read complex event data (e.g. ATLAS’s xAOD format) ideally suited to solving this problem. This poster will demonstrate the code, the workflow, meta-data used, and how the code was packaged so the analysis team could use and modify the code and data extraction.
Speaker: Gordon Watts (University of Washington (US)) -
178
Vision transformers for fast and generalizable detector simulation
The speed and fidelity of detector simulations in particle physics pose compelling questions on future LHC analysis and colliders. The sparse high-dimensional data combined with the required precision provide a challenging task for modern generative networks. We present a general framework to train generative networks on any detector geometry with minimal user input. Vision transformers allow us to reliably simulate the energy deposition in the detector phase space starting from the detailed Geant4 detector response. We evaluate the networks using high-level observables, neural network classifiers, and sampling timings over various datasets with different dimensionalities and physics content.
Speaker: Luigi Favaro (Universite Catholique de Louvain (UCL) (BE)) -
179
Zero-overhead ML training with ROOT in an ATLAS Open Data analysis
The ROOT software framework is widely used in HEP for storage, processing, analysis and visualization of large datasets. With the large increase in the usage of ML in experiment workflows, especially in the final steps of the analysis pipeline, the question of how to expose ROOT data ergonomically to ML models becomes ever more pressing. In this contribution we discuss the experimental component of ROOT that exposes ROOT datasets in batches ready for the training phase. A new shuffling strategy for creating the batches that prevents biased training is discussed, using real-life use cases based on ATLAS Open Data as examples.
An end-to-end ML physics analysis is carried out to show how a model can be trained with common ML tools directly from ROOT datasets, avoiding intermediate data conversions, streamlining workflows, and covering the case where the training data does not fit in memory. Datasets from ATLAS Open Data are used as input to analyses searching for the Higgs boson or for new BSM particles such as supersymmetric particles. The datasets are stored in the new on-disk ROOT format called RNTuple.
Speaker: Martin Foll (University of Oslo (NO)) -
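As a rough illustration of the kind of shuffling strategy mentioned above (a conceptual sketch only, not ROOT's batch-generator API; the function and parameter names are invented for the example), batches read from ordered on-disk chunks can be de-biased by shuffling at two levels, over chunks and within a shuffle buffer:

    # Conceptual sketch: two-level shuffling of chunked training data.
    import random

    def shuffled_batches(chunks, batch_size, buffer_size, seed=0):
        """chunks: iterable of lists of events read sequentially from disk."""
        rng = random.Random(seed)
        chunks = list(chunks)
        rng.shuffle(chunks)                 # level 1: shuffle the order of the chunks
        buffer = []
        for chunk in chunks:
            buffer.extend(chunk)
            rng.shuffle(buffer)             # level 2: shuffle entries inside the buffer
            while len(buffer) >= buffer_size:
                yield buffer[:batch_size]
                buffer = buffer[batch_size:]
        while buffer:                       # flush the remainder at the end
            yield buffer[:batch_size]
            buffer = buffer[batch_size:]

    # toy usage: 4 chunks of 100 events each, batches of 32
    chunks = [[(i, j) for j in range(100)] for i in range(4)]
    for batch in shuffled_batches(chunks, batch_size=32, buffer_size=128):
        pass  # feed `batch` to the training step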
180
ML-unfolding without prior dependence
Machine learning methods enable unbinned and full-dimensional unfolding. However, existing approaches, both classifier-based and generative, suffer from prior dependence. We propose a new method for ML-based unfolding that is completely prior independent and infers the unfolded distribution in a fully frequentist manner. Using several benchmark datasets, we demonstrate that the method can infer unfolded distributions to percent-level precision.
Speaker: Theo Heimel (UCLouvain)
-
Track 1: Computing Technology for Physics Research ESA M
ESA M
Conveners: chair: Nicholas Smith, co-chair: Fazhi Qi-
181
Integrating and Validating ARM Resources at CMS
For over two decades, computing resources in the Worldwide LHC Computing Grid (WLCG) have been based exclusively on the x86 architecture. However, in the near future, heterogeneous non-x86 architectures are expected to make up a significant portion of the resources available to LHC experiments, driven in part by their adoption in current and upcoming world-class HPC facilities. In response to this shift, the CMS experiment has begun preparing by building the CMS software stack (CMSSW) for multiple architectures. To enable production-level use of these diverse resources, workload management and job distribution have also been extended to support heterogeneous computing environments.
Among the emerging architectures, ARM is seeing increasing adoption. As power consumption in computing comes under growing global scrutiny — driven by concerns over carbon emissions and escalating energy costs — energy-efficient solutions are becoming more attractive. ARM processors, known for their low power usage and widespread deployment in mobile devices, have not yet been broadly adopted as capacity hardware within the Worldwide LHC Computing Grid.
In this contribution we will describe the efforts made by the CMS experiment to include and test ARM resources, focusing on the challenges of integrating such resources into the central production pool and of validating that their physics performance is reproducible and comparable with that obtained on x86.
Speakers: Adriano Di Florio (CC-IN2P3), CMS Collaboration -
182
Optimizing Heterogeneous Workflow Construction for Enhanced Event Throughput and Efficient Resource Utilization in CMS
Flexible workload specification and management are critical to the success of the CMS experiment, which utilizes approximately half a million cores across a global grid computing infrastructure for data reprocessing and Monte Carlo production. TaskChain and StepChain specifications, responsible for over 95% of central production activities, employ distinct workflow paradigms: TaskChain executes a single physics payload per grid job, whereas StepChain processes multiple payloads within the same job. As computing resources grow increasingly heterogeneous, encompassing diverse CPU architectures and accelerators, an adaptive workflow specification is essential for efficient scheduling and resource utilization. To address this challenge, we propose a hybrid workflow composition model that dynamically groups tasks based on resource constraints and execution dependencies. This flexible workload construction enhances the adaptability and efficiency of CMS workload management, ensuring optimized resource allocation in an evolving computational landscape.
Speakers: Alan Malta Rodrigues (University of Notre Dame (US)), CMS Collaboration -
183
ServiceX: Streamlining Data Delivery and Transformation for HL-LHC Analyses
As the HL-LHC prepares to deliver large volumes of data, the need for an efficient data delivery and transformation service becomes crucial. To address this challenge, a cross-experiment toolset—ServiceX—was developed to link the centrally produced datasets to flexible, user-level analysis workflows. Modern analysis tools such as Coffea benefit from ServiceX as the first step in event selection, efficiently reducing file sizes through a Kubernetes infrastructure. ServiceX provides query-based data transformers with different backends that give remote access to flat ROOT ntuples, Parquet files, and experiment-specific EDM files (e.g. ATLAS xAOD). By facilitating the transformation of heterogeneous data formats into columnar representations, ServiceX can reduce data loads and accelerate analyses in a user-friendly way.
This talk will discuss the ServiceX infrastructure, which enables remote querying and data skimming in distributed systems. Additionally, we will discuss the range of analysis workflows in which ServiceX can be integrated and the recent developments that aim to extend it, notably its integration into standard ATLAS analysis frameworks. Such developments also introduce new use-case features that expand the ServiceX-based toolset available to analyzers, enabling the integration of ServiceX into different steps of an analysis workflow.
Speaker: Artur Cordeiro Oudot Choi (University of Washington (US)) -
184
Development and Early Beamline Deployment of the HEPS Scientific Data processing Framework
The High Energy Photon Source (HEPS), a new fourth-generation high-energy synchrotron radiation facility, is set to become fully operational by the end of 2025. With its significantly enhanced brightness and detector performance, HEPS will generate over 300 PB of experimental data annually across 14 beamlines in phase I, quickly reaching the EB scale. HEPS supports a wide range of experimental techniques, including imaging, diffraction, scattering, and spectroscopy, each with significant differences in data throughput and scale. Meanwhile, the emergence of increasingly complex experimental methods poses unprecedented challenges for data processing.
To address the future EB-scale experimental data processing demands of HEPS, we have developed DAISY (Data Analysis Integrated Software System), a general scientific data processing software framework. DAISY is designed to enhance the integration, standardization, and performance of experimental data processing at HEPS. It provides key capabilities, including high-throughput data I/O, multimodal data parsing, and multi-source data access. It supports elastic and distributed heterogeneous computing to accommodate different scales, throughput levels, and low-latency data processing requirements. It also offers a general workflow orchestration system to flexibly adapt to various experimental data processing modes. Additionally, it provides user software integration interfaces and a development environment to facilitate the standardization and integration of methodological algorithms and software across multiple disciplines.
Based on the DAISY framework, we have developed multiple domain-specific scientific applications, covering imaging, diffraction, scattering and spectroscopy, while continuously expanding to more scientific domains. Furthermore, we have optimized key software components and algorithms to significantly improve data processing efficiency. At present, several DAISY-based scientific applications have been successfully deployed on HEPS beamlines, supporting online data processing for users. The remaining applications are scheduled for full deployment within the year, further strengthening HEPS’s data analysis capabilities.
Speaker: Yu Hu -
185
Exploring FAIR Open Science tools for Einstein Telescope.
In the last decade, the concept of Open Science has gained importance: there is a real effort to make tools and data shareable among different communities, with the goal of making data and software FAIR (Findable, Accessible, Interoperable and Reusable). This goal is shared by several scientific communities, including the Einstein Telescope (ET). ET is the proposed European third-generation ground-based interferometer for the detection of gravitational waves, which will begin data taking in about ten years. Two projects related to ET computing were proposed and funded within the first OSCARS (Open Science Cluster’s Action for Research and Society) Open Call for cascading grants: MADDEN and ETAP. MADDEN (Multi-RI Access and Discovery of Data for Experiment Networking) is focused on data distribution and management using Rucio. It has three main objectives: build a multi-RI Data Lake managed with Rucio; evaluate RucioFS, a tool to provide a POSIX-like view of the Rucio catalogue in a multi-RI environment; and investigate advanced metadata querying capabilities with Rucio. ETAP (Einstein Telescope Analysis Portal) provides a complete environment for data analysis. The main objectives are to adapt and deploy the CERN ESCAPE VRE (Virtual Research Environment) at the University of Geneva and to add multi-RI Rich Metadata Services from the HEP Software Foundation (HSF) and a flexible computing resource monitoring service. The projects are strongly interlinked, as ETAP will use MADDEN for data management. In this talk we will describe the timeline for these projects, how they will provide input to the ET Computing Model, and the current status of the work.
Speaker: Lia Lavezzi (INFN Torino (IT)) -
186
JUNO DCI status: distributed computing infrastructure for a large neutrino experiment
The Jiangmen Underground Neutrino Observatory (JUNO) is an underground 20 kton liquid scintillator detector being constructed in southern China. The JUNO physics program aims to explore neutrino properties, particularly through electron anti-neutrinos emitted from two nuclear power complexes at a baseline of approximately 53 km. Targeting an unprecedented relative energy resolution of 3% at 1 MeV, JUNO will study neutrino oscillation phenomena and determine neutrino mass ordering with a statistical significance of about 3 sigma within six years. Currently, JUNO is in the commissioning phase.
During physics data collection, the expected data rate after the global trigger is approximately 40 GB/s, which will be reduced to ~60 MB/s using online event classification. This corresponds to an estimated dataflow of about 3 PB/year, including auxiliary files.
These challenges are addressed by a large collaboration spanning three continents. A key factor in JUNO's success will be the implementation of a distributed computing infrastructure (DCI) to meet its predicted computing needs.
The development of this computing infrastructure is a joint effort by four data centers already active in the Worldwide LHC Computing Grid (WLCG). Produced data will be stored in these data centers, and data analysis activities will be carried out cooperatively through a coordinated joint effort.
This contribution reports on the design, implementation and deployment of the JUNO DCI, describing its main characteristics and requirements.
Speaker: Giuseppe Andronico (Universita e INFN, Catania (IT))
-
Track 2: Data Analysis - Algorithms and Tools ESA B
ESA B
Conveners: chair: Tilman Plehn, co-chair: Karim El Morabit-
187
Real-Time event reconstruction for Nuclear Physics Experiments using Artificial Intelligence
Charged track reconstruction is a critical task in nuclear physics experiments, enabling the identification and analysis of particles produced in high-energy collisions. Machine learning (ML) has emerged as a powerful tool for this purpose, addressing the challenges posed by complex detector geometries, high event multiplicities, and noisy data. Traditional methods rely on pattern recognition algorithms like the Kalman filter, but ML techniques, such as neural networks, graph neural networks (GNNs), and recurrent neural networks (RNNs), offer improved accuracy and scalability. By learning from simulated and real detector data, ML models can identify and classify tracks, predict trajectories, and handle ambiguities caused by overlapping or missing hits. Moreover, ML-based approaches can process data in near-real-time, enhancing the efficiency of experiments at large-scale facilities like the Large Hadron Collider (LHC) and Jefferson Lab (JLAB). As detector technologies and computational resources evolve, ML-driven charged track reconstruction continues to push the boundaries of precision and discovery in nuclear physics.
In this talk, we highlight advancements in charged track identification leveraging Artificial Intelligence within the CLAS12 detector, achieving a notable enhancement in experimental statistics compared to traditional methods. Additionally, we showcase real-time event reconstruction capabilities, including the inference of charged particle properties such as momentum, direction, and species identification, at speeds matching data acquisition rates. These innovations enable the extraction of physics observables directly from the experiment in real-time.
Speaker: Gagik Gavalian (Jefferson National Lab) -
188
Machine Learning algorithms for the COSI mission background rejection
The Compton Spectrometer and Imager (COSI) is a NASA Small Explorer (SMEX) satellite mission planned to fly in 2027. It has the participation of institutions in the US, Europe and Asia and aims at the construction of a gamma-ray telescope for observations in the 0.2-5 MeV energy range. COSI consists of an array of germanium strip detectors cooled to cryogenic temperatures with millimeter position resolution for gamma-ray interactions, surrounded by an active BGO shield to reduce background contamination. Goals of COSI are: the study of the electron-positron 511 keV annihilation emission in the Galaxy, mapping the emission from galactic element formation, exploring the polarization in the gamma rays and the detection of transient sources in multimessenger campaigns. The COSI reconstruction pipeline is based on a decade-long heritage
and is being further developed in view of the forthcoming launch. In this contribution we present the efforts to improve the background identification performance of the traditional algorithms with a Machine Learning approach. The expected morphology of the events will be discussed and the algorithms will be presented together with the needed data preprocessing. A preliminary comparison with the traditional methods will be performed.
Speaker: Francesco Fenu (Agenzia Spaziale Italiana) -
189
Probing The Invisible Solar System Through Stellar Eclipse Events
Beyond the planet Neptune, only the largest solar system objects can be observed directly. However, there are tens of thousands of smaller objects whose frequency and distribution could provide valuable insights into the formation of our solar system - if we could see them.
Project SOWA (Solar-system Occultation Watch and Analysis) aims to systematically search for such invisible objects and calculate their orbits. Instead of looking for these objects directly, we will examine the raw data from existing telescope archives for stellar occultations. If one of these celestial bodies happens to pass between us and a distant star, it causes a brief "stellar eclipse" - the star appears dimmer than usual for a few seconds to minutes.
According to estimates, archives from telescopes like WISE, GAIA, or TESS, with their trillions of individual observations, could contain hundreds of thousands of such occultations. In the presentation, we report on the algorithms and tools we develop to filter out these occultations from petabytes of data, correlate them, reconstruct the orbits of the celestial bodies, and predict future occultations. These predictions could then be used to observe such events in high resolution, helping us learn more about the invisible celestial bodies at the edge of our solar system.
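A much-simplified sketch of the dip-finding step is shown below; it is illustrative only, with synthetic data and placeholder thresholds, and does not reproduce the actual SOWA algorithms.

    # Flag candidate occultation dips as points falling well below a local running median.
    import numpy as np

    def find_dips(flux, window=51, n_sigma=5.0):
        """Return indices where the flux drops n_sigma below the local median."""
        pad = window // 2
        padded = np.pad(flux, pad, mode="edge")
        # running median via a sliding window (fine for illustration, O(N * window))
        local_med = np.array([np.median(padded[i:i + window]) for i in range(len(flux))])
        resid = flux - local_med
        sigma = 1.4826 * np.median(np.abs(resid - np.median(resid)))  # robust scatter (MAD)
        return np.where(resid < -n_sigma * sigma)[0]

    # toy usage: constant star with noise and a 5-sample occultation dip
    rng = np.random.default_rng(1)
    flux = 1.0 + 0.01 * rng.standard_normal(2000)
    flux[1000:1005] -= 0.2
    print(find_dips(flux))   # indices near 1000-1004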
Speaker: Dr Marcel Völschow (Hamburg University of Applied Sciences) -
190
AI Agents for Ground-Based Gamma Astronomy
The next generation of ground-based gamma-ray astronomy instruments will involve arrays of dozens of telescopes, leading to an increase in operational and analytical complexity. This scale-up poses challenges for both system operations and offline data processing, especially when conventional approaches struggle to scale effectively. To address these challenges, we are developing AI agents built on instruction-finetuned large language models (LLMs). These agents leverage domain-specific documentation and codebases, understand contextual operational requirements, interact with external APIs, and engage with users in natural language. Our prototypes focus on integration with the Cherenkov Telescope Array Observatory pipelines, both for operational workflows and for offline data analysis. In this presentation, we outline our approach, discuss encountered challenges, and highlight future plans.
Speaker: Julian Simon Schliwinski (Humboldt University of Berlin (DE)) -
191
Neural Quasiprobabilistic Likelihood Ratio Estimation with Negatively Weighted Data
In many domains of science the likelihood function is a fundamental ingredient used to statistically infer model parameters from data, since the likelihood ratio (LR) is an optimal test statistic. Neural-network-based LR estimation using probabilistic classification has therefore had a significant impact in these domains, providing a scalable method for determining an intractable LR from simulated datasets via the so-called ratio trick [1,2].
The underlying paradigm of probabilistic machine learning adheres to the standard Kolmogorov axioms of probability theory [3], which requires the probability of an event in a measurable space to be nonnegative; a requirement met by classical systems. In contrast, quantum mechanical systems can be represented by quasiprobabilistic distributions, which allow for events with negative probabilities [4]. In high energy physics this is a significant problem when simulating proton-proton (pp) collisions using quantum field theory, due to the fact that Monte Carlo simulation codes can introduce negatively weighted data [5,6].
When using the aforementioned neural ratio trick with negatively weighted data, two problems present themselves. First, the variance of the mini-batch losses used during neural network parameter updates is systematically increased, thereby hindering the convergence of stochastic gradient descent (SGD) algorithms. The second is that most classification and density (ratio) estimation loss functions constrain the neural LR estimates to be in the range $[0,\infty)$. Therefore, should negative densities prevail anywhere within the measurable space, the neural network would be incapable of expressing such behaviour.
This work will demonstrate two important advancements for LR estimation with negatively weighted data. First, a new loss function for binary classification is introduced to extend the neural based LR trick to be compatible with quasiprobabilistic distributions. Second, signed probability spaces are used to decompose the likelihoods into signed mixture models. This decomposition reduces the overall LR estimation task into four nonnegative LR estimation sub-tasks, each with reduced loss variance during optimization relative to the overall task. Each nonnegative LR is estimated using a calibrated neural discriminative classifier [2], which are then combined via coefficients that are optionally optimised using the new loss function. The technique is demonstrated using di-Higgs production via gluon-gluon fusion in pp collisions at the Large Hadron Collider.
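For readers less familiar with the ratio trick referenced above [1,2], it can be summarised as follows; the signed-mixture notation below is an illustrative reading of the decomposition described in this abstract, not a verbatim reproduction of it. A classifier trained to separate samples drawn from $p(x)$ and $q(x)$ converges, with a proper loss, to $s(x) = p(x)/(p(x)+q(x))$, so that

$$ r(x) \;=\; \frac{p(x)}{q(x)} \;=\; \frac{s(x)}{1-s(x)}. $$

With negatively weighted data, each density can be written as a signed mixture of nonnegative components,

$$ p(x) = f_p^{+}\,p^{+}(x) - f_p^{-}\,p^{-}(x), \qquad q(x) = f_q^{+}\,q^{+}(x) - f_q^{-}\,q^{-}(x), \qquad f^{+}-f^{-}=1, $$

so that the full ratio can be assembled from nonnegative ratios of the component densities (for example $p^{\pm}(x)/q^{\pm}(x)$), each of which is accessible to a standard calibrated classifier.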
References
[1] Masashi Sugiyama, Taiji Suzuki, and Takafumi Kanamori. Density Ratio Estimation in Machine Learning. Cambridge University Press, 2012.
[2] Kyle Cranmer, Juan Pavez, and Gilles Louppe. Approximating likelihood ratios with calibrated discriminative classifiers, 2016.
[3] A.N. Kolmogorov. Grundbegriffe der Wahrscheinlichkeitsrechnung. Number 1. Springer Berlin, Heidelberg, 1933.
[4] Richard Phillips Feynman. Negative probability. 1984.
[5] Stefano Frixione and Bryan R. Webber. Matching NLO QCD computations and parton shower simulations. Journal of High Energy Physics, 2002(06):029, June 2002.
[6] Paolo Nason and Giovanni Ridolfi. A positive-weight next-to-leading-order Monte Carlo for Z pair hadroproduction. Journal of High Energy Physics, 2006(08):077, August 2006.
Speaker: Stephen Jiggins (Deutsches Elektronen-Synchrotron (DE))
-
Track 3: Computations in Theoretical Physics: Techniques and Methods ESA C
ESA C
Conveners: chair: Joshua Davis, co-chair: Tianji Cai-
192
Precise inference of Lund fragmentation functions with the HOMER method
Significant efforts are currently underway to improve the description of hadronization using Machine Learning. While modern generative architectures can undoubtedly emulate observations, it remains a key challenge to integrate these networks within principled fragmentation models in a consistent manner. This talk presents developments in the HOMER method for extracting Lund fragmentation functions from experimental data. We tackle the information gap between latent and observable phase spaces, and quantify uncertainties with Bayesian neural networks.
Speaker: Ayodele Ore -
193
Adaptive Polynomial Chaos As Quantum Born Machines for High-Fidelity Generative Modeling
We present a quantum generative model that extends Quantum Born Machines (QBMs) by incorporating a parametric Polynomial Chaos Expansion (PCE) to encode classical data distributions. Unlike standard QBMs relying on fixed heuristic data-loading strategies, our approach employs a trainable Hermite polynomial basis to amplitude-encode classical data into quantum states. These states are subsequently transformed by a parameterized quantum circuit (PQC), producing quantum measurement outcomes that approximate the target distribution. By using an adaptive polynomial basis and a deeper variational Ansatz, the model maintains the key advantage of QBMs—efficient sampling from quantum-generated distributions—while enhancing expressivity for complex data. We validate this method on electromagnetic shower data from calorimeters, and our results demonstrate its efficacy and potential for broader applications.
Speaker: Jamal Slim (DESY) -
194
Exploring phase space with Flow Matching
We apply for the first time the Flow Matching method to the problem of phase-space sampling for event generation in high-energy collider physics. By training the model to remap the random numbers used to generate the momenta and helicities of the collision matrix elements as implemented in the portable partonic event generator Pepper, we find substantial efficiency improvements in the studied processes. We focus our study on the computationally most relevant highest final-state multiplicities in Drell-Yan and top-antitop pair production used in simulated samples for the Large Hadron Collider, and find that the unweighting efficiencies improve by factors of 80 and 10, respectively, when compared to the standard approach of using a Vegas-based optimisation. For lower multiplicities we find factors up to 100. We also compare Continuous Normalizing Flows trained with Flow Matching against the previously studied Normalizing Flows based on Coupling Layers and find that the former leads to better results, faster training and a better scaling behaviour across the studied multiplicity range.
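For illustration, the core of a flow-matching training step can be written in a few lines of PyTorch. This sketch uses a plain MLP and a uniform base distribution as stand-ins and is not the Pepper-integrated code itself; the dimensionality and network sizes are placeholders.

    # Minimal conditional flow-matching step: v_theta(x_t, t) is regressed onto the
    # straight-line velocity (x1 - x0) between a base point x0 and a target point x1.
    import torch
    import torch.nn as nn

    dim = 8                                    # placeholder dimensionality of the sampling space
    v_theta = nn.Sequential(nn.Linear(dim + 1, 128), nn.SiLU(),
                            nn.Linear(128, 128), nn.SiLU(),
                            nn.Linear(128, dim))
    opt = torch.optim.Adam(v_theta.parameters(), lr=1e-3)

    def fm_step(x1):
        """One optimisation step on a batch of target points x1 ~ p_target."""
        x0 = torch.rand_like(x1)               # base distribution (uniform hypercube here)
        t = torch.rand(x1.shape[0], 1)
        xt = (1 - t) * x0 + t * x1             # linear interpolation path
        target_v = x1 - x0                     # its time derivative
        pred_v = v_theta(torch.cat([xt, t], dim=1))
        loss = ((pred_v - target_v) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()
        return loss.item()

    # toy usage with stand-in "target" points
    for _ in range(10):
        fm_step(torch.rand(256, dim) ** 2)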
Speaker: Timo Janssen (University of Göttingen) -
195
MadEvent7 – A New Modular Phase-Space Generator
MadEvent7 is a new modular phase-space generation library written in C++ and CUDA, running on both GPUs and CPUs. It features a variety of different phase-space mappings, including the classic MadGraph multi-channel phase space and an optimized implementation of normalizing flows for neural importance sampling, as well as their corresponding inverse mappings. The full functionality is available through a Python API and easily interfaces with deep learning libraries like PyTorch. It will be one of the core components of the upcoming MadGraph7 release.
Speaker: Theo Heimel (UCLouvain) -
196
Multi-jet Inclusive Phase-Space Sampling using Continuous Normalizing Flows
Modern approaches to phase-space integration combine well-established Monte Carlo methods with machine learning techniques for importance sampling. Recent progress in generative models in the form of continuous normalizing flows, trained using conditional flow matching, offers the potential to improve the phase-space sampling efficiency significantly.
We present a multi-jet inclusive transformer-based phase-space sampler that leverages insights from lower-dimensional phase-spaces to more efficiently generate points in higher-dimensional phase-spaces.
Speaker: Konrad Helms -
197
Simulation Based Inference for the Higgs-gauge sector
One primary goal of the LHC is the search for physics beyond the Standard Model, which has led to the development of many different methods to look for new physics effects. In this context, we employ Machine Learning methods, in particular Simulation-Based Inference (SBI), to learn otherwise intractable likelihoods and to exploit the available information more fully than traditional histogram-based methods. We focus on a variety of di-boson production channels at the LHC, utilizing the complementarity between channels to put more stringent constraints on the Wilson coefficients of Standard Model Effective Field Theory.
Speaker: Nikita Schmal
-
13:30
Free Time for Sightseeing
-
18:00
Social Dinner Cap San Diego
Cap San Diego
Überseebrücke, 20459 Hamburg
-
-
-
Plenary ESA A
ESA A
Conveners: chair: Ian Fisk, co-chair: Jennifer Ngadiuba (FNAL)-
198
Updates from organizers
Speaker: Gregor Kasieczka (Hamburg University (DE))
-
199
Quantum Computing in Nuclear Physics
Speaker: Paul Stevenson (University of Surrey)
- 200
-
201
Foundation time series models for forecasting epidemics
Speaker: Eugenio Valdano (INSERM)
-
Poster session with coffee break: Group 2 ESA W 'West Wing'
ESA W 'West Wing'
-
Plenary ESA A
ESA A
Conveners: chair: Axel Naumann, co-chair: Maciej Mikolaj Glowacki (CERN)-
202
Progress in ML for anomaly detection
Speaker: David Shih
-
203
Optimizing ML models for hardware-aware deployment
Speaker: Bo-Cheng Lai
-
204
Practical Neural Simulation-Based Inference at the LHC
Neural Simulation-Based Inference (NSBI) is an emerging class of statistical methods that harness the power of modern deep learning to perform inference directly from high-dimensional data. These techniques have already demonstrated significant sensitivity gains in precision measurements across several domains, outperforming traditional approaches that rely on low-dimensional summaries. This talk will focus on the practical application of NSBI in LHC analyses, highlighting recent progress and ongoing efforts in the field. We will explore the challenges of scaling NSBI workflows to complex, high-dimensional parameter spaces, and discuss strategies developed to address these issues. We will also present progress towards the development of a computationally efficient and practical NSBI analysis framework tailored for use in high-energy physics experiments.
Speaker: Jay Ajitbhai Sandesara (University of Wisconsin Madison (US))
-
13:00
Lunch break ESA O 'East Wing'
ESA O 'East Wing'
Today's menu:
Starter:
- Tomato and mozzarella with fresh basil
- Caesar salad with cherry tomatoes, croutons and Parmesan cheese
served with different dressings
Main course:
- Chicken breast fillet in tomato sauce with fresh herbs, vegetables and parsley potatoes
- Indian eggplant dish "Tikka Masala" with fresh herbs, served with Basmati rice (vegan, gluten-free, lactose-free) -
Track 1: Computing Technology for Physics Research ESA M
ESA M
Conveners: chair: Nicholas Smith, co-chair: Fazhi Qi-
205
The Turbo event model evolution for LHCb Run 3
The LHCb experiment, one of the four major experiments at the Large Hadron Collider (LHC), excels in high-precision measurements of particles that are produced relatively frequently (strange, charmed and bottom hadrons). Key to LHCb's potential is its sophisticated trigger system that enables complete event reconstruction, selection, alignment and calibration in real-time. Through the Turbo stream processing model, the experiment substantially reduces data volume, while preserving full physics potential, by only persisting parts of the event data. Initially deployed during Run 2, this approach has evolved to become the standard processing paradigm for all LHCb physics objectives in Run 3, with significant enhancements and use of its flexibility. This presentation will demonstrate the performance and capabilities of this storage model during Run 3 operations, highlighting how its expanded adaptability has enabled LHCb to optimise finite storage resources while maintaining crucial data redundancy safeguards.
Speaker: Laurent Dufour (CERN) -
206
High-Performance Data Format for Scientific Data Storage and Analysis
In this article, we present the High-Performance Output (HiPO) data format developed at Jefferson Laboratory for
storing and analyzing data from Nuclear Physics experiments. The format was designed to efficiently store large
amounts of experimental data, utilizing modern fast compression algorithms. The purpose of this development was
to provide organized data in the output, facilitating access to relevant information within the large data files. The HiPO
data format has features that are suited for storing raw detector data, reconstruction data, and the final physics analysis
data efficiently, eliminating the need to do data conversions through the lifecycle of experimental data. The HiPO data format
is implemented in C++ and Java, and provides bindings to FORTRAN, Python, and Julia, giving users a choice of data
analysis frameworks to use.
In this paper, we will present the general design and functionalities of the HiPO library and compare the performance of the library with
more established data formats used in data analysis in High Energy and Nuclear Physics (such as ROOT and Parquet). In columnar data analysis, HiPO surpasses established data formats in performance and can be effectively applied to data analysis in other scientific fields.
Speaker: Gagik Gavalian (Jefferson National Lab) -
207
Lazy Data Loading in Awkward Array
High-energy physics (HEP) analyses routinely handle massive datasets, often exceeding the available resources. Efficiently interacting with these datasets requires dedicated techniques for data loading and management. Awkward Array is a Python library widely used in HEP to efficiently handle complex, irregularly structured ("ragged") data. It transforms flat arrays of data into nested structures, naturally representing physics objects such as particles and their properties. Typically, a physics analysis uses only a subset of these objects or properties, providing an opportunity to significantly reduce memory consumption by loading data from disk lazily, only when explicitly needed.
In this talk, we introduce and demonstrate the new "Virtual Arrays" feature in Awkward Array, enabling lazy loading of data buffers. Instead of immediately loading entire datasets, Virtual Arrays allow delayed data reading from disk until an explicit computation requests them. This approach greatly reduces memory consumption and enhances computational efficiency, allowing analyses to access significantly larger datasets interactively and responsively.
We will describe the design and usage of Virtual Arrays, illustrating how physicists can seamlessly integrate lazy data loading into existing workflows using Coffea—the Columnar Object Framework For Effective Analysis. Coffea enables efficient analysis of event data with columnar operations and transparently scales computations from personal laptops to large distributed infrastructures without modifying analysis code. Concrete high-energy physics examples, including selective data processing and histogramming, will highlight how Virtual Arrays substantially optimize analyses, significantly reducing the time-to-insight for data-intensive collider experiments.
Speaker: Iason Krommydas (Rice University (US)) -
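As context for readers, the column-pruning pattern that such lazy loading generalises can be illustrated with the current eager uproot API; the file and branch names below are placeholders, and the new Virtual Arrays interface itself is not shown here.

    # Read only the branches the analysis actually touches, instead of the full event.
    import uproot
    import awkward as ak

    tree = uproot.open("events.root")["Events"]
    muons = tree.arrays(["Muon_pt", "Muon_eta", "Muon_phi"], how="zip")

    # Ragged selection: keep muons with pt > 20 GeV and count them per event.
    selected = muons.Muon[muons.Muon.pt > 20]
    n_good = ak.num(selected)
    print(n_good[:10])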
208
A Unified Interface for Different Memory Layouts
Based on previous experience with parallel event data processing, a Structure-of-Arrays (SoA) layout for objects frequently performs better than Array-of-Structures (AoS), especially on GPUs. However, AoS is widespread in existing code, and in C++, changing the data layout from AoS to SoA requires changing the data structure declarations and the access syntax. This work is repetitive, time consuming, and leads to less intuitive code. For example, with AoS, we can have an array of particles and access the third particle’s momentum with the syntax particles[2].x. In contrast, SoA requires the syntax particles.x[2].
This is a common problem: some LHC experiments have independently implemented an automatic AoS-to-SoA converter; CMS uses C++ preprocessor macros, while ATLAS developed a template meta-programming-based solution in ACTS R&D. Both implementations are challenging to maintain due to their code complexity, and they require the programmer to adopt an unfamiliar API.
We present a generic solution for the abstraction of the data layout, enabling algorithm-specific memory layout optimization. Given a user-defined AoS data structure, the solution automatically generates an analogous SoA structure while keeping the access syntax identical to the C++ AoS style. This allows the programmer to change data layouts quickly without affecting the surrounding code because the interface is decoupled from the memory layout. Using modern C++ features (e.g., reflection), we offer an intuitive user interface while facilitating readability and maintainability of the code.
Speakers: Jolly Chen (CERN & University of Twente (NL)), Leonardo Beltrame (Politecnico di Milano (IT)), Dr Oliver Gregor Rietmann (CERN) -
209
RNTuple Attributes: a RNTuple-native metadata system
RNTuple is ROOT's next-generation columnar format, replacing TTree as the primary storage solution for LHC event data.
After 6 years of R&D, the RNTuple format reached the 1.0 milestone in 2024, promising full backward compatibility with any data written from that moment on.
Designing a new on-disk format does not only allow for significant improvements on file sizes and read/write speed, but it also gives the opportunity for introducing entirely new features that were missing in TTree.
One such feature is a built-in "attribute" system that allows metadata to be reliably associated with an RNTuple.
Compared to a bespoke user-level solution, built-in attributes are self-contained, self-describing and mergeable without requiring any external context (e.g. a software framework that knows how to associate and merge those attributes).
Furthermore, the API to interact with attributes can be designed to be intuitively usable by anyone familiar with the regular RNTuple API.
This work presents a prototype implementation of the RNTuple attribute system, alongside concrete examples of its use.
Open questions for further integration with the LHC experiments' frameworks are also discussed.
Speaker: Giacomo Parolini (CERN)
-
Track 2: Data Analysis - Algorithms and Tools ESA B
ESA B
Conveners: chair: Thea Aarrestad, co-chair: David Rousseau-
210
Upgrade of the Belle II First-Level Neural Track Trigger by Three-Dimensional Hough Finding and Deep Neural Networks on FPGAs
In anticipation of higher luminosities at the Belle II experiment, high levels of beam background
from outside of the interaction region are expected. To prevent track trigger rates
from surpassing the limitations of the data acquisition system, an upgrade of the first-level
neural track trigger becomes indispensable. This upgrade contains a novel track finding
algorithm based on three-dimensional Hough transformations of the center wires in the
set of so-called track segments that are crossed by the particle tracks. These track segments
combine eleven close-by wires in hourglass shapes for both the axial and stereo wire
planes of the Central Drift Chamber of Belle II. Using this preselection algorithm, the
track segment information is preprocessed and passed to a deep neural network predicting
the vertex and the azimuthal and polar angles of a three-dimensional particle track.
With this setup, a minimum-bias single track trigger is expected to be considerably more
background resistant than the current implementation with two-dimensional track finding
and single hidden layer networks.
For the upgrade of the neural track trigger, a more powerful 4th generation Belle II
universal trigger FPGA board (“UT4”), compared to the presently used one (“UT3”), is
available. This avoids the time-consuming data transfer between the track finding and the
neural computation. Since both the track finding and neural network parts can now be
executed on the same FPGA board, the gained latency allows for the implementation of
deep neural networks, even with the possibility to include the complete wire input from the
track segments. Implementing in addition a classification output node, both the efficiency
and the background rejection of the neural track trigger can be increased significantly.
The upgraded neural track trigger will be commissioned at the end of 2025 and is planned
to run from the year 2026 onward.
Speaker: Simon Hiesl -
211
Track fitting at the full LHC collision frequency: Design and performance of the GPU-based Kalman Filter at the LHCb experiment
The LHCb experiment at the Large Hadron Collider (LHC) operates a fully software-based trigger system that processes proton-proton collisions at a rate of 30 MHz, reconstructing both charged and neutral particles in real time. The first stage of this trigger system, running on approximately 500 GPU cards, performs track pattern recognition to reconstruct particle trajectories with low latency.
Starting with the 2025 data-taking period, a novel approach has been introduced for precise track parameter estimation: a custom Kalman Filter, highly optimized for GPU execution, is now employed to fit tracks at the full collision rate of 30 MHz. This implementation leverages dedicated parametrizations of material interactions and the magnetic field to meet stringent throughput requirements.
This is the first time such high-precision track fitting has been performed at the full LHC collision frequency in any experiment. The result is a marked improvement in the momentum and mass resolution, increased robustness against detector misalignments, and a reduced rate of fake tracks. Moreover, this development represents a critical step toward future full event reconstruction at the LHC collision rate.
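For reference, the recursion that any such fit implements is the standard textbook Kalman filter; the LHCb-specific ingredients enter through the parametrised propagation, material and noise terms rather than through the equations themselves. With state vector $x_k$ (track parameters at detector layer $k$), covariance $C_k$, propagation matrix $F_k$, process noise $Q_k$ (multiple scattering), and measurement $m_k$ with projection $H_k$ and covariance $V_k$:

$$ x_k^{\mathrm{pred}} = F_k\,x_{k-1}, \qquad C_k^{\mathrm{pred}} = F_k\,C_{k-1}\,F_k^{T} + Q_k, $$
$$ K_k = C_k^{\mathrm{pred}} H_k^{T}\,\big(H_k C_k^{\mathrm{pred}} H_k^{T} + V_k\big)^{-1}, $$
$$ x_k = x_k^{\mathrm{pred}} + K_k\,\big(m_k - H_k\,x_k^{\mathrm{pred}}\big), \qquad C_k = \big(\mathbb{1} - K_k H_k\big)\,C_k^{\mathrm{pred}}. $$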
In this talk, we will outline the requirements for real-time track fitting in LHCb’s first-level trigger, detail the implementation of the Kalman Filter on GPUs, and present a comprehensive performance evaluation using the 2025 data set.
Speaker: Lennart Uecker (Heidelberg University (DE)) -
212
Real-time monitoring of LHCb beam spot properties based on FPGA hit reconstruction
The upgraded LHCb experiment is pioneering the landscape of real-time data-processing techniques using a heterogeneous computing infrastructure, composed of both GPUs and FPGAs, aimed at boosting the performance of the HLT1 reconstruction. Amongst the novelties in the reconstruction infrastructure for Run 3, the introduction of a real-time VELO hit-finding FPGA-based architecture stands out. For the first time at any LHC experiment, the bidimensional clusters of active pixels on the silicon vertex detector are reconstructed before event-building, directly on the detector readout boards, at the full interaction rate of ~30 MHz. In addition to saving HLT1 computing resources and reducing the DAQ bandwidth, the availability of well-reconstructed particle hits at the readout level opens up the possibility of further processing in order to reconstruct even more complex quantities. Specifically, measuring hit rates at several positions on the detector sensors makes it possible to measure and track the geometrical properties of the luminous region in real time. For this purpose, a set of programmable counters has been implemented in firmware. This set of counters uses minimal FPGA resources and can be analysed to provide beam spot position, shape and inclination measurements. This is achieved via linearized computations based on principal component analysis (PCA), which are performed in real time in the LHCb slow-control software. This method differs substantially from the usual techniques relying on track and vertex reconstruction, which are prone to misalignment biases and depend on the HLT running conditions. In this contribution, we describe the technical implementation of such a system for real-time beam-spot measurement, and report the results obtained with real data collected in 2024 and 2025.
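The PCA step can be illustrated with a small numpy sketch on synthetic counters; this is illustrative only, with invented positions and rates, and is not the LHCb firmware or slow-control implementation.

    # Estimate the beam-spot centroid and axis from counter positions weighted by hit rates.
    import numpy as np

    # positions (x, y, z in mm) of hypothetical rate counters and their measured rates
    pos = np.random.default_rng(0).uniform(-5, 5, size=(200, 3))
    rates = np.exp(-0.5 * ((pos[:, 0] - 0.2) ** 2 + (pos[:, 1] + 0.1) ** 2) / 0.04)

    w = rates / rates.sum()
    centroid = w @ pos                                            # rate-weighted beam-spot position
    cov = (pos - centroid).T @ ((pos - centroid) * w[:, None])    # weighted covariance matrix
    eigval, eigvec = np.linalg.eigh(cov)
    beam_axis = eigvec[:, -1]                                     # principal axis ~ beam direction / tilt
    print(centroid, beam_axis)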
Speaker: Giulio Cordova (Universita & INFN Pisa (IT)) -
213
Improving the CMS High Level Trigger tracking at the HL-LHC with novel and evolved heterogeneous algorithms
Charged particle track reconstruction is one of the heaviest computational tasks in the event reconstruction chain at Large Hadron Collider (LHC) experiments. Furthermore, projections for the High Luminosity LHC (HL-LHC) show that the required computing resources for single-threaded CPU algorithms will exceed those that are expected to be available. It follows that experiments at the HL-LHC will need to employ novel and evolved track reconstruction algorithms, within heterogeneous computing systems that include many-core CPUs as well as GPUs, in the attempt to maximize the computational performance while retaining the best possible reconstruction efficiency. In the context of the CMS High Level Trigger (HLT) at the HL-LHC, the mkFit algorithm, already in use for the CMS track reconstruction during the LHC Run 3, will exploit its parallelized and vectorized nature on CPUs to perform pattern recognition using seed tracks produced with algorithms that are designed to be fully parallelizable and hardware agnostic, thus suitable for heterogeneous systems: the Patatrack and the Line Segment Tracking (LST) algorithms. Patatrack is an established algorithm, already used for the CMS pixel track reconstruction at HLT during the Run 3 of the LHC, while LST is a novel algorithm, recently integrated in the CMS software, targeting the reconstruction of tracks in the outer tracker of the HL-LHC CMS detector. The state-of-the-art performance for the CMS HLT track reconstruction at the HL-LHC is presented, obtained using the combination of the mkFit, Patatrack and LST algorithms, which in turn use ML techniques such as deep neural networks and multi-objective particle swarm optimization to suppress duplicate and misreconstructed tracks. Prospects of further improvements are also presented, with a focus on the usage of ML techniques for track reconstruction at CMS.
Speakers: CMS Collaboration, Mario Masciovecchio (Univ. of California San Diego (US)) -
214
Deep Learning for Primary Vertex Identification in the ATLAS Experiment
The exponential time scaling of traditional primary vertex reconstruction algorithms raises significant performance concerns for future high-pileup environments, particularly with the upcoming High Luminosity upgrade to the Large Hadron Collider. In this talk, we introduce PV-Finder, a deep learning-based approach that leverages reconstructed track parameters to directly predict primary vertex positions and track-to-vertex associations. Primary Vertex identification is achieved using a multi-layer perceptron (MLP) that converts track data into one-dimensional probability distributions, known as kernel density estimations (KDEs). These KDEs then serve as inputs to a convolutional neural network (CNN), specifically utilizing UNet and UNet++ architectures, to refine vertex position predictions. More recently, we have also explored the integration of graph neural networks (GNNs) to enhance track-vertex association. Preliminary results demonstrate that PV-Finder offers improved vertex reconstruction efficiency and accuracy, making it a compelling alternative to traditional methods.
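A much-simplified sketch of the KDE-building step is shown below; the binning, track counts and resolutions are toy placeholders, and this is not the PV-Finder code itself.

    # Build a one-dimensional kernel density estimate along the beam line by summing
    # a Gaussian per track, centred at its z position at closest approach.
    import numpy as np

    def build_kde(track_z, track_sigma_z, z_edges):
        centers = 0.5 * (z_edges[:-1] + z_edges[1:])
        kde = np.zeros_like(centers)
        for z0, sz in zip(track_z, track_sigma_z):
            kde += np.exp(-0.5 * ((centers - z0) / sz) ** 2) / (sz * np.sqrt(2 * np.pi))
        return centers, kde

    # toy usage: two vertices at z = -12 mm and +31 mm with 30 and 20 tracks
    rng = np.random.default_rng(2)
    track_z = np.concatenate([rng.normal(-12, 0.1, 30), rng.normal(31, 0.1, 20)])
    track_sigma = rng.uniform(0.05, 0.3, track_z.size)
    centers, kde = build_kde(track_z, track_sigma, np.linspace(-100, 100, 4001))
    print(centers[np.argmax(kde)])   # peak near one of the true vertex positions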
Speakers: Qi Bin Lei (Stanford University (US)), Rocky Bala Garg (Stanford University (US))
-
Track 3: Computations in Theoretical Physics: Techniques and Methods ESA C
ESA C
Conveners: chair: Chiara Signorile, co-chair: Enrico Bothmann-
215
Boosting Monte Carlo performance: GPU offloading of matrix elements and composable deep learning for phase space sampling
We discuss recent developments in performance improvements for Monte Carlo integration and event sampling. (1) Massive parallelization of matrix element evaluations based on a new back end for the matrix element generator O'Mega targeting GPUs. This has already been integrated in a development version of the Monte Carlo event generator Whizard for realistic testing and profiling. (2) A complete reconstruction of adaptive multi-channel phase space sampling out of composable parametrized lenses (as defined in category theory and functional programming) for deep learning. We report on two experimental implementations: one in object-oriented Fortran, which will be integrated into Whizard in the future, and another functional programming implementation in OCaml, which is used for cross checks and conceptual developments. Finally, we also report on progress towards a numerically stable inclusion of NLL ISR effects in simulations for future $e^+e^-$ Higgs factories like FCC-ee or ILC/LCF, the extension of methods for fast integration of automated NLO processes to pp colliders and NLO EW corrections, as well as current work on NLO processes in Whizard for EFTs and BSM models.
Speakers: Pia Bredt, Juergen Reuter (DESY Hamburg, Germany) -
216
Automated Selection for Physical Models of Small-Angle Neutron Scattering
To characterize the structures and properties of samples in the analysis of experimental data of Small-Angle Neutron Scattering (SANS), a physical model must be selected corresponding to each sample for iterative fitting. However, the conventional method of model selection is primarily based on manual experience, which has a high threshold for analysis and low accuracy. Furthermore, the automated selection of physical models based on standard neural networks faces challenges such as the lack of local image features, large intra-class differences, and small inter-class differences. This paper proposes a Bimodal Feature Fusion Convolutional Neural Network (BFF-CNN) model to mitigate these issues. Initially, a physically informed Fourier-Bessel Transform (FBT) is deployed to extract global structural information from scattering images. Then, the original and FBT-transformed images are fed into two subnetworks for feature extraction and fusion, enhancing the overall feature representation capability of the neural network. A Restricted Softmax (R-Softmax) loss function is implemented, adding a penalty term to the original Softmax loss function for limiting the probability of input samples being assigned to incorrect classes. This alleviates the vanishing gradient problem when the Softmax loss approaches zero, thereby improving the convergence speed. Experimental results obtained using a self-built SANS image dataset show that the BFF-CNN significantly improves the prediction accuracy and average recall as compared to models such as the Residual Network (ResNet)-18 and PMG. Using the joint learning strategy of R-Softmax and center loss functions, the prediction accuracy and recall have improved by 5.4 and 10.5 percentage points, respectively, as compared to the case using only the Softmax loss function, demonstrating good classification performance for SANS data.
Speaker: 李亚康 liyk -
217
Repurposing Large Language Models
Foundation models are a very successful approach to linguistic tasks. Naturally, there is the desire to develop foundation models for physics data. Currently, existing networks are much smaller than publicly available Large Language Models (LLMs), the latter having typically billions of parameters. By applying pretrained LLMs in an unconventional way, we introduce large networks for cosmological data with a relatively cheap training cost.
Speaker: Daniel Schiller (Institute for Theoretical Physics Heidelberg) -
218
BitHEP: The Limits of Low-Precision ML in HEP
The increasing complexity of modern neural network architectures demands fast and memory-efficient implementations to mitigate computational bottlenecks. In this work, we evaluate the recently proposed BitNet architecture in HEP applications, assessing its performance in classification, regression, and generative modeling tasks. Specifically, we investigate its suitability for quark-gluon discrimination, SMEFT parameter estimation, and detector simulation, comparing its efficiency and accuracy to state-of-the-art methods. Our results show that while BitNet consistently performs competitively in classification tasks, its performance in regression and generation varies with the size and type of the network, highlighting key limitations and potential areas for improvement.
Speaker: Dr Ramon Winterhalder (Università degli Studi di Milano) -
219
FeynGraph - A Modern High-Performance Feynman Diagram Generator
We present FeynGraph, a modern high-performance Feynman diagram generator designed to integrate seamlessly with modern computational workflows to calculate scattering amplitudes. FeynGraph is designed as a high-performance Rust library with easy-to-use Python bindings, allowing it to be readily used in other tools. With additional features like arbitrary custom diagram selection filters and automatic diagram drawing, FeynGraph strives to be a fully-featured Feynman diagram toolkit at any loop order.
Speaker: Jens Braun (Karlsruhe Institute of Technology (KIT))
-
Poster session with coffee break: Group 2 ESA W 'West Wing'
ESA W 'West Wing'
-
Track 1: Computing Technology for Physics Research ESA M
ESA M
Conveners: chair: Fazhi Qi, co-chair: Nicholas Smith-
220
Autoregressive Models for the Fast Calorimeter Simulation of the ATLAS Calorimeter
For high-energy physics experiments, the generation of Monte Carlo events, and particularly the simulation of the detector response, is a very computationally intensive process. In many cases, the primary bottleneck in detector simulation is the detailed simulation of the electromagnetic and hadronic showers in the calorimeter system.
ATLAS is currently using its state-of-the-art fast simulation tool AtlFast3, which employs a combination of histogram-based parameterizations and Generative Adversarial Networks (GANs) to provide a highly efficient yet accurate simulation of the full detector response.
Motivated by the Fast Calorimeter Simulation Challenge, which concluded with a community paper demonstrating the superiority of modern generative models — such as diffusion models, transformers and (continuous) normalizing flows — over more traditional approaches like GANs and variational autoencoders, the applicability of these next-generation techniques to the ATLAS fast calorimeter simulation was explored.
In this talk, first physics performance results of these novel models are presented. The models are trained on a newly generated input dataset with extended pseudorapidity coverage and optimized granularity that allows the detailed simulation to be reproduced with a reduced number of voxels.
Speaker: Florian Ernst (Heidelberg University (DE)) -
221
Cross-Geometry Fast Electromagnetic Shower Simulation
The accurate simulation of particle showers in collider detectors remains a critical bottleneck for high-energy physics research. Current approaches face fundamental limitations in scalability when modeling the complete shower development process.
Deep generative models offer a promising alternative, potentially reducing simulation costs by orders of magnitude. This capability becomes increasingly vital as upcoming particle physics experiments are expected to produce unprecedented volumes of data.
We present a novel domain adaptation framework employing state-of-the-art deep generative models to generate high-fidelity point-cloud representations of electromagnetic particle showers. Using transfer learning techniques, our approach adapts simulations across diverse electromagnetic calorimeter geometries with exceptional data efficiency, thereby reducing training requirements and eliminating the need for a fixed-grid structure.
The results demonstrate that our method can achieve high accuracy while significantly reducing data and computational demands, offering a scalable solution for next-generation particle physics simulations.
We also investigate the stability of generative models under iterative training, a process in which models are retrained on their own generated data. While model collapse has been observed in large language models and variational autoencoders for natural image generation, its implications for high-energy physics remain unexplored. We study this phenomenon in the context of particle shower simulation, using normalizing flows and diffusion models.
Speaker: Lorenzo Valente (University of Hamburg (DE)) -
222
Celeritas: GPU-Accelerated Detector Simulation
Next-generation High Energy Physics (HEP) experiments face unprecedented computational demands. The High-Luminosity Large Hadron Collider anticipates data processing needs that will exceed available resources, while intensity frontier experiments such as DUNE and LZ are dominated by the simulation of high-multiplicity optical photon events. Heterogeneous architectures, particularly GPU acceleration, offer a path to meet these challenges by reducing both compute time and power consumption. Celeritas is a GPU-optimized particle transport code designed for high-performance computing (HPC) environments. It supports cross-platform execution (CPU, CUDA, HIP) with reproducible results and has demonstrated significant speedups in electromagnetic shower simulations. For efficient GPU-based geometry traversal, Celeritas integrates the VecGeom and ORANGE libraries and is currently transitioning from volume-based to surface-based models. Generic detector geometries can be imported from Geant4 or GDML files and are handled through VecGeom or ORANGE, both of which are optimized for GPU performance. Optical photons from Cherenkov, scintillation, and custom sources are handled in a separate GPU stepping loop. Benchmarks show up to 150x speedup (CPU vs GPU) in photon generation, and up to 350x speedup for generation and tracing to boundaries in simple geometries (without physics interactions). Volumetric interactions, including wavelength shifting, are implemented and surface-level optical properties are in development. We will present Celeritas' performance, capabilities, and utility for detector simulation in HEP experiments, and discuss the status of ongoing integration efforts with ATLAS, CMS, and LZ.
Speaker: Hayden Richard Hollenbeck (University of Virginia (US)) -
223
AdePT – Offloading Electromagnetic Showers in Geant4 Simulations to GPUs
The simulation throughput of LHC experiments is increasingly limited by detector complexity in the high-luminosity phase. As high-performance computing shifts toward heterogeneous architectures such as GPUs, accelerating Geant4 particle transport simulations by offloading parts of the workload to GPUs can improve performance. The AdePT plugin currently offloads electromagnetic showers in Geant4 simulations to GPUs, making an efficient CPU–GPU workflow essential. In this contribution, we present state-of-the-art detector simulations for LHC experiments leveraging GPU acceleration, and report on the integration of AdePT into the software frameworks of the LHC experiments. The remaining challenges and future directions are also discussed.
Speaker: Severin Diederichs (CERN)
-
Track 2: Data Analysis - Algorithms and Tools ESA B
ESA B
Conveners: chair: Thea Aarrestad, co-chair: Tilman Plehn-
224
Generative Unfolding in Many Dimensions
Unfolding detector-level data into meaningful particle-level distributions remains a key challenge in collider physics, especially as the dimensionality of the relevant observables increases. Traditional unfolding techniques often struggle with such high-dimensional problems, motivating the development of machine-learning-based approaches. We introduce a new method for generative unfolding that is designed to handle many variables simultaneously, incorporating state-of-the-art model design choices.
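The following toy sketch conveys the basic idea of generative unfolding on a one-dimensional example, using a conditional Gaussian learned from paired simulation as a stand-in for the conditional generative model; it is not the architecture presented in this contribution, where the Gaussian is replaced by a modern generative network over many observables.
# Toy sketch: sample particle-level values conditioned on detector-level values.
import numpy as np

rng = np.random.default_rng(2)
truth_sim = rng.exponential(scale=1.0, size=100_000)                 # particle-level simulation
reco_sim = truth_sim + rng.normal(0.0, 0.3, size=truth_sim.size)     # detector smearing

# "Train" p(truth | reco) as a linear conditional mean with constant width.
A = np.vstack([reco_sim, np.ones_like(reco_sim)]).T
coef, *_ = np.linalg.lstsq(A, truth_sim, rcond=None)
resid_std = np.std(truth_sim - A @ coef)

# Unfold pseudo-data by sampling from the learned conditional distribution.
reco_data = rng.exponential(1.2, size=50_000) + rng.normal(0.0, 0.3, size=50_000)
unfolded = coef[0] * reco_data + coef[1] + rng.normal(0.0, resid_std, size=reco_data.size)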
Speaker: Antoine Petitjean (ITP, Universität Heidelberg) -
225
Full Generative Unfolding
Two shortcomings of classical unfolding algorithms, namely that they operate on binned observables and are limited to one dimension, can be overcome with generative machine learning. Many studies on generative unfolding reduce the problem to correcting for detector smearing; however, a full unfolding pipeline must also account for background, acceptance, and efficiency effects. To fully integrate generative unfolding into existing analysis pipelines at the LHC, we develop solutions for these crucial but often overlooked aspects.
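One common way to write the full correction chain is sketched below (illustrative notation, not necessarily the exact formulation used in this contribution): $\alpha(r)$ denotes the probability that a reconstructed event has a matched particle-level event, and $\epsilon(t)$ the probability that a particle-level event passes the detector-level selection.
\[
  \hat{p}_{\mathrm{truth}}(t) \;\propto\; \frac{1}{\epsilon(t)} \int \mathrm{d}r \; p(t \mid r)\,\alpha(r)\,\bigl[\,p_{\mathrm{data}}(r) - p_{\mathrm{bkg}}(r)\,\bigr]
\]
The generative model provides $p(t \mid r)$; background subtraction, acceptance, and efficiency enter as the remaining factors.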
Speaker: Sofia Palacios Schweitzer (ITP, University Heidelberg) -
226
Bridging the Gap Between Unfolding and Quantification Learning
Measured distributions are usually distorted by the finite resolution of the detector. Within physics research, the necessary correction of these distortions is known as Unfolding. Machine learning research uses a different term for this very task: Quantification Learning. For the past two decades, this difference in terminology, together with several differences in notation, has prevented computer scientists and physicists from recognising that Unfolding and Quantification Learning indeed cover the same mathematical problem.
In this talk, I will bridge the gap between these two branches of literature and provide an overview of the key results that Quantification Learning has produced over the past two decades, covering statistical consistency, the anatomy of reconstruction errors, improved optimization techniques, more informative data representations, and arbitrary numbers of observable quantities. Each of these results has immediate and compelling implications for the practice of Unfolding, tackling questions like: Which algorithms produce trustworthy results and which ones don't? How can we increase their performance and how should we implement them? How much data do we need, and from which source? Which of the current limits in Unfolding are inherent and which can be lifted? We will discuss these questions from an interdisciplinary perspective, taking into account recent developments from both physics and machine learning research.
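In discretized form, the shared problem both communities address can be written as
\[
  q_j \;=\; \sum_i R_{ji}\, p_i, \qquad R_{ji} = P(\text{detector bin } j \mid \text{truth bin } i),
\]
where the task is to estimate the truth-level distribution $p$ from the measured distribution $q$ given the response $R$; only the terminology (Unfolding versus Quantification Learning) differs.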
Speaker: Dr Mirko Bunse (Lamarr Institute for Machine Learning and Artificial Intelligence, Dortmund, Germany) -
227
Efficient bin by bin profile likelihood minimization for precision measurements
The High-Luminosity LHC era will deliver unprecedented data volumes, enabling measurements on fine-grained multidimensional histograms containing millions of bins with thousands of events each. Achieving ultimate precision requires modeling thousands of systematic uncertainty sources, creating computational challenges for likelihood minimization and parameter extraction. Fast minimization is crucial for efficient analysis development.
We present a novel TensorFlow-based tool, CombineTF2, that leverages optimized parallelization on CPUs and GPUs for this task. Our implementation interfaces with Boost histograms, supporting flexible likelihood configurations with symmetrization options to establish Gaussian approximations. The minimization uses automatic differentiation to compute (quasi) second-order derivatives, yielding robust and efficient results. We further provide an analytic proof of deterministic solutions within linear approximation limits.
Our tool distinctly focuses on measuring physical observables rather than intrinsic parameters, disentangling the likelihood parameterization from the quantities of interest and creating a more intuitive, less error-prone user experience. Comprehensive benchmarking demonstrates excellent scaling with increased threading and reveals significant efficiency gaps when compared to commonly used frameworks in the field. These performance differences highlight the need for continued development of optimized statistical tools for high-energy physics analyses.
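The sketch below is not CombineTF2 itself, but a minimal example of the underlying technique: a binned Poisson likelihood with a signal strength mu and one Gaussian-constrained nuisance parameter, minimized using TensorFlow automatic differentiation; all template numbers are invented for illustration.
# Minimal binned profile-likelihood sketch with TensorFlow autodiff (illustrative only).
import tensorflow as tf

n_obs  = tf.constant([120., 95., 80., 60.])   # observed counts per bin
signal = tf.constant([ 20., 15., 10.,  5.])   # nominal signal template
bkg    = tf.constant([100., 80., 70., 55.])   # nominal background template
delta  = tf.constant([  5.,  4.,  3.,  2.])   # 1-sigma background variation

mu = tf.Variable(1.0)      # signal strength (parameter of interest)
theta = tf.Variable(0.0)   # nuisance parameter

def nll():
    expected = mu * signal + bkg + theta * delta
    poisson = tf.reduce_sum(expected - n_obs * tf.math.log(expected))
    constraint = 0.5 * theta ** 2              # Gaussian penalty for the nuisance
    return poisson + constraint

opt = tf.keras.optimizers.Adam(learning_rate=0.05)
for _ in range(500):
    with tf.GradientTape() as tape:
        loss = nll()
    grads = tape.gradient(loss, [mu, theta])
    opt.apply_gradients(zip(grads, [mu, theta]))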
Speaker: David Walter (Massachusetts Inst. of Technology (US))
-
Track 3: Computations in Theoretical Physics: Techniques and Methods ESA C
ESA C
Conveners: chair: Timo Janßen, co-chair: Chiara Signorile-
228
Monte Carlo challenges for Non Perturbative QED tests
Non-perturbative QED is used in calculations of Schwinger pair creation, in precision QED tests with ultra-intense lasers, and to predict beam backgrounds at the interaction point of colliders. In order to predict these phenomena, custom-built Monte Carlo event generators based on a suitable non-perturbative theory have to be developed. One such suitable theory uses the Furry Interaction Picture, in which a background field is taken into account non-perturbatively at the Lagrangian level. This theory is precise, but the transition probabilities are, in general, complicated. These non-perturbative processes predict unique phenomenology which will be tested in upcoming experiments. This poses a challenge for the Monte Carlo, which struggles to implement the theory computationally. We examine theoretical tricks to simplify the analytic form of these non-perturbative transition rates. We also examine Monte Carlo techniques which can accurately simulate this phenomenology, including higher-order processes, in a custom-built program called IPstrong.
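As a generic illustration (not IPstrong code), the snippet below shows rejection sampling from a complicated, numerically evaluated differential rate, the basic fallback technique an event generator can use when a transition probability has no simple closed form to invert; the rate function here is an invented stand-in.
# Generic rejection-sampling sketch for a numerically complicated rate (illustrative only).
import numpy as np

def rate(x):
    # Stand-in for an expensive, highly structured non-perturbative rate on x in [0, 1].
    return np.exp(-3.0 * x) * (1.0 + 0.5 * np.sin(20.0 * x)) ** 2

rng = np.random.default_rng(3)
grid = np.linspace(0.0, 1.0, 2001)
envelope = 1.05 * rate(grid).max()            # safe constant envelope

samples = []
while len(samples) < 10_000:
    x = rng.uniform(0.0, 1.0)
    if rng.uniform(0.0, envelope) < rate(x):  # accept with probability rate(x)/envelope
        samples.append(x)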
Speaker: Dr Anthony Hartin (LMU) -
229
The $g^6$ pressure of hot Yang-Mills theory: Canonical form of the integrand
The determination of the hot QCD pressure has a long history, and has -- due to its phenomenological relevance in cosmology, astrophysics and heavy-ion collisions -- spawned a number of important theoretical advances in perturbative thermal field theory applicable to equilibrium thermodynamics.
We present major progress towards the determination of the last missing piece for the pressure of a Yang-Mills plasma at high temperatures at order $g^6$ in the strong coupling constant. This order is of key importance due to its role in resolving the long-standing infrared problem of finite-temperature field theory within a dimensionally reduced effective field theory setup.
By systematically applying linear transformations of integration variables, or momentum shifts, we resolve equivalences between different representations of Feynman sum-integrals at the integrand level, transforming them into a canonical form. At order $g^6$, this reduces a sum of $\mathcal{O}(100{,}000)$ distinct sum-integrals produced from all four-loop vacuum diagrams down to merely 21. Furthermore, we succeed in mapping 11 of those onto known lower-loop structures. This leaves only 10 genuine four-loop sum-integrals to be evaluated, thereby bringing the finalization of three decades of theoretical effort within reach.
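A deliberately simple two-loop example of the kind of integrand-level equivalence resolved by momentum shifts (illustrative only; the actual work concerns four-loop sum-integrals) is
\[
  \frac{1}{P^2\,Q^2\,(P-Q)^2} \;\xrightarrow{\;Q \to -Q\;}\; \frac{1}{P^2\,Q^2\,(P+Q)^2},
\]
so the two sum-integrals built from these integrands are identical, and only one canonical representative needs to be kept.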
Speaker: York Schröder -
230
String theory mathematics and matrix data analysis.
Random matrix theory has a long history of applications in the study of eigenvalue distributions arising in diverse real-world ensembles of matrix data. Matrix models also play a central role in theoretical particle physics, providing tractable mathematical models of gauge-string duality and allowing the computation of correlators of invariant observables in physically interesting sectors of the AdS/CFT correspondence. A recent development is the definition of the general 13-parameter permutation-invariant Gaussian matrix models and the computation of expectation values of permutation-invariant polynomials in these models. This was motivated by applications to the statistics of ensembles of generic real matrices arising from natural language processing and computational linguistics. For symmetric matrices with constant diagonals, such as those arising in statistical finance, the general 4-parameter models have been similarly developed. Statistical tasks of symmetry-based data reduction, anomaly detection, and similarity measurement for real-world entities represented by matrices have been successfully performed by using the Gaussian models to test approximate Gaussianity and by quantifying the fine structure of departures from Gaussianity. Some of the matrix data analysed was generated by neural network methods. These applications have the potential to be extended into other areas of matrix data analysis, e.g. in collider particle physics.
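The toy below (not an implementation of the 13-parameter model) simply evaluates a few low-order permutation-invariant matrix polynomials over a random ensemble; in the approach described above, expectation values of such invariants are what the Gaussian models predict and what departures from Gaussianity are measured against.
# Evaluate a few permutation-invariant matrix polynomials over an ensemble (illustrative only).
import numpy as np

rng = np.random.default_rng(4)
ensemble = rng.normal(size=(500, 10, 10))          # 500 random 10x10 matrices

invariants = {
    "sum M_ii":      ensemble.trace(axis1=1, axis2=2),
    "sum M_ij":      ensemble.sum(axis=(1, 2)),
    "sum M_ij^2":    (ensemble ** 2).sum(axis=(1, 2)),
    "sum M_ij M_ji": np.einsum("kij,kji->k", ensemble, ensemble),
}
for name, values in invariants.items():
    err = values.std(ddof=1) / np.sqrt(len(values))
    print(f"<{name}> = {values.mean():8.3f} +/- {err:.3f}")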
Speaker: Sanjaye Ramgoolam (Queen Mary University of London) -
231
Bridging Gravitational Wave and High-Energy Physics Software
Gravitational Wave (GW) Physics has entered a new era of Multi-Messenger Astronomy (MMA), characterized by increasing detections from GW observatories such as the LIGO, Virgo, and KAGRA collaborations. This presentation will introduce the KAGRA experiment, outlining the current workflow from data collection to physics interpretation, and demonstrate the transformative role of machine learning (ML) in GW data analysis.
This talk also bridges advancements in computational techniques between fundamental research in Astrophysics and High-Energy Physics (HEP). Innovative solutions for addressing next-generation data analysis challenges will be presented, with a focus on the use of modern ML tools within the ROOT C++ Framework (CERN) and introducing Anaconda HEP-Forge for rapid software deployments. These tools, available as simple libraries, cover key requirements of typical astrophysical analyses, such as vector manipulation, Kafka and other cloud data transfers, and complex tensor computations, enabling efficient ML training and inference on both CPU and GPU technologies.
Speaker: Prof. Marco Meyer-Conde (Tokyo City University (JP), University Of Illinois (US))
-
Plenary ESA A
ESA A
Conveners: chair: Jennifer Ngadiuba, co-chair: Chiara Signorile-
232
-
233
CaloDiT and CaloDiM: Fast, Accurate and Adaptable Generative Models for Calorimeter Shower Simulation
The need for fast calorimeter shower simulation tools has spurred the development of numerous surrogate approaches based on deep generative models. While these models offer significant reductions in compute time with respect to traditional Monte Carlo methods, their development itself consumes considerable time, manpower, and computing resources.
In order to reduce the time needed to design a model for a new detector geometry, we present two generative models. The first, CaloDiT, is a transformer-based diffusion model, while the second, CaloDiM, is a diffusion model based on mixers. We leverage a foundation model approach, whereby information gained by training a model across multiple detector geometries is used to accelerate its adaptation to a new, unseen geometry. We will demonstrate the robust generalisation capabilities of the model, which can achieve competitive physics performance while requiring substantially less training time and data than training from scratch. We will also describe how the model can be used, via a provided example, directly in the Geant4 simulation toolkit, as well as for DD4hep geometries in the Key4hep framework via the DDFastShowerML library.
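As a generic sketch of the fine-tuning pattern described above (this is not the CaloDiT/CaloDiM code, and the module names and checkpoint path are hypothetical), a pretrained backbone can be frozen while only a small geometry-specific embedding and head are trained on the limited new-geometry dataset.
# Generic foundation-model fine-tuning sketch in PyTorch (illustrative only).
import torch
import torch.nn as nn

class ShowerModel(nn.Module):
    def __init__(self, feat_dim=64, n_cond=3):
        super().__init__()
        self.geometry_embed = nn.Linear(n_cond, feat_dim)       # geometry-specific part
        self.backbone = nn.Sequential(                           # pretrained, shared part
            nn.Linear(feat_dim, 256), nn.GELU(), nn.Linear(256, feat_dim)
        )
        self.head = nn.Linear(feat_dim, feat_dim)                # geometry-specific part

    def forward(self, cond):
        return self.head(self.backbone(self.geometry_embed(cond)))

model = ShowerModel()
# model.load_state_dict(torch.load("pretrained_multi_geometry.pt"))  # hypothetical checkpoint

for p in model.backbone.parameters():                             # freeze the shared backbone
    p.requires_grad = False

optimizer = torch.optim.AdamW(
    [p for p in model.parameters() if p.requires_grad], lr=1e-4
)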
Speaker: Peter McKeown (CERN) -
234
Poster prizes
-
Poster session with coffee break: Group 2 ESA W 'West Wing'
ESA W 'West Wing'
-
Plenary ESA A
ESA A
Convener: chair: David Britton-
235
Track 1 Summary
Speaker: Nick Smith (Fermi National Accelerator Lab. (US))
-
236
-
237
Track 3 Summary
Speaker: Tianji Cai
-
238
Closing remarks
Speaker: David Britton (University of Glasgow (GB))
-