The High Energy cosmic-Radiation Detection (HERD) facility is a space astronomy and particle astrophysics experiment planned to be installed on the China Space Station. HERD is a China-led mission with Italy leading key European contributions. Its primary scientific goals include detecting dark matter in cosmic space, precisely measuring the energy spectrum and composition of cosmic rays, and...
Modern beam telescopes play a crucial role in high-energy physics experiments to precisely track particle interactions. Accurate alignment of detector elements in real-time is essential to maintain the integrity of reconstructed particle trajectories, especially in high-rate environments like the ATLAS experiment at the Large Hadron Collider (LHC). Any misalignment in the detector geometry can...
Awkward Array provides efficient handling of large, irregular data structures in Python, playing a key role in high-energy physics analysis. This work presents ongoing efforts to optimize Awkward Arrays for GPUs using CUDA, aiming to achieve performance parity with or surpass CPU kernel implementations. Key improvements focus on optimized memory management, leveraging CUDA-specific features,...
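To illustrate the usage pattern this abstract refers to, here is a minimal, hypothetical sketch of moving a ragged Awkward Array to the CUDA backend so that subsequent operations dispatch to GPU kernels. It assumes Awkward Array 2.x with CuPy and a CUDA device available; the array contents are made up.

```python
# Hypothetical sketch: move a ragged Awkward Array to the CUDA backend so
# that subsequent operations run as GPU kernels (assumes awkward>=2, CuPy,
# and a CUDA-capable device).
import awkward as ak

# A small ragged array standing in for, e.g., per-event particle pT values.
pt = ak.Array([[10.2, 33.1], [], [5.7, 8.8, 91.0]])

# Transfer the array to the GPU; operations now dispatch to CUDA kernels.
pt_gpu = ak.to_backend(pt, "cuda")

# Reductions such as ak.sum run on the device; bring the small result back
# to the CPU backend for printing.
per_event_sum = ak.sum(pt_gpu, axis=1)
print(ak.to_backend(per_event_sum, "cpu"))
```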
The design of modern high-energy physics detectors is a highly intricate process, aiming to maximize their physics potential while balancing various manufacturing constraints. As detectors become larger and more sophisticated, it becomes increasingly difficult to maintain a comprehensive understanding of the entire system. To address this challenge, we aim to translate the design process into...
Data analysts working with large datasets require absolute certainty that each file is processed exactly once. ServiceX addresses this challenge by using well established transaction processing architectures. This system implements a fully transactional workflow powered by PostgreSQL and RabbitMQ, ensuring data integrity throughout the processing pipeline. This presentation details both the...
Many physics analyses using the CMS detector at the LHC require accurate, high resolution electron and photon energy measurements. The CMS electromagnetic calorimeter (ECAL) is a fundamental component of these analyses. The excellent resolution of ECAL was of central importance to the discovery of the Higgs boson in 2012, and is being used for increasingly precise measurements of Higgs boson...
Workflow Management Systems (WMSs) are essential tools for structuring an arbitrary sequence of tasks in a clear, maintainable, and repeatable way. The popular Python-based WMS luigi helps build complex workflows. It handles task dependency resolution and input/output tracking, as well as providing a simple workflow visualisation and convenient command-line integration.
The extension...
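As background for readers unfamiliar with luigi, a self-contained toy example of the dependency resolution and input/output tracking described above; the task names and file targets are illustrative placeholders, not part of the extension being presented.

```python
# Minimal sketch of a luigi task graph: Summarize depends on Produce, and
# luigi resolves the dependency and tracks the file outputs.
import luigi

class Produce(luigi.Task):
    def output(self):
        return luigi.LocalTarget("numbers.txt")

    def run(self):
        with self.output().open("w") as f:
            f.write("\n".join(str(i) for i in range(10)))

class Summarize(luigi.Task):
    def requires(self):
        return Produce()  # luigi runs Produce first and tracks its output

    def output(self):
        return luigi.LocalTarget("sum.txt")

    def run(self):
        with self.input().open() as f:
            total = sum(int(line) for line in f)
        with self.output().open("w") as f:
            f.write(str(total))

if __name__ == "__main__":
    luigi.build([Summarize()], local_scheduler=True)
```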
The CMS Experiment had to manage and process data volumes approaching the exascale during LHC Run 3. This required a seamless synergy between the workload and data management systems, namely WMCore and Rucio. Following the integration of Rucio into the CMS infrastructure, the workload management system has undergone substantial adaptations to harness new data management capabilities...
Particle physics is a field hungry for high quality simulation, to match the precision with which data is gathered at collider experiments such as the Large Hadron Collider (LHC). The computational demands of full detector simulation often lead to the use of faster but less realistic parameterizations, potentially compromising the sensitivity, generalizability, and robustness of downstream...
QED in the classical Coulomb field of a nucleus serves as a good approximation for obtaining high-precision results in atomic physics. The external field is itself subject to radiative corrections. These corrections can be described perturbatively in terms of QED Feynman diagrams containing scalar-like propagators for the external field together with the usual QED propagators.
A calculation...
Particle identification (PID) plays a crucial role in particle physics experiments. A groundbreaking advancement in PID involves cluster counting (dN/dx), which measures primary ionizations along a particle’s trajectory within a pixelated time projection chamber (TPC), as opposed to conventional dE/dx measurements. A pixelated TPC with a pixel size of 0.5×0.5 mm² has been proposed as the...
Multiple visualization methods have been implemented in the Jiangmen Underground Neutrino Observatory (JUNO) experiment and its satellite experiment JUNO-TAO. These methods include event display software developed based on ROOT and Unity. The former is developed based on the JUNO offline software system and ROOT EVE, which provides an intuitive way for users to observe the detector geometry,...
Detector and event visualization software is essential for modern high-energy physics (HEP) experiments. It plays an important role in the whole life cycle of any HEP experiment, from detector design, simulation, reconstruction, detector construction and installation, to data quality monitoring, physics data analysis, education and outreach. In this talk, we will discuss two frameworks and their...
Simulation-based inference (SBI) is a set of statistical inference approaches in which Machine Learning (ML) algorithms are trained to approximate likelihood ratios. It has been shown to provide an alternative to the likelihood fits commonly performed in HEP analyses. SBI is particularly attractive in analyses performed over many dimensions, in which binning data would be computationally...
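For context, the classifier-based likelihood-ratio trick underlying many SBI methods can be sketched on toy one-dimensional data with scikit-learn: a classifier trained to separate samples drawn under two hypotheses yields p/(1-p) as an estimate of their likelihood ratio (assuming balanced training samples). This is a conceptual sketch, not the method of the contribution above.

```python
# Classifier-based likelihood-ratio estimation on toy 1D data.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
x_h0 = rng.normal(0.0, 1.0, size=(50_000, 1))   # samples under hypothesis H0
x_h1 = rng.normal(0.5, 1.0, size=(50_000, 1))   # samples under hypothesis H1

X = np.vstack([x_h0, x_h1])
y = np.concatenate([np.zeros(len(x_h0)), np.ones(len(x_h1))])

clf = MLPClassifier(hidden_layer_sizes=(32, 32), max_iter=200).fit(X, y)

x_test = np.array([[0.25]])
p = clf.predict_proba(x_test)[0, 1]
ratio_estimate = p / (1.0 - p)   # approximates p(x|H1) / p(x|H0) for balanced classes
print(ratio_estimate)
```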
We present an end-to-end track reconstruction algorithm based on Graph Neural Networks (GNNs) for the main drift chamber of the BESIII experiment at the BEPCII collider. The algorithm directly processes detector hits as input to simultaneously predict the number of track candidates and their kinematic properties in each event. By incorporating physical constraints into the model, the...
Statistical analyses in high energy physics often rely on likelihood functions of binned data. These likelihood functions can then be used for the calculation of test statistics in order to assess the statistical significance of a measurement.
evermore is a Python package for building and evaluating these likelihood functions using JAX – a powerful Python library for high performance...
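Not evermore's actual API, but a generic sketch of the kind of binned Poisson likelihood such a package builds and differentiates with JAX; the templates and counts below are toy numbers.

```python
# Generic binned Poisson negative log-likelihood written directly in JAX.
import jax
import jax.numpy as jnp
from jax.scipy.stats import poisson

observed   = jnp.array([12.0, 25.0, 7.0])    # toy data counts per bin
signal     = jnp.array([3.0, 8.0, 2.0])      # toy signal template
background = jnp.array([10.0, 15.0, 6.0])    # toy background template

def nll(mu):
    """Negative log-likelihood for a signal-strength parameter mu."""
    expected = mu * signal + background
    return -jnp.sum(poisson.logpmf(observed, expected))

# JAX provides gradients of the likelihood for free, which is what makes
# building and minimizing such functions convenient.
grad_nll = jax.grad(nll)
print(nll(1.0), grad_nll(1.0))
```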
The availability of precise and accurate simulation is a limiting factor for interpreting and forecasting data in many fields of science and engineering. Often, one or more distinct simulation software applications are developed, each with a relative advantage in accuracy or speed. The quality of insights extracted from the data stands to increase if the accuracy of faster, more economical...
The development of radiation-hard CMOS Monolithic Active Pixel Sensors (MAPS) is a key advancement for next-generation high-energy physics experiments. These sensors offer improved spatial resolution and integration capabilities but require efficient digital readout and data acquisition (DAQ) systems to operate in high-radiation environments. My research focuses on the FPGA-based digital...
Precise simulation-to-data corrections, encapsulated in scale factors, are crucial for achieving high precision in physics measurements at the CMS experiment. Traditional methods often rely on binned approaches, which limit the exploitation of available information and require a time-consuming fitting process repeated for each bin. This work presents a novel approach utilizing modern...
As generative models start taking an increasingly prominent role in both particle physics and everyday life, quantifying the statistical power and expressiveness of such generative models becomes a more and more pressing question.
In past work, we have seen that a generative model can, in fact, be used to generate samples beyond the initial training data. However, the exact quantification...
The simulation of particle interactions with detectors plays a critical role in understanding detector performance and optimizing physics analysis. Without guidance from first-principles theory, the current state-of-the-art simulation tool, Geant4, exploits phenomenology-inspired parametric models, which must be combined and carefully tuned to experimental observations. The...
Track reconstruction is one of the most important and challenging tasks in the offline data processing of collider experiments. The Super Tau-Charm Facility (STCF) is a next-generation electron-positron collider running in the tau-charm energy region proposed in China, where conventional track reconstruction methods face enormous challenges from the higher background environment introduced by...
The CMS experiment requires massive computational resources to efficiently process the large amounts of detector data. The computational demands are expected to grow as the LHC enters the high-luminosity era. Therefore, GPUs will play a crucial role in accelerating these workloads. The approach currently used in the CMS software (CMSSW) submits individual tasks to the GPU execution queues,...
The CMS experiment requires massive computational resources to efficiently process the large amounts of detector data. The computational demands are expected to grow as the LHC enters the high-luminosity era. Therefore, GPUs will play a crucial role in accelerating these workloads. The approach currently used in the CMS software (CMSSW) relies on explicit memory management techniques, where...
Precision measurements of particle properties, such as the leading hadronic contribution to the muon magnetic moment anomaly, offer critical tests of the Standard Model and probes for new physics. The MUonE experiment aims to achieve this through precise reconstruction of muon-electron elastic scattering events using silicon strip tracking stations and low-Z targets, while accounting for...
Machine learning (ML), a cornerstone of data science and statistical analysis, autonomously constructs hierarchical mathematical models—such as deep neural networks—to extract complex patterns and relationships from data without explicit programming. This capability enables accurate predictions and the extraction of critical insights, making ML a transformative tool across scientific...
Machine Learning (ML) plays an important role in physics analysis in High Energy Physics. To achieve better physics performance, physicists are training larger and larger models on larger datasets. Therefore, many workflow developments focus on distributed training of large ML models, inventing techniques like model pipeline parallelism. However, not all physics analyses need to train large...
The calibration of Belle II data involves two key processes: prompt calibration and reprocessing. Prompt calibration represents the initial step in continuously deriving calibration constants in a timely manner for the data collected over the previous couple of weeks. Currently, this process is managed by b2cal, a Python-based plugin built on Apache Airflow to handle calibration jobs....
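For orientation, a toy Apache Airflow DAG sketching the style of orchestration a plugin such as b2cal builds on; the DAG id, task names and callables are placeholders and not Belle II code.

```python
# Illustrative only: two dependent Python tasks in an Airflow DAG.
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def collect_runs():
    print("collect the runs from the last data-taking period")

def derive_constants():
    print("submit calibration jobs and derive constants")

with DAG(
    dag_id="toy_prompt_calibration",   # placeholder name
    start_date=datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    collect = PythonOperator(task_id="collect_runs", python_callable=collect_runs)
    derive = PythonOperator(task_id="derive_constants", python_callable=derive_constants)
    collect >> derive  # derive_constants runs only after collect_runs succeeds
```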
Cluster counting is a highly promising particle identification technique for drift chambers in particle physics experiments. In this paper, we trained neural network models, including a Long Short-Term Memory (LSTM) model for the peak-finding algorithm and a Convolutional Neural Network (CNN) model for the clusterization algorithm, using various hyperparameters such as loss functions,...
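A schematic PyTorch model in the spirit of the LSTM peak-finding step mentioned above: an LSTM scans a digitized waveform and predicts a per-sample peak probability. Layer sizes, waveform length and labels are arbitrary placeholders, not the tuned hyperparameters of the study.

```python
# Toy LSTM peak finder: per-sample peak probabilities from a waveform.
import torch
import torch.nn as nn

class PeakFinder(nn.Module):
    def __init__(self, hidden_size=32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, waveform):          # waveform: (batch, samples, 1)
        features, _ = self.lstm(waveform)
        return torch.sigmoid(self.head(features)).squeeze(-1)  # (batch, samples)

model = PeakFinder()
fake_waveform = torch.randn(4, 256, 1)    # 4 toy waveforms of 256 samples
peak_probability = model(fake_waveform)
fake_labels = torch.zeros(4, 256)         # placeholder "no peak" labels
loss = nn.functional.binary_cross_entropy(peak_probability, fake_labels)
```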
Weakly supervised anomaly detection has been shown to find new physics with a high significance at low injected signal cross sections. If the right features and a robust classifier architecture are chosen, these methods are sensitive to a very broad class of signal models. However, choosing the right features and classification architecture in a model-agnostic way is a difficult task as the...
Visualizing pre-binned histograms is a HEP domain-specific concern which is not adequately supported within the greater pythonic ecosystem. In recent years, mplhep has emerged as a leading package providing this basic functionality in a user-friendly interface. It also supplies styling templates for the four big LHC experiments - ATLAS, CMS, LHCb, and ALICE. At the same time, the...
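A short sketch of this basic functionality: plotting pre-binned counts with mplhep and applying one of the experiment style templates. The numbers are synthetic.

```python
# Plot already-binned yields with mplhep and an experiment style template.
import matplotlib.pyplot as plt
import numpy as np
import mplhep as hep

hep.style.use("ATLAS")                     # one of the supplied templates

counts = np.array([4, 9, 16, 9, 4])        # pre-binned yields
edges = np.array([0, 10, 20, 30, 40, 50])  # bin edges

hep.histplot(counts, bins=edges, histtype="step", label="toy sample")
plt.legend()
plt.xlabel("observable")
plt.ylabel("events / bin")
plt.show()
```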
In a physics data analysis, "fake" or non-prompt backgrounds refer to events that would not typically satisfy the selection criteria for a given signal region, but are nonetheless accepted due to misreconstructed particles. This can occur, for example, when particles from secondary decays are incorrectly identified as originating from the hard scatter interaction point (resulting in non-prompt...
At many Worldwide LHC Computing Grid (WLCG) sites, HPC resources are already integrated, or will be integrated in the near future, into the experiment specific workflows. The integration can be done either in an opportunistic way to use otherwise unused resources for a limited period of time, or in a permanent way. The WLCG ATLAS Tier-2 cluster in Freiburg has been extended in both ways:...
The High-Level Trigger (HLT) of the Compact Muon Solenoid (CMS) processes event data in real time, applying selection criteria to reduce the data rate from hundreds of kHz to around 5 kHz for raw data offline storage. Efficient lossless compression algorithms, such as LZMA and ZSTD, are essential in minimizing these storage requirements while maintaining easy access for subsequent analysis....
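As a rough illustration of such a codec comparison (not the CMS study itself), the snippet below compresses a synthetic byte payload with the standard-library lzma module and the third-party zstandard package and reports the resulting sizes.

```python
# Compare LZMA and ZSTD compressed sizes on a synthetic payload.
import lzma
import os
import zstandard

# Mixed random/compressible bytes standing in for raw event data.
payload = os.urandom(1 << 16) + b"\x00" * (1 << 16)

lzma_size = len(lzma.compress(payload, preset=6))
zstd_size = len(zstandard.ZstdCompressor(level=6).compress(payload))

print(f"original: {len(payload)} B, LZMA: {lzma_size} B, ZSTD: {zstd_size} B")
```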
The CMS experiment at the LHC has entered a new phase in real-time data analysis with the deployment of two complementary unsupervised anomaly detection algorithms during Run 3 data-taking. Both algorithms aim to enhance the discovery potential for new physics by enabling model-independent event selection directly at the hardware trigger level, operating at the 40 MHz LHC collision rate within...
The Analysis Grand Challenge (AGC) showcases an example of HEP analysis. Its reference implementation uses modern Python packages to realize the main steps, from data access to statistical model building and fitting. The packages used for data handling and processing (coffea, uproot, awkward-array) have recently undergone a series of performance optimizations.
While not being part of the HEP...
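A generic sketch, not the AGC reference implementation, of the columnar access pattern these packages provide: uproot reads branches from a ROOT file into Awkward Arrays, which are then selected in memory. The file and branch names are placeholders.

```python
# Columnar read of a ROOT tree followed by a ragged selection.
import uproot
import awkward as ak

with uproot.open("events.root") as f:          # placeholder file name
    tree = f["Events"]                          # placeholder tree name
    arrays = tree.arrays(["Jet_pt", "Jet_eta"], entry_stop=100_000)

# Ragged per-event selections stay ragged: keep jets with pT > 30 GeV.
good_jets = arrays.Jet_pt[arrays.Jet_pt > 30]
n_good = ak.num(good_jets, axis=1)
print(ak.mean(n_good))
```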
In recent years, Awkward Array, Uproot, and related packages have become the go-to solutions for performing High-Energy Physics (HEP) analyses. Their development is driven by user experience and feedback, with the community actively shaping their evolution. User requests for new features and functionality play a pivotal role in guiding these projects.
For example, the Awkward development...
The simulation of calorimeter showers is computationally expensive, leading to the development of generative models as an alternative. Many of these models face challenges in balancing generation quality and speed. A key issue damaging the simulation quality is the inaccurate modeling of distribution tails. Normalizing flow (NF) models offer a trade-off between accuracy and speed, making them...
The HEP communities have developed an increasing interest in High Performance Computing (HPC) centres, as these hold the potential of providing significant computing resources to the current and future experiments. At the Large Hadron Collider (LHC), the ATLAS and CMS experiments are challenged with a scale-up of several factors in computing for the High Luminosity LHC (HL-LHC) Run 4,...
The HEP communities have developed an increasing interest in High Performance Computing (HPC) centres, as these hold the potential of providing significant computing resources to the current and future experiments. At the Large Hadron Collider (LHC), the ATLAS and CMS experiments are challenged with a scale-up of several factors in computing for the High Luminosity LHC (HL-LHC) Run 4,...
The High Luminosity Large Hadron Collider (HL-LHC) and future big science experiments will generate unprecedented volumes of data, necessitating new approaches to physics analysis infrastructure. We present the SubMIT Physics Analysis Facility, an implementation of the emerging Analysis Facilities (AF) concept at MIT. Our solution combines high-throughput computing capabilities with modern...
The JUNO offline software (JUNOSW) is built upon the SNiPER framework. Its multithreaded extension, MT-SNiPER, enables inter-event parallel processing and has successfully facilitated JUNOSW's parallelization. Over the past year, two rounds of JUNO Data Challenge (DC) have been conducted to validate the complete data processing chain. During these DC tasks, the performance of MT-SNiPER was...
Jet tagging, i.e. determining the origin of high-energy hadronic jets, is a key challenge in particle physics. Jets are ubiquitous observables in collider experiments, made of complex collections of particles that need to be classified. Over the past decade, machine learning-based classifiers have greatly enhanced our jet tagging capabilities, with increasingly sophisticated models driving...
Identifying products of ultrarelativistic collisions delivered by the LHC and RHIC colliders is one of the crucial objectives of experiments such as ALICE and STAR, which are specifically designed for this task. They allow for a precise Particle Identification (PID) over a broad momentum range.
Traditionally, PID methods rely on hand-crafted selections, which compare the recorded signal of...
Searches for long-lived particles (LLPs) have attracted much interest lately due to their high discovery potential in the LHC Run-3. Signatures featuring LLPs with long lifetimes and decaying inside the muon detectors of the CMS experiment at CERN are of particular interest. In this talk, we will describe a novel Level-1 trigger algorithm that significantly improves CMS's signal efficiency for...
With the consequences of global warming becoming abundantly clear, physics research needs to do its part in becoming more sustainable, including its computing aspects. Many measures in this field expend great effort to keep the impact on users minimal. However, even greater savings can be gained when compromising on these expectations.
In any such approach affecting the user experience, the...
The Belle II electromagnetic calorimeter (ECL) is not only used for measuring electromagnetic particles but also for identifying and determining the position of hadrons, particularly neutral hadrons. Recent data-taking periods have presented two challenges for the current clustering method: firstly, the record-breaking luminosities achieved by the SuperKEKB accelerator have increased...
Accurate and efficient predictions of scattering amplitudes are essential for precision studies in high-energy physics, particularly for multi-jet processes at collider experiments. In this work, we introduce a novel neural network architecture designed to predict amplitudes for multi-jet events. The model leverages the Catani–Seymour factorization scheme and uses MadGraph to compute...
Track reconstruction is a cornerstone of modern collider experiments, and the HL-LHC ITk upgrade for ATLAS poses new challenges with its increased silicon hit clusters and strict throughput requirements. Deep learning approaches compare favorably with traditional combinatorial ones — as shown by the GNN4ITk project, a geometric learning tracking pipeline that achieves competitive physics...
We present an MLOps approach for managing the end-to-end lifecycle of machine learning algorithms deployed on FPGAs in the CMS Level-1 Trigger (L1T). The primary goal of the pipeline is to respond to evolving detector conditions by automatically acquiring up-to-date training data, retraining and re-optimising the model, validating performance, synthesising firmware, and deploying validated...
Precision measurements of Higgs, W, and Z bosons at future lepton colliders demand jet energy reconstruction with unprecedented accuracy. The particle flow approach has proven to be an effective method for achieving the required jet energy resolution. We present CyberPFA, a particle flow algorithm specifically optimized for the particle-flow-oriented crystal bar electromagnetic calorimeter...
Direct simulation of multi-parton QCD processes at full-color accuracy is computationally expensive, making it often impractical for large-scale LHC studies. A two-step approach has recently been proposed to address this: events are first generated using a fast leading-color approximation and reweighted to full-color accuracy. We build upon this strategy by introducing a machine-learning...
The high-luminosity environment at Belle II leads to growing beam-induced background, posing major challenges for the Belle II Level-1 (L1) trigger system. To maintain trigger rates within hardware constraints, effective background suppression is essential. Hit filtering algorithms based on Graph Neural Networks (GNNs), including the Interaction Network (IN), have demonstrated successful...
We present lightweight, attention-enhanced Graph Neural Networks (GNNs) tailored for real-time particle reconstruction and identification in LHCb’s next-generation calorimeter. Our architecture builds on node-centric GarNet layers, which eliminate costly edge message passing and are optimized for FPGA deployment, achieving sub-microsecond inference latency. By integrating attention mechanisms...
Transformers are state-of-the-art model architectures and are widely used across application areas of machine learning. However, the performance of such architectures is less well explored in ultra-low-latency domains where deployment on FPGAs or ASICs is required. Such domains include the trigger and data acquisition systems of the LHC experiments.
We present a transformer-based algorithm...
One of the central tools in hadron spectroscopy is amplitude analysis (partial-wave analysis) to interpret the experimental data. Amplitude models are fitted to data with large statistics to extract information about resonances and branching fractions. In amplitude analysis, we require flexibility to implement models with different decay hypotheses, spin formalisms, and resonance...
The CMS Experiment at the CERN Large Hadron Collider (LHC) relies on a Level-1 Trigger system (L1T) to process in real time all potential collisions, happening at a rate of 40 MHz, and select the most promising ones for data acquisition and further processing. The CMS upgrades for the upcoming high-luminosity LHC run will vastly improve the quality of the L1T event reconstruction, providing...
We present a versatile GNN-based end-to-end reconstruction algorithm for highly granular calorimeters that can include track and timing information to aid the reconstruction of particles. The algorithm starts directly from calorimeter hits and possibly reconstructed tracks, and outputs a coordinate transformation in which all shower objects are well separated from each other and assigned...
I will present work on Tropical sampling from Feynman measures:
We introduce an algorithm that samples a set of loop momenta distributed as a given Feynman integrand. The algorithm uses the tropical sampling method and can be applied to evaluate phase-space-type integrals efficiently. We provide an implementation, momtrop, and apply it to a series of relevant integrals from the...
With the upcoming High-Luminosity upgrades at the LHC, data generation rates are expected to increase significantly. This calls for highly efficient architectures for machine learning inference in experimental workflows like event reconstruction, simulation, and data analysis.
At the ML4EP team at CERN, we have developed SOFIE, a tool within the ROOT/TMVA package that translates externally...
The ATLAS experiment has engaged in a modernization of the reconstruction software to cope with the challenging running conditions expected for HL-LHC operations. The use of the experiment-independent ACTS toolkit for track reconstruction is a major component of this effort, involving the complete redesign of several elements of the ATLAS reconstruction software. This contribution will...
Particle physics experiments rely on the (generalised) likelihood ratio test (LRT) for searches and measurements. This is not guaranteed to be optimal for composite hypothesis tests, as the Neyman-Pearson lemma pertains only to simple hypothesis tests. An improvement in the core statistical testing methodology would have widespread ramifications across experiments. We discuss an alternate test...
For several decades, the FORM computer algebra system has been a crucial software package for the large-scale symbolic manipulations required by computations in theoretical high-energy physics. In this talk I will present version 5, which includes an updated built-in diagram generator, greatly improved polynomial arithmetic performance through an interface to FLINT, and enhanced capabilities...
CLUEstering is a versatile clustering library based on CLUE, a density-based weighted clustering algorithm optimized for high-performance computing that supports clustering in an arbitrary number of dimensions. The library offers a user-friendly Python interface and a C++ backend to maximize performance. CLUE’s parallel design is tailored to exploit modern hardware accelerators, enabling it to process large-scale...
We present the program package ggxy, which in its first version can be used to calculate partonic and hadronic cross sections for Higgs boson pair production at NLO QCD. The 2-loop virtual amplitudes are implemented using analytical approximations in different kinematic regions, while all other parts of the calculation are exact. This implementation allows one to freely modify the masses of...
Neural Simulation-Based Inference (NSBI) is a powerful class of machine learning (ML)-based methods for statistical inference that naturally handle high dimensional parameter estimation without the need to bin data into low-dimensional summary histograms. Such methods are promising for a range of measurements at the Large Hadron Collider, where no single observable may be optimal to scan over...
We present a modular, data-driven framework for calibration and performance correction in the ALICE experiment. The method addresses time- and parameter-dependent effects in high-occupancy heavy-ion environments, where evolving detector conditions (e.g., occupancy and cluster overlaps, gain drift, space charge, dynamic distortions, and reconstruction or calibration deficiencies) require...
Significant computing resources are used for parton-level event generation for the Large Hadron Collider (LHC). The resource requirements of this part of the simulation toolchain are expected to grow further in the High-Luminosity (HL-LHC) era. At the same time, the rapid deployment of computing hardware different from the traditional CPU+RAM model in data centers around the world mandates a...
The Julia programming language is considered a strong contender as a future language for high-energy physics (HEP) computing. However, transitioning to the Julia ecosystem will be a long process and interoperability between Julia and C++ is required. So far several successful attempts have been made to wrap HEP C++ packages for use in Julia. It is also important to explore the reverse...
Jiangmen Underground Neutrino Observatory (JUNO) is a next generation 20-kton liquid scintillator detector under construction in southern China. It is designed to determine neutrino mass ordering via the measurement of reactor neutrino oscillation, and also to study other physics topics including atmospheric neutrinos, supernova neutrinos and more. The detector's large mass and high...
Future colliders such as the High Luminosity Large Hadron Collider and the Circular Electron Positron Collider will face an enormous increase in dataset sizes in the coming decades. Quantum and quantum-inspired algorithms may allow us to overcome some of these challenges. An important class of problems is the so-called combinatorial optimization problems, which are non-deterministic polynomial time...
The increasing reliance on deep learning for high-energy physics applications demands efficient FPGA-based implementations. However, deploying complex neural networks on FPGAs is often constrained by limited hardware resources and prolonged synthesis times. Conventional monolithic implementations suffer from scalability bottlenecks, necessitating the adoption of modular and resource-aware...
The application of foundation models in high-energy physics has recently been proposed as a way to use large unlabeled datasets to efficiently train powerful task-specific models. The aim is to train a task-agnostic model on an existing large dataset such that the learned representation can later be utilized for subsequent downstream physics tasks.
The pretrained model can reduce the training...
Physics programs at future colliders cover a wide range of diverse topics and set high demands for precise event reconstruction. Recent analyses have stressed the importance of accurate jet clustering in events with low boost and high jet multiplicity. This contribution presents how machine learning can be applied to jet clustering while taking desired properties such as infrared and collinear...
Machine learning model compression techniques—such as pruning and quantization—are becoming increasingly important to optimize model execution, especially for resource-constrained devices. However, these techniques are developed independently of each other, and while there exist libraries that aim to unify these methods under a single interface, none of them offer integration with hardware...
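As a stand-alone toy (independent of the library being presented), the two techniques named above can be applied to a small PyTorch model as follows: magnitude pruning of a linear layer followed by dynamic int8 quantization; sizes and pruning fractions are arbitrary.

```python
# Magnitude pruning followed by dynamic int8 quantization of a tiny model.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 2))

# Prune 50% of the smallest-magnitude weights of the first layer.
prune.l1_unstructured(model[0], name="weight", amount=0.5)
prune.remove(model[0], "weight")          # make the pruning permanent

# Quantize the linear layers' weights to int8 for CPU inference.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

print(quantized(torch.randn(1, 64)))
```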
OmniJet-alpha, the first cross-task foundation model for particle physics, was first presented at ACAT 2024. In its base configuration, OmniJet-alpha is capable of transfer learning between an unsupervised problem (jet generation) and a classic supervised task (jet tagging). Since its release, we have also shown that it can successfully transfer from CMS Open Data to simulation, and even...
The process of neutrino model building using flavor symmetries requires a physicist to select a group, determine field content, assign representations, construct the Lagrangian, calculate the mass matrices, and perform statistical fits of the resulting free parameters. This process is constrained by the physicist's time and their intuition regarding mathematically complex groups,...
Large backgrounds and detector aging impact the track finding in the Belle II central drift chamber, reducing both purity and efficiency in events. This necessitates the development of new track algorithms to mitigate detector performance degradation. Building on our previous success with an end-to-end multi-track reconstruction algorithm for the Belle II experiment at the SuperKEKB collider...
Fast and precise evaluations of scattering amplitudes, even in the case of precision calculations, are essential for event generation tools at the HL-LHC. We explore the scaling behavior of the achievable precision of neural networks in this regression problem for multiple architectures, including a Lorentz-symmetry-aware multilayer perceptron and the L-GATr architecture. L-GATr is...
With plans for upgrading the detector in order to collect data at a luminosity of up to 1.5×10^34 cm^-2 s^-1 being ironed out (Upgrade II - LHC Run 5), the LHCb Collaboration has sought to implement new data-taking solutions already starting from the upcoming LHC Run 4 (2030-2033).
The first stage of the LHCb High Level Trigger (HLT1), currently implemented on GPUs and aiming at reducing the event...
Detailed event simulation at the LHC is taking a large fraction of the computing budget. CMS developed an end-to-end ML-based simulation that can speed up the production of analysis samples by several orders of magnitude with a limited loss of accuracy. As the CMS experiment is adopting a common analysis-level format, the NANOAOD, for a larger number of analyses, such an event...
Fast machine learning (ML) inference is of great interest in the HEP community, especially in low-latency environments like triggering. Faster inference often unlocks the use of more complex ML models that improve physics performance, while also enhancing productivity and sustainability. Logic gate networks (LGNs) currently achieve some of the fastest inference times for standard image...
Modern ML-based taggers have become the gold standard at the LHC, outperforming classical algorithms. Beyond pure efficiency, we also seek controllable and interpretable algorithms. We explore how we can move beyond black-box performance and toward physically meaningful understanding of modern taggers. Using explainable AI methods, we can connect tagger outputs with well-known physics...
The Matrix Element Method (MEM) offers optimal statistical power for hypothesis testing in particle physics, but its application is hindered by the computationally intensive multi-dimensional integrals required to model detector effects. We present a novel approach that addresses this challenge by employing Transformers and generative machine learning (ML) models. Specifically, we utilize ML...
The rising computational demands of growing data rates and complex machine learning (ML) algorithms in large-scale scientific experiments have driven the adoption of the Services for Optimized Network Inference on Coprocessors (SONIC) framework. SONIC accelerates ML inference by offloading tasks to local or remote coprocessors, optimizing resource utilization. Its portability across diverse...
Stable operation of detectors, beams, and targets is crucial for reducing systematic errors and achieving high-precision measurements in accelerator-based experiments. Historically, this stability was achieved through extensive post-acquisition calibration and systematic data studies, as not all operational parameters could be precisely controlled in real time. However, recent advances in...
Measurements of neutral, oscillating mesons are a gateway to quantum mechanics and give access to the fundamental interactions of elementary particles. For example, precise measurements of $CP$ violation in neutral $B$ mesons can be taken in order to test the Standard Model of particle physics. These measurements require knowledge of the $B$-meson flavour at the time of its production, which...
AI for fundamental physics is now a burgeoning field, with numerous efforts pushing the boundaries of experimental and theoretical physics, as well as machine learning research itself. In this talk, I will introduce a recent innovative application of Natural Language Processing to the state-of-the-art precision calculations in high energy particle physics. Specifically, we use Transformers to...
ATLAS explores modern neural networks for a multi-dimensional calibration of its calorimeter signal defined by clusters of topologically connected cells (topo-clusters). The Bayesian neural network (BNN) approach yields a continuous and smooth calibration function, including uncertainties on the calibrated energy per topo-cluster. In this talk the performance of this BNN-derived calibration is...
We construct Lorentz-equivariant transformer and graph networks using the concept of local canonicalization. While many Lorentz-equivariant architectures use specialized layers, this approach makes it possible to take any existing non-equivariant architecture and make it Lorentz-equivariant using transformations with equivariantly predicted local frames. In addition, data augmentation emerges as a...
Optimizing control systems in particle accelerators presents significant challenges, often requiring extensive manual effort and expert knowledge. Traditional tuning methods are time-consuming and may struggle to navigate the complexity of modern beamline architectures. To address these challenges, we introduce a simulation-based framework that leverages Reinforcement Learning (RL)...
Modern machine learning (ML) algorithms are sensitive to the specification of non-trainable parameters called hyperparameters (e.g., learning rate or weight decay). Without guiding principles, hyperparameter optimization is the computationally expensive process of sweeping over various model sizes and, at each, re-training the model over a grid of hyperparameter settings. However, recent...
Neural networks for LHC physics must be accurate, reliable, and well-controlled. This requires them to provide both precise predictions and reliable quantification of uncertainties - including those arising from the network itself or the training data. Bayesian networks or (repulsive) ensembles provide frameworks that enable learning systematic and statistical uncertainties. We investigate...
The ATLAS detector at CERN and its supporting infrastructure form a highly complex system. It covers numerous interdependent sub-systems and requires collaboration across a team of multi-disciplinary experts. The ATLAS Technical Coordination Expert System provides an interactive description of the technical infrastructure and enhances its understanding. It features tools to assess the impact...
One of the main goals of theoretical nuclear physics is to provide a first-principles description of the atomic nucleus, starting from interactions between nucleons (protons and neutrons). Although exciting progress has been made in recent years thanks to the development of many-body methods and nucleon-nucleon interactions derived from chiral effective field theory, performing accurate...
Deep generative models have become powerful tools for alleviating the computational burden of traditional Monte Carlo generators in producing high-dimensional synthetic data. However, validating these models remains challenging, especially in scientific domains requiring high precision, such as particle physics. Two-sample hypothesis testing offers a principled framework to address this task....
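A minimal illustration of the two-sample testing idea on synthetic one-dimensional data, using a Kolmogorov-Smirnov test per observable; the actual work concerns more powerful multivariate tests, so this is only a conceptual sketch.

```python
# Per-observable two-sample test comparing generated and reference samples.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(1)
reference = rng.normal(0.0, 1.0, size=100_000)    # stand-in for Monte Carlo reference
generated = rng.normal(0.0, 1.02, size=100_000)   # slightly mis-modelled generator output

statistic, p_value = ks_2samp(reference, generated)
print(f"KS statistic = {statistic:.4f}, p-value = {p_value:.3g}")
```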
In high energy physics, most AI/ML efforts focus on improving the scientific process itself — modeling, classification, reconstruction, and simulation. In contrast, we explore how Large Language Models (LLMs) can accelerate access to the physics by assisting with the broader ecosystem of work that surrounds and enables scientific discovery. This includes understanding complex documentation,...
With the emergence of increasingly complex workflows and data rates, accelerators have gained importance within ALICE and the Worldwide LHC Computing Grid (WLCG). Consequently, support for GPUs was added to JAliEn, the ALICE Grid middleware, in a transparent manner to automatically use these resources when available -- without breaking existing mechanisms for payload isolation and...
Unsupervised anomaly detection has become a pivotal technique for model-independent searches for new physics at the LHC. In high-energy physics (HEP), anomaly detection is employed to identify rare, outlier events in collision data that deviate significantly from expected distributions. A promising approach is the application of generative machine learning models, which can efficiently detect...
With the move to HTTP/WebDAV and JSON Web Tokens as a standard protocol for transfers within the WLCG distributed storage network, a large amount of off-the-shelf technologies become viable for meeting the requirements of a Storage Element (SE). In this work, we explore the capabilities and performance of the OpenResty framework, which extends the nginx server with the LuaJIT scripting...
In the field of High Throughput Computing (HTC), the management and processing of large volumes of accounting data across different environments and use cases is a significant challenge. AUDITOR addresses this issue by providing a flexible framework for building accounting pipelines that can be adapted to a wide range of needs.
At its core, AUDITOR serves as a centralised storage solution for...
To address the urgent need for efficient data analysis platforms in the neutron scattering field, this report presents a cloud-based computing infrastructure solution based on the technical architecture of OpenStack and WebRTC. Based on this infrastructure, a deeply integrated system for data management and storage is constructed to provide researchers with a one-stop analysis platform that...
The new fully software-based trigger of the LHCb experiment operates at a 30 MHz data rate, opening a search window into previously unexplored regions of physics phase space. The BuSca (Buffer Scanner) project at LHCb acquires, reconstructs and analyzes data in real time, extending sensitivity to new lifetimes and mass ranges through the recently deployed Downstream tracking algorithm. BuSca...
The Jiangmen Underground Neutrino Observatory (JUNO) is a multipurpose neutrino experiment with the primary goals of determining the neutrino mass ordering and precisely measuring oscillation parameters. The JUNO detector construction was completed at the end of 2024. It generates about 3 petabytes of data annually, requiring extensive offline processing. This processing, which is called...
Efficient data processing using machine learning relies on heterogeneous computing approaches, but optimizing input and output data movements remains a challenge. In GPU-based workflows the data already resides in GPU memory, but machine learning models require the input and output data to be provided in a specific tensor format, often requiring unnecessary copies outside of the GPU device and...
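One concrete way such copies are commonly avoided (a sketch under the assumption of CUDA builds of both CuPy and PyTorch, not necessarily the solution presented here) is to exchange device buffers through DLPack so that the tensor handed to the model aliases memory already resident on the GPU.

```python
# Zero-copy exchange of a GPU buffer between CuPy and PyTorch via DLPack
# (assumes CUDA-enabled CuPy and PyTorch >= 1.10).
import cupy as cp
import torch

hits = cp.random.random((1_000_000, 3)).astype(cp.float32)  # data already on the GPU

# View the same device memory as a PyTorch tensor, with no host round-trip.
hits_tensor = torch.from_dlpack(hits)

assert hits_tensor.is_cuda
hits_tensor *= 2.0          # modifies the shared underlying GPU buffer
```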
Charged-particle track reconstruction is foundational to collider experiments, yet it is also the most computationally expensive part of particle reconstruction. Innovations in track reconstruction with graph neural networks (GNNs) have shown promising capability to cope with the computing challenges posed by the High-Luminosity LHC (HL-LHC). However,...
The TrackML dataset, a benchmark for particle tracking algorithms in High-Energy Physics (HEP), presents challenges in data handling due to its large size and complex structure. In this study, we explore using a heterogeneous graph structure combined with the Hierarchical Data Format version 5 (HDF5) not only to efficiently store and retrieve TrackML data but also to speed up the training and...
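An illustrative sketch of the storage pattern, assuming a simple one-group-per-event layout with placeholder dataset names rather than the schema used in the study.

```python
# Write TrackML-style hit arrays into HDF5, one group per event, then read
# a single event back without touching the rest of the file.
import h5py
import numpy as np

rng = np.random.default_rng(2)

with h5py.File("trackml_subset.h5", "w") as f:
    for event_id in range(3):
        grp = f.create_group(f"event_{event_id}")
        n_hits = int(rng.integers(1_000, 5_000))
        grp.create_dataset("hit_xyz", data=rng.normal(size=(n_hits, 3)),
                           compression="gzip")
        grp.create_dataset("particle_id", data=rng.integers(0, 100, size=n_hits))

with h5py.File("trackml_subset.h5", "r") as f:
    hits = f["event_1/hit_xyz"][:]        # loads only this event's hits
    print(hits.shape)
```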
We present a suite of optimizations to the Particle Transformer (ParT), a state-of-the-art model for jet tagging, targeting the stringent latency and memory constraints of real-time environments such as HL-LHC triggers. To address the quadratic scaling and compute bottlenecks of standard attention, we integrate FlashAttention for exact, fused-kernel attention with reduced memory I/O, and...
Weakly supervised anomaly detection has been shown to be a sensitive and robust tool for Large Hadron Collider (LHC) analysis. The effectiveness of these methods relies heavily on the input features of the classifier, influencing both model coverage and the detection of low signal cross sections. In this talk, we demonstrate that improvements in both areas can be achieved by using energy flow...
The Production and Distributed Analysis (PanDA) workload management system was designed with flexibility to adapt to emerging computing technologies in processing, storage, networking, and distributed computing middleware for the global data distribution. PanDA can coordinate processing over heterogeneous computing resources, including dozens of geographically separated high-performance...
This study evaluates the portability, performance, and adaptability of Liquid Argon TPC (LAr TPC) detector simulations on different HPC platforms, specifically Polaris, Frontier, and Perlmutter. The LAr TPC workflow is computationally complex, mimicking neutrino interactions and the resultant detector responses in a modular liquid argon TPC, integrating various subsystems to...
The Next Generation Trigger project aims to improve the computational efficiency of the CMS reconstruction software (CMSSW) to increase the data processing throughput at the High-Luminosity Large Hadron Collider. As part of this project, this work focuses on improving the common Structure of Arrays (SoA) used in CMSSW for running both on CPUs and GPUs. We introduce a new SoA feature that...
Particle tracking is among the most sophisticated and complex parts of the full event reconstruction chain. Various reconstruction algorithms work in sequence to build trajectories from detector hits. Each of these algorithms requires numerous configuration parameters that need fine-tuning to properly account for the detector/experimental setup, the available CPU budget, and the desired...
Measurements and observations in Particle Physics fundamentally depend on one's ability to quantify their uncertainty and, thereby, their significance. Therefore, as Machine Learning methods become more prevalent in HEP, being able to determine the uncertainties of an ML method becomes more important. A wide range of possible approaches has been proposed; however, there has not been a...
In the end-cap region of the SPD detector complex, particle identification will be provided by a Focusing Aerogel RICH detector (FARICH). FARICH will primarily aid with pion / kaon separation in final open charmonia states (momenta below 5 GeV/c). A free-running (triggerless) data acquisition pipeline to be employed in the SPD results in a high data rate necessitating new approaches to event...
I will present joint work on the behavior of Feynman integrals and perturbative expansions at large loop orders. Using the tropical sampling algorithm for evaluating Feynman integrals, along with a dedicated graph-sampling algorithm to generate representative sets of Feynman diagrams, we computed approximately $10^7$ integrals with up to 17 loops in four-dimensional $\phi^4$ theory. Through...
The performance of Particle Identification (PID) in the LHCb experiment is critical for numerous physics analyses. Classifiers, derived from detector likelihoods under various particle mass hypotheses, are trained to tag particles using calibration samples that involve information from the Ring Imaging Cherenkov (RICH) detectors, calorimeters, and muon identification chambers. However, these...
Simulating physics processes and detector responses is essential in high energy physics but accounts for significant computing costs. Generative machine learning has been demonstrated to be potentially powerful in accelerating simulations, outperforming traditional fast simulation methods. While efforts have focused primarily on calorimeters, initial studies have also been performed on silicon...
Reconstructing the trajectories of charged particles as they traverse several detector layers is a key ingredient for event reconstruction at LHC and virtually any particle physics experiment. The limited bandwidth available, together with the high rate of tracks per second O(10^10) - where each track consists of a variable number of measurements - makes this problem exceptionally challenging...
We present a novel integration of the PanDA workload management system (PanDA WMS) and Harvester with Globus Compute to enable secure, portable, and remote execution of ATLAS workflows on high-performance computing (HPC) systems. In our approach, Harvester, which runs on an external server, is used to orchestrate job submissions via Globus Compute’s multi-user endpoint (MEP). This MEP provides...
Despite compelling evidence for the incompleteness of the Standard Model and an extensive search programme, no hints of new physics have so far been observed at the LHC. Anomaly detection was proposed as a way to enhance the sensitivity of generic searches not targeting any specific signal model. One of the leading methods in this field, CATHODE (Classifying Anomalies THrough Outer Density...
The new fully software-based trigger of the LHCb experiment operates at a 30 MHz data rate and imposes tight constraints on GPU execution time. Tracking reconstruction algorithms in this first-level trigger must efficiently select detector hits, group them, build tracklets, account for the LHCb magnetic field, extrapolate and fit trajectories, and select the best track candidates to filter...
Recent advancements in large language models (LLMs) have paved the way for tools that can enhance the software development process for scientists. In this context, LLMs excel at two tasks -- code documentation in natural language and code generation in a given programming language. The commercially available tools are often restricted by the available context window size, encounter usage...
Data processing and analysis are among the main challenges at HEP experiments. To accelerate physics analysis and drive new physics discoveries, rapidly developing Large Language Models (LLMs) are among the most promising approaches: they have demonstrated astonishing capabilities in text recognition and generation, from which most parts of physics analysis can benefit. In this talk we will...
In view of reducing the disk size of the future analysis formats for High Luminosity LHC in the ATLAS experiment, we have explored the use of lossy compression in the newly developed analysis format known as PHYSLITE. Improvements in disk size are being obtained in migrating from the 'traditional' ROOT TTree format to the newly developed RNTuple format. Lossy compression can bring improvements...
This contribution discusses an anomaly detection search for narrow-width resonances beyond the Standard Model that decay into a pair of jets. Using 139 fb−1 of proton-proton collision data at sqrt(s) = 13 TeV, recorded from 2015 to 2018 with the ATLAS detector at the Large Hadron Collider, we aim to identify new physics without relying on a specific signal model. The analysis employs two...
The ServiceX project aims to provide a data extraction and delivery service for HEP analysis data, accessing files from distributed stores and applying user-configured transformations on them. ServiceX aims to support many existing analysis workflows and tools in as transparent a manner as possible, while enabling new technologies. We will discuss the most recent backends added to ServiceX,...
To compare collider experiments, measured data must be corrected for detector distortions through a process known as unfolding. As measurements become more sophisticated, the need for higher-dimensional unfolding increases, but traditional techniques have limitations. To address this, machine learning-based unfolding methods were recently introduced. In this work, we introduce OmniFold-HI, an...
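For readers unfamiliar with the underlying approach, a schematic single reweighting step in the OmniFold style (not the OmniFold-HI code): a classifier separates detector-level simulation from data, and its output reweights the simulation toward data; in the full method these weights are pushed back to particle level and the procedure is iterated. Toy Gaussian samples stand in for real events.

```python
# One classifier-reweighting step: pull detector-level simulation toward data.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(3)
sim_reco  = rng.normal(0.0, 1.0, size=(50_000, 1))    # detector-level simulation
data_reco = rng.normal(0.2, 1.0, size=(50_000, 1))    # observed detector-level data

X = np.vstack([sim_reco, data_reco])
y = np.concatenate([np.zeros(len(sim_reco)), np.ones(len(data_reco))])

clf = GradientBoostingClassifier().fit(X, y)
p = clf.predict_proba(sim_reco)[:, 1]
weights = p / (1.0 - p)   # per-event weights; iterated and pushed to particle
                          # level in the full unfolding procedure
print(weights.mean())
```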
When porting ‘Hepsptycho’, a ptychography reconstruction program originally based on multiple NVIDIA GPUs and MPI, to the Hygon DCU architecture, we found that the reconstructed object and probe were erroneous, while the results obtained on NVIDIA GPUs are correct. We profiled the ePIE algorithm using NVIDIA Nsight Systems and Hygon's...
Workflow tools provide the means to codify complex multi-step processes, thus enabling reproducibility, preservation, and reinterpretation efforts. Their powerful bookkeeping also directly supports the research process, especially where intermediate results are produced, inspected, and iterated upon frequently.
In Luigi, such a complex workflow graph is composed of individual tasks that...
In this presentation, I will discuss recent advancements in NNLO+PS predictions for top-quark pair production and decay within the MiNNLO framework. MiNNLO provides a robust method for incorporating next-to-next-to-leading order (NNLO) QCD corrections directly into fully differential predictions, offering unprecedented accuracy. This approach enables a consistent treatment of both production...
Simulating showers of particles in highly-granular detectors is a key frontier in the application of machine learning to particle physics. Achieving high accuracy and speed with generative machine learning models can enable them to augment traditional simulations and alleviate a major computing constraint.
Recent developments have shown how diffusion based generative shower simulation...
Artificial Intelligence is set to play a transformative role in designing large and complex detectors, such as the ePIC detector at the upcoming Electron-Ion Collider (EIC). The ePIC setup features a central detector and additional systems positioned in the far forward and far backward regions. Designing this system involves balancing many factors—performance, physics goals, and cost—while...
The $Z_c(3900)$ was first discovered by the Beijing Spectrometer (BESIII) detector in 2013. As one of the most attractive discoveries of the BESIII experiment, $Z_c(3900)$ itself has inspired extensive theoretical and experimental research on its properties. In recent years, the rapid growth of massive experimental data at high energy physics (HEP) experiments has driven a lot of novel...
One of the main points of object reconstruction is the definition of the targets of our reconstruction. In this talk we present recent developments on the topic, focusing on how we can embed detector constraints, mainly calorimeter granularity, in our truth information and how this can impact the performance of the reconstruction, in particular for Machine Learning based approaches. We will...
To standardize the evaluation of computational capabilities across various hardware architectures in data centers, we developed a CPU performance benchmarking tool within the HEP-Score framework. The tool uses the JUNO offline software as a realistic workload and produces standardized outputs aligned with HEP-Score requirements. Our tests demonstrate strong linear performance characteristics...
Hyperparameter optimization plays a crucial role in achieving high performance and robustness for machine learning models, such as those used in complex classification tasks in High Energy Physics (HEP). In this study, we investigate the use of Optuna, a modern and scalable optimization tool, in the framework of a realistic signal-versus-background...
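A minimal Optuna study of the kind investigated, with a toy dataset and an illustrative two-parameter search space standing in for the realistic signal-versus-background task.

```python
# Toy Optuna study: optimize two hyperparameters of a small classifier.
import optuna
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=2_000, n_features=20, random_state=0)

def objective(trial):
    params = {
        "learning_rate": trial.suggest_float("learning_rate", 1e-3, 0.3, log=True),
        "max_depth": trial.suggest_int("max_depth", 2, 6),
    }
    model = GradientBoostingClassifier(**params, random_state=0)
    return cross_val_score(model, X, y, cv=3, scoring="roc_auc").mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=25)
print(study.best_params, study.best_value)
```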
In large-scale distributed computing systems, workload dispatching and the associated data management are critical factors that determine key metrics such as resource utilization of distributed computing and resilience of scientific workflows. As the Large Hadron Collider (LHC) advances into its high luminosity era, the ATLAS distributed computing infrastructure must improve these metrics to...
The LHCb collaboration is currently using a pioneering system of data filtering in the trigger system, based on real-time particle reconstruction using Graphics Processing Units (GPUs). This corresponds to processing 5 TB/s of data and has required a huge amount of hardware and software development. Within this effort, power consumption and sustainability are an imperative matter in...
The increasing reliance on machine learning (ML) and particularly deep learning (DL) in scientific and industrial applications requires models that are not only accurate, but also reliable under varying conditions. This is especially important for automated machine learning and fault-tolerant systems where there is limited or no human control. In this paper, we present a novel,...
As the High-Luminosity LHC (HL-LHC) era approaches, significant improvements in reconstruction software are required to keep pace with the increased data rates and detector complexity. A persistent challenge for high-throughput event reconstruction is the estimation of track parameters, which is traditionally performed using iterative Kalman Filter-based algorithms. While GPU-based track...
ServiceX, a data extraction and delivery service for HEP experiments, is being used in ATLAS to prepare training data for a long-lived particle search. The training data contains low-level features not available in the ATLAS experiment’s PHYSLITE format - making ServiceX’s ability to read complex event data (e.g. ATLAS’s xAOD format) ideally suited to solving this problem. This poster will...
The speed and fidelity of detector simulations in particle physics pose compelling questions on future LHC analysis and colliders. The sparse high-dimensional data combined with the required precision provide a challenging task for modern generative networks. We present a general framework to train generative networks on any detector geometry with minimal user input. Vision transformers allow...
The ROOT software framework is widely used in HEP for the storage, processing, analysis and visualization of large datasets. With the growing use of ML in experiment workflows, especially in the final steps of the analysis pipeline, the question of how to expose ROOT data ergonomically to ML models becomes ever more pressing. In this contribution we discuss the experimental component...
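While the experimental component discussed in this contribution is not detailed here, the existing RDataFrame.AsNumpy interface already provides a baseline for exposing ROOT columns to Python ML tools; the file, tree, and branch names below are hypothetical.

```python
import ROOT
import numpy as np

# Hypothetical NanoAOD-style file, tree and branch names.
df = ROOT.RDataFrame("Events", "analysis_skim.root")
cols = (df.Filter("nJet >= 2")
          .Define("leadJet_pt", "Jet_pt[0]")
          .AsNumpy(["leadJet_pt", "MET_pt"]))

# The resulting dict of NumPy arrays can be handed to any Python ML framework.
X = np.column_stack([cols["leadJet_pt"], cols["MET_pt"]])
```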
Machine learning methods enable unbinned and full-dimensional unfolding. However, existing approaches, both classifier-based and generative, suffer from prior dependence. We propose a new method for ML-based unfolding that is completely prior independent and infers the unfolded distribution in a fully frequentist manner. Using several benchmark datasets, we demonstrate that the method can...
For over two decades, computing resources in the Worldwide LHC Computing Grid (WLCG) have been based exclusively on the x86 architecture. However, in the near future, heterogeneous non-x86 architectures are expected to make up a significant portion of the resources available to LHC experiments, driven also by their adoption in current and upcoming world-class HPC facilities. In response to this...
Significant efforts are currently underway to improve the description of hadronization using Machine Learning. While modern generative architectures can undoubtedly emulate observations, it remains a key challenge to integrate these networks within principled fragmentation models in a consistent manner. This talk presents developments in the HOMER method for extracting Lund fragmentation...
Charged track reconstruction is a critical task in nuclear physics experiments, enabling the identification and analysis of particles produced in high-energy collisions. Machine learning (ML) has emerged as a powerful tool for this purpose, addressing the challenges posed by complex detector geometries, high event multiplicities, and noisy data. Traditional methods rely on pattern recognition...
We present a quantum generative model that extends Quantum Born Machines (QBMs) by incorporating a parametric Polynomial Chaos Expansion (PCE) to encode classical data distributions. Unlike standard QBMs relying on fixed heuristic data-loading strategies, our approach employs a trainable Hermite polynomial basis to amplitude-encode classical data into quantum states. These states are...
The Compton Spectrometer and Imager (COSI) is a NASA Small Explorer (SMEX) satellite mission planned to fly in 2027. With participating institutions in the US, Europe and Asia, it aims to construct a gamma-ray telescope for observations in the 0.2-5 MeV energy range. COSI consists of an array of germanium strip detectors cooled to cryogenic temperatures with millimeter...
Flexible workload specification and management are critical to the success of the CMS experiment, which utilizes approximately half a million cores across a global grid computing infrastructure for data reprocessing and Monte Carlo production. TaskChain and StepChain specifications, responsible for over 95% of central production activities, employ distinct workflow paradigms: TaskChain...
We apply for the first time the Flow Matching method to the problem of phase-space sampling for event generation in high-energy collider physics. By training the model to remap the random numbers used to generate the momenta and helicities of the collision matrix elements as implemented in the portable partonic event generator Pepper, we find substantial efficiency improvements in the studied...
Beyond the planet Neptune, only the largest solar system objects can be observed directly. However, there are tens of thousands of smaller objects whose frequency and distribution could provide valuable insights into the formation of our solar system - if we could see them.
Project SOWA (Solar-system Occultation Watch and Analysis) aims to systematically search for such invisible objects...
As the HL-LHC prepares to deliver large volumes of data, the need for an efficient data delivery and transformation service becomes crucial. To address this challenge, a cross-experiment toolset—ServiceX—was developed to link the centrally produced datasets to flexible, user-level analysis workflows. Modern analysis tools such as Coffea benefit from ServiceX as the first step in event...
The next generation of ground-based gamma-ray astronomy instruments will involve arrays of dozens of telescopes, leading to an increase in operational and analytical complexity. This scale-up poses challenges for both system operations and offline data processing, especially when conventional approaches struggle to scale effectively. To address these challenges, we are developing AI agents...
The High Energy Photon Source (HEPS), a new fourth-generation high-energy synchrotron radiation facility, is set to become fully operational by the end of 2025. With its significantly enhanced brightness and detector performance, HEPS will generate over 300 PB of experimental data annually across 14 beamlines in phase I, quickly reaching the EB scale. HEPS supports a wide range of experimental...
MadEvent7 is a new modular phase-space generation library written in C++ and CUDA, running on both GPUs and CPUs. It features a variety of different phase-space mappings, including the classic MadGraph multi-channel phase space and an optimized implementation of normalizing flows for neural importance sampling, as well as their corresponding inverse mappings. The full functionality is...
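The example below is not MadEvent7 code; it is a minimal Python sketch of the importance-sampling idea underlying such phase-space mappings: a proposal density adapted to a peaked integrand yields flat event weights, whereas a poorly adapted mapping produces a broad weight distribution.

```python
import numpy as np

# Toy integrand peaked near x = 0, loosely mimicking a propagator pole.
def f(x):
    return 1.0 / (x + 0.01)

# Proposal density q(x) proportional to 1/(x + 0.01) on [0, 1], sampled via its inverse CDF.
rng = np.random.default_rng(1)
norm = np.log(1.01 / 0.01)

def sample_q(n):
    u = rng.random(n)
    x = 0.01 * np.exp(u * norm) - 0.01
    q = 1.0 / ((x + 0.01) * norm)
    return x, q

x, q = sample_q(100_000)
w = f(x) / q                     # per-event weights
print(w.mean())                  # Monte Carlo estimate of the integral (ln(101) here)
print(w.std() / w.mean())        # relative weight spread: zero for this perfectly adapted mapping
```

In realistic applications the mapping only approximates the integrand and several channels are combined, so the weights fluctuate; flattening them is precisely the goal of multi-channel and neural importance sampling.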
In the last decade, the concept of Open Science has gained importance: there is a real effort to make tools and data shareable among different communities, with the goal of making data and software FAIR (Findable, Accessible, Interoperable and Reusable). This goal is shared by several scientific communities, including the Einstein Telescope (ET). ET is the third generation ground-based...
Modern approaches to phase-space integration combine well-established Monte Carlo methods with machine learning techniques for importance sampling. Recent progress in generative models in the form of continuous normalizing flows, trained using conditional flow matching, offers the potential to improve the phase-space sampling efficiency significantly.
We present a multi-jet inclusive...
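The contribution's actual training setup is not reproduced here; as a generic illustration, the sketch below trains a toy velocity field with conditional flow matching on a linear interpolation path, with network size and data as placeholder assumptions.

```python
import torch
import torch.nn as nn

# Hypothetical velocity-field network mapping (x_t, t) -> velocity in phase-space coordinates.
class Velocity(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim + 1, 64), nn.SiLU(),
                                 nn.Linear(64, 64), nn.SiLU(),
                                 nn.Linear(64, dim))
    def forward(self, x, t):
        return self.net(torch.cat([x, t], dim=-1))

dim = 8                                 # illustrative phase-space dimensionality
model = Velocity(dim)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(1000):
    x1 = torch.rand(512, dim)           # stand-in for target phase-space points
    x0 = torch.randn(512, dim)          # base (latent) samples
    t = torch.rand(512, 1)
    xt = (1 - t) * x0 + t * x1          # linear interpolation path
    target = x1 - x0                    # conditional velocity along this path
    loss = ((model(xt, t) - target) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```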
In many domains of science the likelihood function is a fundamental ingredient used to statistically infer model parameters from data, since the likelihood ratio (LR) provides an optimal test statistic. Neural LR estimation using probabilistic classification has therefore had a significant impact in these domains, providing a scalable method for determining an intractable LR from simulated...
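A minimal sketch of the likelihood-ratio trick referred to above, on a toy one-dimensional problem: with balanced training samples from the two hypotheses, the classifier output s(x) yields LR(x) = s(x) / (1 - s(x)). The data and classifier below are illustrative assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Toy samples from two hypotheses (not the authors' setup).
rng = np.random.default_rng(0)
x1 = rng.normal(0.5, 1.0, size=(20000, 1))   # "signal" hypothesis
x0 = rng.normal(0.0, 1.0, size=(20000, 1))   # "reference" hypothesis
X = np.vstack([x1, x0])
y = np.concatenate([np.ones(len(x1)), np.zeros(len(x0))])

clf = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=50).fit(X, y)

# For balanced training sets, the optimal classifier output s(x) relates to the
# likelihood ratio via LR(x) = s(x) / (1 - s(x)).
s = clf.predict_proba(np.array([[0.25]]))[:, 1]
lr_estimate = s / (1 - s)
print(lr_estimate)
```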
The Jiangmen Underground Neutrino Observatory (JUNO) is an underground 20 kton liquid scintillator detector being constructed in southern China. The JUNO physics program aims to explore neutrino properties, particularly through electron anti-neutrinos emitted from two nuclear power complexes at a baseline of approximately 53 km. Targeting an unprecedented relative energy resolution of 3% at 1...
One primary goal of the LHC is the search for physics beyond the Standard Model, which has led to the development of many different methods to look for new physics effects. In this context, we employ Machine Learning methods, in particular Simulation-Based Inference (SBI), to learn otherwise intractable likelihoods and to fully exploit the available information, compared...
Neural Simulation-Based Inference (NSBI) is an emerging class of statistical methods that harness the power of modern deep learning to perform inference directly from high-dimensional data. These techniques have already demonstrated significant sensitivity gains in precision measurements across several domains, outperforming traditional approaches that rely on low-dimensional summaries. This...
We discuss recent developments in performance improvements for Monte Carlo integration and event sampling. (1) Massive parallelization of matrix element evaluations based on a new back end for the matrix element generator O'Mega targeting GPUs. This has already been integrated in a development version of the Monte Carlo event generator Whizard for realistic testing and profiling. (2) A...
The LHCb experiment, one of the four major experiments at the Large Hadron Collider (LHC), excels in high-precision measurements of particles that are produced relatively frequently (strange, charmed and bottom hadrons). Key to LHCb's potential is its sophisticated trigger system that enables complete event reconstruction, selection, alignment and calibration in real-time. Through the Turbo...
To characterize the structures and properties of samples in the analysis of Small-Angle Neutron Scattering (SANS) data, a physical model must be selected for each sample and fitted iteratively. However, the conventional approach to model selection relies primarily on manual experience, which imposes a high barrier to entry and yields limited accuracy. Furthermore, the...
In this article, we present the High-Performance Output (HiPO) data format developed at Jefferson Laboratory for storing and analyzing data from Nuclear Physics experiments. The format was designed to efficiently store large amounts of experimental data, utilizing modern fast compression algorithms. The purpose of this development was to provide organized data in the output, facilitating...
The LHCb experiment at the Large Hadron Collider (LHC) operates a fully software-based trigger system that processes proton-proton collisions at a rate of 30 MHz, reconstructing both charged and neutral particles in real time. The first stage of this trigger system, running on approximately 500 GPU cards, performs a track pattern recognition to reconstruct particle trajectories with low...
High-energy physics (HEP) analyses routinely handle massive datasets, often exceeding the available resources. Efficiently interacting with these datasets requires dedicated techniques for data loading and management. Awkward Array is a Python library widely used in HEP to efficiently handle complex, irregularly structured...
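As a brief illustration of the irregular structures Awkward Array handles, the toy record below (hypothetical jet transverse momenta) shows vectorized operations acting directly on jagged data.

```python
import awkward as ak

# Toy jagged event record: a variable number of jet pTs per event (illustrative values).
events = ak.Array([
    {"jet_pt": [45.2, 30.1, 12.7]},
    {"jet_pt": []},
    {"jet_pt": [80.5, 22.3]},
])

# Vectorized operations work directly on the irregular structure:
n_jets = ak.num(events.jet_pt)                          # [3, 0, 2]
leading = ak.max(events.jet_pt, axis=1)                 # [45.2, None, 80.5]
selected = events[ak.any(events.jet_pt > 40, axis=1)]   # events with a jet above 40
```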
The upgraded LHCb experiment is pioneering the landscape of real-time data-processing techniques using a heterogeneous computing infrastructure, composed of both GPUs and FPGAs, aimed at boosting the performance of the HLT1 reconstruction. Among the novelties in the reconstruction infrastructure for Run 3, the introduction of a real-time VELO hit-finding FPGA-based architecture...
Foundation models are a very successful approach to linguistic tasks. Naturally, there is the desire to develop foundation models for physics data. Currently, existing networks are much smaller than publicly available Large Language Models (LLMs), the latter having typically billions of parameters. By applying pretrained LLMs in an unconventional way, we introduce large networks for...
Based on previous experience with parallel event data processing, a Structure-of-Arrays (SoA) layout for objects frequently performs better than Array-of-Structures (AoS), especially on GPUs. However, AoS is widespread in existing code, and in C++, changing the data layout from AoS to SoA requires changing the data structure declarations and the access syntax. This work is repetitive, time...
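The abstract targets C++ data structures, but the layout distinction can be sketched in Python: the AoS version stores one record per particle and forces an element-wise loop, while the SoA version keeps one contiguous array per field and allows a single vectorized operation. Field names and sizes are illustrative.

```python
import numpy as np

n = 100_000

# Array of Structures: one record object per particle.
aos = [{"px": 1.0 * i, "py": 2.0 * i, "pz": 0.5 * i} for i in range(n)]
pt_aos = [np.hypot(p["px"], p["py"]) for p in aos]   # element-wise Python loop

# Structure of Arrays: one contiguous array per field.
soa = {
    "px": np.arange(n, dtype=np.float64) * 1.0,
    "py": np.arange(n, dtype=np.float64) * 2.0,
    "pz": np.arange(n, dtype=np.float64) * 0.5,
}
pt_soa = np.hypot(soa["px"], soa["py"])              # single vectorized operation
```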
The increasing complexity of modern neural network architectures demands fast and memory-efficient implementations to mitigate computational bottlenecks. In this work, we evaluate the recently proposed BitNet architecture in HEP applications, assessing its performance in classification, regression, and generative modeling tasks. Specifically, we investigate its suitability for quark-gluon...
Charged particle track reconstruction is one of the heaviest computational tasks in the event reconstruction chain at Large Hadron Collider (LHC) experiments. Furthermore, projections for the High Luminosity LHC (HL-LHC) show that the required computing resources for single-threaded CPU algorithms will exceed those expected to be available. It follows that experiments at the HL-LHC will...
The exponential time scaling of traditional primary vertex reconstruction algorithms raises significant performance concerns for future high-pileup environments, particularly with the upcoming High Luminosity upgrade to the Large Hadron Collider. In this talk, we introduce PV-Finder, a deep learning-based approach that leverages reconstructed track parameters to directly predict primary vertex...
We present FeynGraph, a modern high-performance Feynman diagram generator built to integrate seamlessly with contemporary computational workflows for calculating scattering amplitudes. FeynGraph is implemented as a high-performance Rust library with easy-to-use Python bindings, allowing it to be readily used in other tools. With additional features like arbitrary custom diagram selection filters and...
RNTuple is ROOT's next-generation columnar format, replacing TTree as the primary storage solution for LHC event data.
After 6 years of R&D, the RNTuple format reached the 1.0 milestone in 2024, promising full backward compatibility with any data written from that moment on.
Designing a new on-disk format not only allows for significant improvements in file size and read/write speed,...
For high-energy physics experiments, the generation of Monte Carlo events, and particularly the simulation of the detector response, is a very computationally intensive process. In many cases, the primary bottleneck in detector simulation is the detailed simulation of the electromagnetic and hadronic showers in the calorimeter system.
ATLAS is currently using its state-of-the-art fast...
Unfolding detector-level data into meaningful particle-level distributions remains a key challenge in collider physics, especially as the dimensionality of the relevant observables increases. Traditional unfolding techniques often struggle with such high-dimensional problems, motivating the development of machine learning-based approaches. We introduce a new method for generative unfolding that...
Non-perturbative QED is used in calculations of Schwinger pair creation, in precision QED tests with ultra-intense lasers, and to predict beam backgrounds at the interaction point of colliders. In order to predict these phenomena, custom-built Monte Carlo event generators based on a suitable non-perturbative theory have to be developed. One such suitable theory uses the Furry Interaction...
The accurate simulation of particle showers in collider detectors remains a critical bottleneck for high-energy physics research. Current approaches face fundamental limitations in scalability when modeling the complete shower development process.
Deep generative models offer a promising alternative, potentially reducing simulation costs by orders of magnitude. This capability becomes...
Two shortcomings of classical unfolding algorithms, namely that they are restricted to binned and to one-dimensional observables, can be overcome when using generative machine learning. Many studies on generative unfolding reduce the problem to correcting for detector smearing; however, a full unfolding pipeline must also account for background, acceptance and efficiency effects. To fully integrate...
The determination of the hot QCD pressure has a long history, and has -- due to its phenomenological relevance in cosmology, astrophysics and heavy-ion collisions -- spawned a number of important theoretical advances in perturbative thermal field theory applicable to equilibrium thermodynamics.
We present major progress towards the determination of the last missing piece for the pressure of...
Measured distributions are usually distorted by the finite resolution of the detector. Within physics research, the necessary correction of these distortions is known as Unfolding. Machine learning research uses a different term for this very task: Quantification Learning. For the past two decades, this difference in terminology - together with several differences in notation - has prevented...
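For concreteness, the snippet below is a toy iterative Bayesian (d'Agostini-style) unfolding of a three-bin spectrum, the classical binned setting that both communities address; the response matrix and counts are invented illustrative numbers.

```python
import numpy as np

# Response matrix R[i, j] = P(measured bin i | true bin j), columns sum to the bin efficiency.
R = np.array([[0.8, 0.2, 0.0],
              [0.2, 0.6, 0.2],
              [0.0, 0.2, 0.8]])
measured = np.array([120.0, 300.0, 180.0])
efficiency = R.sum(axis=0)

truth = np.full(3, measured.sum() / 3)   # flat prior as the starting estimate
for _ in range(10):
    # posterior P(true j | measured i) via Bayes' theorem with the current prior
    joint = R * truth                            # shape (measured i, true j)
    post = joint / joint.sum(axis=1, keepdims=True)
    truth = (post.T @ measured) / efficiency     # updated truth-level estimate

print(truth)
```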
Next-generation High Energy Physics (HEP) experiments face unprecedented computational demands. The High-Luminosity Large Hadron Collider anticipates data processing needs that will exceed available resources, while intensity frontier experiments such as DUNE and LZ are dominated by the simulation of high-multiplicity optical photon events. Heterogeneous architectures, particularly GPU...
Random matrix theory has a long history of applications in the study of eigenvalue distributions arising in diverse real-world ensembles of matrix data. Matrix models also play a central role in theoretical particle physics, providing tractable mathematical models of gauge-string duality, and allowing the computation of correlators of invariant observables in physically interesting sectors of...
The simulation throughput of LHC experiments is increasingly limited by detector complexity in the high-luminosity phase. As high-performance computing shifts toward heterogeneous architectures such as GPUs, accelerating Geant4 particle transport simulations by offloading parts of the workload to GPUs can improve performance. The AdePT plugin currently offloads electromagnetic showers in...
Gravitational Wave (GW) Physics has entered a new era of Multi-Messenger Astronomy (MMA), characterized by increasing detections from GW observatories such as the LIGO, Virgo, and KAGRA collaborations. This presentation will introduce the KAGRA experiment, outlining the current workflow from data collection to physics interpretation, and demonstrate the transformative role of machine learning...
The High-Luminosity LHC era will deliver unprecedented data volumes, enabling measurements on fine-grained multidimensional histograms containing millions of bins with thousands of events each. Achieving ultimate precision requires modeling thousands of systematic uncertainty sources, creating computational challenges for likelihood minimization and parameter extraction. Fast minimization is...
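As a scaled-down illustration of the minimization task (not the tool discussed in this contribution), the sketch below fits a signal strength and one constrained nuisance parameter to a three-bin Poisson likelihood; all templates and counts are toy numbers.

```python
import numpy as np
from scipy.optimize import minimize

# Toy binned templates: signal, background, and the relative effect of one systematic.
sig = np.array([5.0, 10.0, 4.0])
bkg = np.array([50.0, 40.0, 30.0])
shape = np.array([0.02, -0.01, 0.03])
data = np.array([57, 52, 33])

def nll(pars):
    mu, theta = pars
    expected = mu * sig + bkg * (1.0 + theta * shape)
    # Poisson terms (up to a constant) plus a unit-Gaussian constraint on theta.
    return -(data * np.log(expected) - expected).sum() + 0.5 * theta**2

fit = minimize(nll, x0=[1.0, 0.0], method="BFGS")
print(fit.x)   # fitted (mu, theta)
```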