Simulation plays an essential role in modern high energy physics experiments. However, the simulation of particle showers in the calorimeter systems of detectors with traditional Monte Carlo procedures represents a major computational bottleneck, and this subdetector system has long been the focus of fast simulation efforts. More recently, approaches based on deep generative models have shown...
Graph Neural Networks (GNNs) have been demonstrated to be a promising technique for particle track reconstruction, as they scale better than traditional combinatorial algorithms. Most GNN tracking methods can be classified as either edge classification or object condensation. A common problem with these approaches is that they do not handle the situation where a spacepoint is...
The High Energy cosmic-Radiation Detection (HERD) facility is a space astronomy and particle astrophysics experiment planned to be installed on the China Space Station. HERD is a China-led mission with Italy leading key European contributions. Its primary scientific goals include detecting dark matter in cosmic space, precisely measuring the energy spectrum and composition of cosmic rays, and...
With the emergence of increasingly complex workflows and data rates, accelerators have gained importance within ALICE and the Worldwide LHC Computing Grid (WLCG). Consequently, support for GPUs was added to JAliEn, the ALICE Grid middleware, in a transparent manner to automatically use these resources when available -- without breaking existing mechanisms for payload isolation and...
Modern beam telescopes play a crucial role in high-energy physics experiments to precisely track particle interactions. Accurate alignment of detector elements in real-time is essential to maintain the integrity of reconstructed particle trajectories, especially in high-rate environments like the ATLAS experiment at the Large Hadron Collider (LHC). Any misalignment in the detector geometry can...
Normalized Transformer architectures have shown significant improvements in training efficiency across large-scale natural language processing tasks. Motivated by these results, we explore the application of normalization techniques to Particle Transformer (ParT) for jet classification in high-energy physics. We construct a normalized vanilla Transformer classifier and a normalized ParT...
The simulation throughput of LHC experiments is increasingly limited by detector complexity in the high-luminosity phase. As high-performance computing shifts toward heterogeneous architectures such as GPUs, accelerating Geant4 particle transport simulations by offloading parts of the workload to GPUs can improve performance. The AdePT plugin currently offloads electromagnetic showers in...
Awkward Array provides efficient handling of large, irregular data structures in Python, playing a key role in high-energy physics analysis. This work presents ongoing efforts to optimize Awkward Arrays for GPUs using CUDA, aiming to achieve performance parity with or surpass CPU kernel implementations. Key improvements focus on optimized memory management, leveraging CUDA-specific features,...
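A minimal sketch of how the CUDA backend can be exercised from user code, assuming Awkward Array 2.x with CuPy installed; the array contents are purely illustrative and not taken from this work:

```python
import awkward as ak

# A small ragged array on the CPU, standing in for per-event hit energies.
events = ak.Array([[1.0, 2.0, 3.0], [], [4.5, 0.5]])

# Move the array to the CUDA backend; subsequent operations dispatch to the
# GPU kernel implementations (requires a CUDA device and the cupy package).
events_gpu = ak.to_backend(events, "cuda")

# Reductions run on the GPU without leaving device memory.
per_event_sum = ak.sum(events_gpu, axis=1)

# Copy the small result back to the CPU for inspection.
print(ak.to_backend(per_event_sum, "cpu").tolist())  # [6.0, 0.0, 5.0]
```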
Unsupervised anomaly detection has become a pivotal technique for model-independent searches for new physics at the LHC. In high-energy physics (HEP), anomaly detection is employed to identify rare, outlier events in collision data that deviate significantly from expected distributions. A promising approach is the application of generative machine learning models, which can efficiently detect...
The design of modern high-energy physics detectors is a highly intricate process, aiming to maximize their physics potential while balancing various manufacturing constraints. As detectors become larger and more sophisticated, it becomes increasingly difficult to maintain a comprehensive understanding of the entire system. To address this challenge, we aim to translate the design process into...
With the move to HTTP/WebDAV and JSON Web Tokens as a standard protocol for transfers within the WLCG distributed storage network, a wide range of off-the-shelf technologies becomes viable for meeting the requirements of a Storage Element (SE). In this work, we explore the capabilities and performance of the OpenResty framework, which extends the nginx server with the LuaJIT scripting...
The CMS experiment at the CERN Large Hadron Collider (LHC) is preparing to upgrade its muon detection system by installing new chambers based on Gas Electron Multiplier (GEM) technology, a type of Micro-Pattern Gas Detector (MPGD). The design of an MPGD varies based on experimental requirements, but studying each detector parameter in situ can be costly and impractical. Additionally, the large...
The CMS experiment at the CERN Large Hadron Collider (LHC) is preparing to upgrade its muon detection system by installing new chambers based on Gas Electron Multiplier (GEM) technology, a type of Micro-Pattern Gas Detector (MPGD). The design of an MPGD varies based on experimental requirements, but studying each detector parameter in situ can be costly and impractical. Additionally, the large...
Data analysts working with large datasets require absolute certainty that each file is processed exactly once. ServiceX addresses this challenge by using well established transaction processing architectures. This system implements a fully transactional workflow powered by PostgreSQL and RabbitMQ, ensuring data integrity throughout the processing pipeline. This presentation details both the...
In the field of High Throughput Computing (HTC), the management and processing of large volumes of accounting data across different environments and use cases is a significant challenge. AUDITOR addresses this issue by providing a flexible framework for building accounting pipelines that can be adapted to a wide range of needs.
At its core, AUDITOR serves as a centralised storage solution for...
The Large Hadron Collider (LHC) generates vast amounts of data, making efficient data quality monitoring (DQM) essential for ensuring reliable detector performance and accurate physics analysis. AutoDQM, an anomaly detection system, was developed to address this challenge. It applies statistical techniques such as beta-binomial probability tests alongside principal component analysis and...
The Large Hadron Collider (LHC) generates vast amounts of data, making efficient data quality monitoring (DQM) essential for ensuring reliable detector performance and accurate physics analysis. AutoDQM, an anomaly detection system, was developed to address this challenge. It applies statistical techniques such as beta-binomial probability tests alongside principal component analysis and...
Many physics analyses using the CMS detector at the LHC require accurate, high resolution electron and photon energy measurements. The CMS electromagnetic calorimeter (ECAL) is a fundamental component of these analyses. The excellent resolution of ECAL was of central importance to the discovery of the Higgs boson in 2012, and is being used for increasingly precise measurements of Higgs boson...
Workflow Management Systems (WMSs) are essential tools for structuring an arbitrary sequence of tasks in a clear, maintainable, and repeatable way. The popular Python-based WMS luigi helps build complex workflows. It handles task dependency resolution and input/output tracking, as well as providing a simple workflow visualisation and convenient command-line integration (a minimal task sketch follows below).
The extension...
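As an illustration of the luigi mechanics described above (dependency resolution, input/output tracking), here is a minimal two-task sketch; the task and file names are hypothetical and unrelated to the extension itself:

```python
import luigi

class SelectEvents(luigi.Task):
    """Produce a (toy) selection for one dataset."""
    dataset = luigi.Parameter()

    def output(self):
        return luigi.LocalTarget(f"selected_{self.dataset}.txt")

    def run(self):
        with self.output().open("w") as f:
            f.write(f"selected events for {self.dataset}\n")

class MakeHistogram(luigi.Task):
    """Depends on SelectEvents; luigi resolves and runs the dependency first."""
    dataset = luigi.Parameter()

    def requires(self):
        return SelectEvents(dataset=self.dataset)

    def output(self):
        return luigi.LocalTarget(f"hist_{self.dataset}.txt")

    def run(self):
        with self.input().open() as fin, self.output().open("w") as fout:
            fout.write("histogram of: " + fin.read())

if __name__ == "__main__":
    # The local scheduler is sufficient for a single-machine test run.
    luigi.build([MakeHistogram(dataset="demo")], local_scheduler=True)
```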
Research has become dependent on processing power and storage, one crucial aspect being data sharing. The Open Science Data Federation (OSDF) project aims to create a scientific global data distribution network based on the Pelican Platform. OSDF does not develop new software but relies on the XrootD and Pelican projects. Nevertheless, OSDF must understand the XrootD limits under various...
At INAF (Istituto nazionale di astrofisica), in the context of the AGILE mission (Astro-Rivelatore Gamma a Immagini Leggero), we developed PacketLib, an open-source C++ software library designed for building applications that handle satellite telemetry source packets, provided they comply with the CCSDS Telemetry and Telecommand Standards.
As part of the ASTRI (Astrofisica con Specchi a...
The CMS Experiment had to manage and process data volumes approaching the exascale during LHC Run 3. This required a seamless synergy between the workload and data management systems, namely WMCore and Rucio. Following the integration of Rucio into the CMS infrastructure, the workload management system has undergone substantial adaptations to harness new data management capabilities...
To address the urgent need for efficient data analysis platforms in the neutron scattering field, this report presents a cloud-based computing infrastructure solution based on the technical architecture of OpenStack and WebRTC. Based on this infrastructure, a deeply integrated system for data management and storage is constructed to provide researchers with a one-stop analysis platform that...
The new fully software-based trigger of the LHCb experiment operates at a 30 MHz data rate, opening a search window into previously unexplored regions of physics phase space. The BuSca (Buffer Scanner) project at LHCb acquires, reconstructs and analyzes data in real time, extending sensitivity to new lifetimes and mass ranges through the recently deployed Downstream tracking algorithm. BuSca...
This contribution presents the final iteration of the CaloClouds series. Simulation of photon showers at the granularities expected in a future Higgs factory is computationally challenging. A viable simulation must capture the fine details exposed by such a detector, while also being fast enough to keep pace with the expected rate of observations. The CaloClouds model utilises point cloud...
Charged-particle tracking in drift chambers is a fundamental task in high-energy physics. In this work, we propose applying reinforcement learning (RL) to the reconstruction of particle trajectories in drift chambers. By framing the tracking problem as a decision-making process, RL enables the development of more efficient and adaptive tracking algorithms. This approach offers improved performance and...
The CMS Tracker in Run 3 consists of thousands of silicon modules (Pixel: 1856 modules, Strip: 15148 modules). Given the detector's aging and potential operational incidents, constant monitoring of its components is essential to ensure the highest data quality. To achieve this, the CMS Tracker group employs comprehensive Data Quality Monitoring (DQM) and Data Certification (DC) procedures and...
The Super Tau Charm Facility (STCF) is a next-generation electron-positron collider proposed in China, operating at a center-of-mass energy of 2–7 GeV with a peak luminosity of 0.5×10³⁵ cm⁻²s⁻¹. In STCF experiments, the identification of high-momentum charged hadrons is critical for physics studies, driving the implementation of a dedicated particle identification (PID) system that combines...
Particle physics is a field hungry for high quality simulation, to match the precision with which data is gathered at collider experiments such as the Large Hadron Collider (LHC). The computational demands of full detector simulation often lead to the use of faster but less realistic parameterizations, potentially compromising the sensitivity, generalizability, and robustness of downstream...
Proton computed tomography (pCT) is poised to advance precise dose planning in hadron therapy, an innovative cancer treatment that uses protons and heavy ions to deliver targeted radiation. By harnessing the Bragg peak effect, hadron therapy can concentrate radiation on tumors while minimizing exposure to surrounding healthy tissues. Achieving high-resolution pCT images, however,...
The CMS experiment hosted at CERN relies every day on the computing resources provided by the Worldwide LHC Computing Grid consortium (WLCG) to process data and produce Monte-Carlo simulated event samples. In such a context, utilizing the heavily distributed system of computing resources granted by the WLCG to process non-local data represents a challenging task. In addition, the CMS Workflow...
Over the past several months, we have deployed a power accounting system across our heterogeneous WLCG Tier2 compute clusters at ScotGrid Glasgow, integrating real-time metrics from Prometheus with static power characteristics measured on our hardware. This framework dynamically allocates energy consumption to Virtual Organizations (VOs) based on actual core usage, while distinguishing the...
The production of a sufficiently large number of simulated Monte Carlo events is expected to be a major computational bottleneck for future high-energy physics (HEP) experiments. In particular, the simulation of the calorimeter response is among the most resource-intensive tasks. While the HEP community has made significant progress in developing promising generative fast simulation models,...
Particle identification (PID) plays a crucial role in particle physics experiments. A groundbreaking advancement in PID involves cluster counting (dN/dx), which measures primary ionizations along a particle’s trajectory within a pixelated time projection chamber (TPC), as opposed to conventional dE/dx measurements. A pixelated TPC with a pixel size of 0.5 × 0.5 mm² has been proposed as the...
The CONDOR Observatory [1] is designed to detect extensive air showers (EAS) at low atmospheric depths, where the number of secondary electrons remains high, enhancing the sensitivity to primary gamma and cosmic rays. The array comprises 5,300 plastic scintillator detectors arranged in a compact central region, complemented by 1,040 detectors forming a peripheral veto area. This...
The Jiangmen Underground Neutrino Observatory (JUNO) is a multipurpose neutrino experiment with the primary goals of determining the neutrino mass ordering and precisely measuring oscillation parameters. The JUNO detector construction was completed at the end of 2024. It generates about 3 petabytes of data annually, requiring extensive offline processing. This processing, which is called...
The HIBEAM/NNBAR program at the European Spallation Source is designed to search for baryon number–violating processes through high-sensitivity studies of neutron oscillations, such as neutron–antineutron transitions. This search requires the development of an annihilation detector since the annihilation signal will be composed of multiple secondary particles, primarily charged and neutral...
Multiple visualization methods have been implemented in the Jiangmen Underground Neutrino Observatory (JUNO) experiment and its satellite experiment JUNO-TAO. These methods include event display software developed based on ROOT and Unity. The former is developed based on the JUNO offline software system and ROOT EVE, which provides an intuitive way for users to observe the detector geometry,...
Detector and event visualization software is essential for modern high-energy physics (HEP) experiments. It plays an important role in the whole life cycle of any HEP experiment, from detector design, simulation, reconstruction, detector construction and installation, to data quality monitoring, physics data analysis, education and outreach. In this talk, we will discuss two frameworks and their...
Simulation-based inference (SBI) is a set of statistical inference approaches in which Machine Learning (ML) algorithms are trained to approximate likelihood ratios. It has been shown to provide an alternative to the likelihood fits commonly performed in HEP analyses. SBI is particularly attractive in analyses performed over many dimensions, in which binning data would be computationally...
Recent years have seen growing interest in leveraging secondary cosmic ray muons for tomographic imaging of large and unknown volumes. A key area of application is cargo scanning for border security, where muon tomography is used to detect concealed hazardous or illicit materials in trucks and shipping containers. We present recent developments in TomOpt, a Python-based, end-to-end software...
Efficient data processing using machine learning relies on heterogeneous computing approaches, but optimizing input and output data movements remains a challenge. In GPU-based workflows, data already resides in GPU memory, but machine learning models require the input and output data to be provided in a specific tensor format, often forcing unnecessary copies outside of the GPU device and...
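One established way to avoid such copies is zero-copy tensor exchange via DLPack, sketched below with CuPy and PyTorch; this is a generic illustration under those assumptions, not the specific mechanism developed in this work:

```python
import cupy as cp
import torch

# Data already resident in GPU memory, e.g. the output of an earlier CUDA stage.
features = cp.random.standard_normal((1024, 16), dtype=cp.float32)

# DLPack hands the same device buffer to PyTorch without a host round-trip.
tensor = torch.from_dlpack(features)

# The tensor can be fed directly to a model living on the same device.
assert tensor.is_cuda and tensor.shape == (1024, 16)
```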
Charged-particle track reconstruction is foundational to collider experiments, yet it is also the most computationally expensive part of event reconstruction. Track reconstruction with graph neural networks (GNNs) has shown promising capability to cope with the computing challenges posed by the High-Luminosity LHC (HL-LHC). However,...
The TrackML dataset, a benchmark for particle tracking algorithms in High-Energy Physics (HEP), presents challenges in data handling due to its large size and complex structure. In this study, we explore using a heterogeneous graph structure combined with the Hierarchical Data Format version 5 (HDF5) not only to efficiently store and retrieve TrackML data but also to speed up the training and...
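A small h5py sketch of storing per-event hit and edge arrays in an HDF5 file for random access during training; the group layout and dataset names are illustrative assumptions, not the actual schema used in this study:

```python
import h5py
import numpy as np

# Write one event's heterogeneous graph as separate node and edge datasets.
with h5py.File("trackml_graphs.h5", "w") as f:
    evt = f.create_group("event_000001000")
    evt.create_dataset("pixel_hits", data=np.random.rand(1200, 3), compression="gzip")
    evt.create_dataset("strip_hits", data=np.random.rand(800, 3), compression="gzip")
    evt.create_dataset("edges", data=np.random.randint(0, 2000, size=(5000, 2)))

# During training, a single event can be read without loading the whole file.
with h5py.File("trackml_graphs.h5", "r") as f:
    edges = f["event_000001000/edges"][:]
    print(edges.shape)  # (5000, 2)
```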
Particle Transformer has emerged as a leading model for jet tagging, but its quadratic scaling with sequence length presents significant computational challenges, especially for longer sequences. This inefficiency is critical in applications such as the LHC trigger systems where rapid inference is essential. To overcome these limitations, we evaluated several Transformer variants and...
At the Large Hadron Collider (LHC) [1] high energy proton-proton ($pp$) interactions, known as \textit{hard scatters}, are produced in contrast to low energy inelastic proton-proton collisions, referred to as \textit{pile-up}. From the perspective of experimental measurements, hard scatter events are processes of interest, whilst pile-up is conceptually no different from noise. Experiments,...
We present an end-to-end track reconstruction algorithm based on Graph Neural Networks (GNNs) for the main drift chamber of the BESIII experiment at the BEPCII collider. The algorithm directly processes detector hits as input to simultaneously predict the number of track candidates and their kinematic properties in each event. By incorporating physical constraints into the model, the...
Weakly supervised anomaly detection has been shown to be a sensitive and robust tool for Large Hadron Collider (LHC) analysis. The effectiveness of these methods relies heavily on the input features of the classifier, influencing both model coverage and the detection of low signal cross sections. In this talk, we demonstrate that improvements in both areas can be achieved by using energy flow...
The Production and Distributed Analysis (PanDA) workload management system was designed with flexibility to adapt to emerging computing technologies in processing, storage, networking, and distributed computing middleware for the global data distribution. PanDA can coordinate processing over heterogeneous computing resources, including dozens of geographically separated high-performance...
This study evaluates the portability, performance, and adaptability of Liquid Argon TPC (LArTPC) detector simulations on different HPC platforms, specifically Polaris, Frontier, and Perlmutter. The LArTPC workflow is computationally complex, mimicking neutrino interactions and the resultant detector responses in a modular liquid argon TPC, integrating various subsystems to...
Statistical analyses in high energy physics often rely on likelihood functions of binned data. These likelihood functions can then be used for the calculation of test statistics in order to assess the statistical significance of a measurement.
evermore is a Python package for building and evaluating these likelihood functions using JAX – a powerful Python library for high performance...
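For concreteness, a plain-JAX sketch of the kind of binned Poisson likelihood such a package builds; this is not the evermore API, just an illustration of why JAX's automatic differentiation is attractive here:

```python
import jax
import jax.numpy as jnp

def nll(mu, signal, background, observed):
    # Expected bin contents under a signal-strength parameter mu.
    expected = mu * signal + background
    # Sum of per-bin Poisson log-probabilities, negated for minimization.
    return -jnp.sum(jax.scipy.stats.poisson.logpmf(observed, expected))

signal = jnp.array([5.0, 10.0, 4.0])
background = jnp.array([50.0, 52.0, 48.0])
observed = jnp.array([57.0, 65.0, 52.0])

# Gradients of the likelihood come for free, which is what enables fast fits
# and the construction of test statistics on top of such models.
print(nll(1.0, signal, background, observed))
print(jax.grad(nll)(1.0, signal, background, observed))
```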
The Next Generation Trigger project aims to improve the computational efficiency of the CMS reconstruction software (CMSSW) to increase the data processing throughput at the High-Luminosity Large Hadron Collider. As part of this project, this work focuses on improving the common Structure of Arrays (SoA) used in CMSSW for running both on CPUs and GPUs. We introduce a new SoA feature that...
The Online Data Quality Monitoring (DQM) of the CMS electromagnetic calorimeter (ECAL) is a crucial tool that allows ECAL experts to quickly identify, localize, and diagnose a broad range of detector issues that would impact the quality of the data for physics. A real-time autoencoder-based anomaly detection system using semi-supervised machine learning is presented, enabling the detection...
Particle tracking is among the most sophisticated and complex parts of the full event reconstruction chain. Various reconstruction algorithms work in sequence to build trajectories from detector hits. Each of these algorithms requires numerous configuration parameters that need fine-tuning to properly account for the detector/experimental setup, the available CPU budget, and the desired...
Generative networks are an exciting tool for fast LHC event generation. Usually, they are used to generate configurations with a fixed number of particles. Autoregressive transformers allow us to generate events with variable numbers of particles, very much in line with the physics of QCD jet radiation. We show how they can learn a factorized likelihood for jet radiation and extrapolate in...
Measurements and observations in Particle Physics fundamentally depend on one's ability to quantify their uncertainty and, thereby, their significance. Therefore, as Machine Learning methods become more prevalent in HEP, being able to determine the uncertainties of an ML method becomes more important. A wide range of possible approaches has been proposed; however, there has not been a...
In the end-cap region of the SPD detector complex, particle identification will be provided by a Focusing Aerogel RICH detector (FARICH). FARICH will primarily aid with pion / kaon separation in final open charmonia states (momenta below 5 GeV/c). A free-running (triggerless) data acquisition pipeline to be employed in the SPD results in a high data rate necessitating new approaches to event...
The availability of precise and accurate simulation is a limiting factor for interpreting and forecasting data in many fields of science and engineering. Often, one or more distinct simulation software applications are developed, each with a relative advantage in accuracy or speed. The quality of insights extracted from the data stands to increase if the accuracy of faster, more economical...
I will present joint work on the behavior of Feynman integrals and perturbative expansions at large loop orders. Using the tropical sampling algorithm for evaluating Feynman integrals, along with a dedicated graph-sampling algorithm to generate representative sets of Feynman diagrams, we computed approximately $10^7$ integrals with up to 17 loops in four-dimensional $\phi^4$ theory. Through...
The development of radiation-hard CMOS Monolithic Active Pixel Sensors (MAPS) is a key advancement for next-generation high-energy physics experiments. These sensors offer improved spatial resolution and integration capabilities but require efficient digital readout and data acquisition (DAQ) systems to operate in high-radiation environments. My research focuses on the FPGA-based digital...
Precise simulation-to-data corrections, encapsulated in scale factors, are crucial for achieving high precision in physics measurements at the CMS experiment. Traditional methods often rely on binned approaches, which limit the exploitation of available information and require a time-consuming fitting process repeated for each bin. This work presents a novel approach utilizing modern...
The performance of Particle Identification (PID) in the LHCb experiment is critical for numerous physics analyses. Classifiers, derived from detector likelihoods under various particle mass hypotheses, are trained to tag particles using calibration samples that involve information from the Ring Imaging Cherenkov (RICH) detectors, calorimeters, and muon identification chambers. However, these...
As generative models start taking an increasingly prominent role in both particle physics and everyday life, quantifying the statistical power and expressiveness of such generative models becomes a more and more pressing question.
In past work, we have seen that a generative model can, in fact, be used to generate samples beyond the initial training data. However, the exact quantification...
The simulation of particle interactions with detectors plays a critical role in understanding the detector performances and optimizing physics analysis. Without the guidance of the first-principle theory, the current state-of-the-art simulation tool, \textsc{Geant4}, exploits phenomenology-inspired parametric models, which must be combined and carefully tuned to experimental observations. The...
Track reconstruction is one of the most important and challenging tasks in the offline data processing of collider experiments. The Super Tau-Charm Facility (STCF) is a next-generation electron-positron collider running in the tau-charm energy region proposed in China, where conventional track reconstruction methods face enormous challenges from the higher background environment introduced by...
Simulating physics processes and detector responses is essential in high energy physics but accounts for significant computing costs. Generative machine learning has been demonstrated to be potentially powerful in accelerating simulations, outperforming traditional fast simulation methods. While efforts have focused primarily on calorimeters, initial studies have also been performed on silicon...
The CMS experiment requires massive computational resources to efficiently process the large amounts of detector data. The computational demands are expected to grow as the LHC enters the high-luminosity era. Therefore, GPUs will play a crucial role in accelerating these workloads. The approach currently used in the CMS software (CMSSW) submits individual tasks to the GPU execution queues,...
The CMS experiment requires massive computational resources to efficiently process the large amounts of detector data. The computational demands are expected to grow as the LHC enters the high-luminosity era. Therefore, GPUs will play a crucial role in accelerating these workloads. The approach currently used in the CMS software (CMSSW) relies on explicit memory management techniques, where...
Precision measurements of particle properties, such as the leading hadronic contribution to the muon magnetic moment anomaly, offer critical tests of the Standard Model and probes for new physics. The MUonE experiment aims to achieve this through precise reconstruction of muon-electron elastic scattering events using silicon strip tracking stations and low-Z targets, while accounting for...
In my ongoing Master's thesis, under the supervision of Professor Matthias Schott and with the assistance of Dhruv Chouhan, I am performing a GridPix detector characterization, as the GridPix is our group's proposed active neutrino detector technology to replace the current emulsion layers of the FASERν detector in the near future. After a series of test beams at the ELSA accelerator in Bonn...
Machine learning (ML), a cornerstone of data science and statistical analysis, autonomously constructs hierarchical mathematical models—such as deep neural networks—to extract complex patterns and relationships from data without explicit programming. This capability enables accurate predictions and the extraction of critical insights, making ML a transformative tool across scientific...
In the ATLAS experiment, colliding proton-proton bunches produce multiple primary vertices per bunch crossing. Typically, the primary vertex with the highest sum of squared transverse momentum of associated tracks is designated as the hard-scatter (HS) vertex, serving as the reference point for all physics objects in the event. However, this method proves suboptimal for scenarios with low...
Machine Learning (ML) plays an important role in physics analysis in High Energy Physics. To achieve better physics performance, physicists are training larger and larger models on larger and larger datasets. Therefore, many workflow developments focus on distributed training of large ML models, inventing techniques like model pipeline parallelism. However, not all physics analyses need to train large...
The calibration of Belle II data involves two key processes: prompt calibration and reprocessing. Prompt calibration represents the initial step in continuously deriving calibration constants in a timely manner for the data collected over the previous couple of weeks. Currently, this process is managed by b2cal, a Python-based plugin built on Apache Airflow to handle calibration jobs....
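A generic Apache Airflow sketch of how calibration steps can be chained as a DAG (recent Airflow 2.x assumed); the DAG and task names are hypothetical and do not reproduce b2cal itself:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def collect_runs():
    print("collect the run list for the latest data-taking block")

def derive_constants():
    print("submit calibration jobs and derive constants")

with DAG(
    dag_id="prompt_calibration_sketch",
    start_date=datetime(2025, 1, 1),
    schedule="@weekly",   # Airflow triggers the chain periodically
    catchup=False,
) as dag:
    collect = PythonOperator(task_id="collect_runs", python_callable=collect_runs)
    derive = PythonOperator(task_id="derive_constants", python_callable=derive_constants)
    collect >> derive     # derive_constants runs only after collect_runs succeeds
```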
We present a novel integration of the PanDA workload management system (PanDA WMS) and Harvester with Globus Compute to enable secure, portable, and remote execution of ATLAS workflows on high-performance computing (HPC) systems. In our approach, Harvester, which runs on an external server, is used to orchestrate job submissions via Globus Compute’s multi-user endpoint (MEP). This MEP provides...
Unfolding can be considered a procedure for estimating an unknown probability density function. Both external and internal quality assessment methods can be used for this purpose.
In some cases, external criteria exist that allow for the gauging of the quality of deconvolution. A typical example is the deconvolution of a blurred image, where the sharpness of the unblurred image can be used...
Attention-based transformer models are increasingly vital for analyzing the extensive datasets generated by particle physics experiments at the CERN LHC. In this work, we conduct an interpretability study of the Particle Transformer (ParT), a state-of-the-art model designed for jet-tagging tasks that are essential for identifying particles from proton collisions. Through detailed analysis of...
Despite compelling evidence for the incompleteness of the Standard Model and an extensive search programme, no hints of new physics have so far been observed at the LHC. Anomaly detection was proposed as a way to enhance the sensitivity of generic searches not targeting any specific signal model. One of the leading methods in this field, CATHODE (Classifying Anomalies THrough Outer Density...
The new fully software-based trigger of the LHCb experiment operates at a 30 MHz data rate and imposes tight constraints on GPU execution time. Tracking reconstruction algorithms in this first-level trigger must efficiently select detector hits, group them, build tracklets, account for the LHCb magnetic field, extrapolate and fit trajectories, and select the best track candidates to filter...
Recent advancements in large language models (LLMs) have paved the way for tools that can enhance the software development process for scientists. In this context, LLMs excel at two tasks -- code documentation in natural language and code generation in a given programming language. The commercially available tools are often restricted by the available context window size, encounter usage...
Data processing and analysis is one of the main challenges at HEP experiments. To accelerate physics analysis and drive new physics discoveries, rapidly developing Large Language Models (LLMs) are among the most promising approaches: they have demonstrated astonishing capabilities in recognizing and generating text, from which most parts of a physics analysis can benefit. In this talk we will...
In view of reducing the disk size of the future analysis formats for High Luminosity LHC in the ATLAS experiment, we have explored the use of lossy compression in the newly developed analysis format known as PHYSLITE. Improvements in disk size are being obtained in migrating from the 'traditional' ROOT TTree format to the newly developed RNTuple format. Lossy compression can bring improvements...
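To illustrate the idea behind lossy compression of analysis-level floats (not the specific settings adopted for PHYSLITE), a NumPy sketch that truncates low-order mantissa bits so the stored values compress better:

```python
import numpy as np

def truncate_mantissa(values, keep_bits):
    """Zero the low-order mantissa bits of float32 values (float32 has 23 of them)."""
    drop = 23 - keep_bits
    mask = np.uint32((0xFFFFFFFF >> drop) << drop)
    bits = np.asarray(values, dtype=np.float32).view(np.uint32)
    return (bits & mask).view(np.float32)

pt = np.array([45.327, 103.918, 12.004], dtype=np.float32)
# Reduced-precision values contain long runs of zero bits and compress better.
print(truncate_mantissa(pt, keep_bits=10))
```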
The LHCf experiment aims to study forward neutral particle production at the LHC, providing crucial data for improving hadronic interaction models used in cosmic ray physics. A key challenge in this context is the measurement of $K^0$ production, indirectly reconstructed from the four photons originating from its decay. The main challenge in this measurement is the reconstruction of events...
Ensuring the quality of data in large HEP experiments such as CMS at the LHC is crucial for producing reliable physics outcomes, especially in view of the high-luminosity phase of the LHC, where the new data-taking conditions will require much more careful monitoring of the experimental apparatus. The CMS protocols for Data Quality Monitoring (DQM) rely on the analysis of a standardized set of...
This contribution discusses an anomaly detection search for narrow-width resonances beyond the Standard Model that decay into a pair of jets. Using 139 fb$^{-1}$ of proton-proton collision data at $\sqrt{s} = 13$ TeV, recorded from 2015 to 2018 with the ATLAS detector at the Large Hadron Collider, we aim to identify new physics without relying on a specific signal model. The analysis employs two...
Cluster counting is a highly promising particle identification technique for drift chambers in particle physics experiments. In this paper, we trained neural network models, including a Long Short-Term Memory (LSTM) model for the peak-finding algorithm and a Convolutional Neural Network (CNN) model for the clusterization algorithm, using various hyperparameters such as loss functions,...
Weakly supervised anomaly detection has been shown to find new physics with a high significance at low injected signal cross sections. If the right features and a robust classifier architecture are chosen, these methods are sensitive to a very broad class of signal models. However, choosing the right features and classification architecture in a model-agnostic way is a difficult task as the...
Visualizing pre-binned histograms is a HEP domain-specific concern which is not adequately supported within the greater pythonic ecosystem. In recent years, mplhep has emerged as a leading package providing this basic functionality in a user-friendly interface. It also supplies styling templates for the four big LHC experiments - ATLAS, CMS, LHCb, and ALICE. At the same time, the...
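A minimal example of the pre-binned plotting interface and the bundled experiment styles; the bin contents are made up for illustration:

```python
import numpy as np
import matplotlib.pyplot as plt
import mplhep as hep

hep.style.use("CMS")  # one of the bundled experiment style templates

# Pre-binned input: bin contents plus bin edges, as produced upstream.
counts = np.array([3, 8, 15, 9, 4])
edges = np.linspace(0, 100, 6)

fig, ax = plt.subplots()
hep.histplot(counts, bins=edges, ax=ax, histtype="step", label="toy data")
ax.set_xlabel("observable")
ax.legend()
fig.savefig("prebinned.png")
```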
In a physics data analysis, "fake" or non-prompt backgrounds refer to events that would not typically satisfy the selection criteria for a given signal region, but are nonetheless accepted due to misreconstructed particles. This can occur, for example, when particles from secondary decays are incorrectly identified as originating from the hard scatter interaction point (resulting in non-prompt...
The ServiceX project aims to provide a data extraction and delivery service for HEP analysis data, accessing files from distributed stores and applying user-configured transformations on them. ServiceX aims to support many existing analysis workflows and tools in as transparent a manner as possible, while enabling new technologies. We will discuss the most recent backends added to ServiceX,...
Classical searches for BSM physics at the LHC suffer from two shortcomings: they tend to be dependent on one particular BSM model and they rarely include the information of the full, high-dimensional physical phase space. Recently, machine learning has been successfully applied to enhance resonant searches at LHC experiments addressing both shortcomings. In this talk we explore options to...
In view of the HL-LHC, the Phase-2 CMS upgrade will replace the entire trigger and data acquisition system. The L1T system has been designed to process 63 Tb/s input bandwidth with state-of-the-art commercial FPGAs and high-speed optical links reaching up to 28 Gb/s at a fixed latency below 12.5 µs. In view of the upgraded trigger system and in preparation for the HL-LHC, a GNN has been trained...
To compare collider experiments, measured data must be corrected for detector distortions through a process known as unfolding. As measurements become more sophisticated, the need for higher-dimensional unfolding increases, but traditional techniques have limitations. To address this, machine learning-based unfolding methods were recently introduced. In this work, we introduce OmniFold-HI, an...
Noise is a central challenge in quantum computing, particularly on Noisy Intermediate-Scale Quantum (NISQ) devices, where it significantly impacts the reliability of computations and model performance.
Variational quantum circuits exhibit some inherent noise resilience due to their trainable structure and adaptability. While the effects of noise have been studied in the context of quantum...
Online reconstruction plays a crucial role in monitoring and real-time analysis of High Energy and Nuclear Physics experiments. A vital aspect of reconstruction algorithms is particle identification (PID), which combines information from various detector components to determine a particle's type. Electron identification is particularly significant in electro-production Nuclear Physics...
The upgrade of the CMS apparatus for the HL-LHC will provide unprecedented timing measurement capabilities, in particular for charged particles through the Mip Timing Detector (MTD). One of the main goals of this upgrade is to compensate for the deterioration of primary vertex reconstruction induced by the increased pileup of proton-proton collisions by separating clusters of tracks not only in...
In this work, we present a set of optimizations to the Particle Transformer (ParT), a state-of-the-art model for jet classification, aimed at reducing inference time and memory usage while preserving accuracy. To address the compute and memory bottlenecks of traditional attention mechanisms, we incorporate FlashAttention and memory-efficient attention, enabling exact attention computation...
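The optimization relies on fused attention kernels; in PyTorch the same exact-attention computation is exposed through scaled_dot_product_attention, sketched here on toy inputs (not the actual ParT code, and assuming a CUDA device for the fused backends):

```python
import torch
import torch.nn.functional as F

# Toy batch of particle tokens: (batch, heads, sequence length, head dimension).
q = torch.randn(8, 8, 128, 64, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

# PyTorch dispatches to FlashAttention / memory-efficient kernels when available,
# computing exact attention without materializing the full attention matrix.
out = F.scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([8, 8, 128, 64])
```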
Maintaining high data quality in large HEP experiments like CMS at the LHC is essential for obtaining reliable physics results. The LHC high-luminosity phase will introduce higher event rates, requiring more sophisticated monitoring techniques to promptly identify and address potential issues. The CMS protocols for Data Quality Monitoring (DQM) and Data Certification (DC) rely on significant...
When we tried to port the software ‘Hepsptycho’, a ptychography reconstruction program originally based on multiple NVIDIA GPUs and MPI, to run on the Hygon DCU architecture, we found that the reconstructed object and probe were erroneous, while the results obtained on NVIDIA GPUs are correct. We profiled the ePIE algorithm using NVIDIA Nsight Systems and Hygon's...
Workflow tools provide the means to codify complex multi-step processes, thus enabling reproducibility, preservation, and reinterpretation efforts. Their powerful bookkeeping also directly supports the research process, especially where intermediate results are produced, inspected, and iterated upon frequently.
In Luigi, such a complex workflow graph is composed of individual tasks that...
At many Worldwide LHC Computing Grid (WLCG) sites, HPC resources are already integrated, or will be integrated in the near future, into the experiment specific workflows. The integration can be done either in an opportunistic way to use otherwise unused resources for a limited period of time, or in a permanent way. The WLCG ATLAS Tier-2 cluster in Freiburg has been extended in both ways:...
The High-Level Trigger (HLT) of the Compact Muon Solenoid (CMS) processes event data in real time, applying selection criteria to reduce the data rate from hundreds of kHz to around 5 kHz for raw data offline storage. Efficient lossless compression algorithms, such as LZMA and ZSTD, are essential in minimizing these storage requirements while maintaining easy access for subsequent analysis....
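A small comparison sketch with Python's built-in lzma module and the zstandard package, using a synthetic byte block in place of real raw data:

```python
import lzma
import os

import zstandard as zstd

# Synthetic stand-in for a raw-data block: random bytes plus a compressible tail.
payload = os.urandom(1 << 20) + bytes(1 << 20)

lzma_out = lzma.compress(payload, preset=6)
zstd_out = zstd.ZstdCompressor(level=6).compress(payload)

print(f"raw={len(payload)}  lzma={len(lzma_out)}  zstd={len(zstd_out)}")
```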
We provide a performance portable implementation of SU(N) lattice gauge theory (LGT) simulations using the Kokkos parallel programming model, enabling efficient execution across diverse architectures, including x86 CPUs, Arm CPUs, and GPUs. By leveraging Kokkos’s abstractions for parallel execution and memory management, we map the gauge field operations of SU(N) LGT onto heterogeneous...
Simulating showers of particles in highly-granular detectors is a key frontier in the application of machine learning to particle physics. Achieving high accuracy and speed with generative machine learning models can enable them to augment traditional simulations and alleviate a major computing constraint.
Recent developments have shown how diffusion based generative shower simulation...
Significant computing resources are used for parton-level event generation for the Large Hadron Collider (LHC). The resource requirements of this part of the simulation toolchain are expected to grow further in the High-Luminosity (HL-LHC) era. At the same time, the rapid deployment of computing hardware different from the traditional CPU+RAM model in data centers around the world mandates a...
Artificial Intelligence is set to play a transformative role in designing large and complex detectors, such as the ePIC detector at the upcoming Electron-Ion Collider (EIC). The ePIC setup features a central detector and additional systems positioned in the far forward and far backward regions. Designing this system involves balancing many factors—performance, physics goals, and cost—while...
Quantum generative modeling provides an alternative framework for simulating complex processes in high-energy physics. Calorimeter shower simulations, in particular, involve high-dimensional, stochastic data and are essential for particle identification and energy reconstruction at experiments such as those at the LHC. As these simulations increase in complexity—especially in large-scale...
The $Z_c(3900)$ was first discovered by the Beijing Spectrometer (BESIII) detector in 2013. As one of the most attractive discoveries of the BESIII experiment, $Z_c(3900)$ itself has inspired extensive theoretical and experimental research on its properties. In recent years, the rapid growth of massive experimental data at high energy physics (HEP) experiments has driven a lot of novel...
Machine learning algorithms are being used more frequently in the first-level triggers of collider experiments, with Graph Neural Networks (GNNs) pushing the hardware requirements of FPGA-based triggers beyond the current state of the art. As a first online event processing stage, first-level trigger systems process $O(10~\text{M})$ events per second with a hard real-time latency...
The CMS experiment at the LHC has entered a new phase in real-time data analysis with the deployment of two complementary unsupervised anomaly detection algorithms during Run 3 data-taking. Both algorithms aim to enhance the discovery potential for new physics by enabling model-independent event selection directly at the hardware trigger level, operating at the 40 MHz LHC collision rate within...
The Analysis Grand Challenge (AGC) showcases an example of HEP analysis. Its reference implementation uses modern Python packages to realize the main steps, from data access to statistical model building and fitting. The packages used for data handling and processing (coffea, uproot, awkward-array) have recently undergone a series of performance optimizations.
While not being part of the HEP...
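A minimal columnar read with the packages mentioned above; the file and branch names are placeholders:

```python
import awkward as ak
import uproot

# Read a jagged branch from a ROOT file into an Awkward Array.
with uproot.open("events.root") as f:
    jet_pt = f["Events"]["Jet_pt"].array()   # one variable-length list per event

# Columnar selection without an explicit event loop.
n_high_pt = ak.sum(jet_pt > 30.0, axis=1)
print(n_high_pt[:5].tolist())
```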
In recent years, Awkward Array, Uproot, and related packages have become the go-to solutions for performing High-Energy Physics (HEP) analyses. Their development is driven by user experience and feedback, with the community actively shaping their evolution. User requests for new features and functionality play a pivotal role in guiding these projects.
For example, the Awkward development...
One of the main points of object reconstruction is the definition of the targets of our reconstruction. In this talk we present recent developments on the topic, focusing on how we can embed detector constraints, mainly calorimeter granularity, in our truth information and how this can impact the performance of the reconstruction, in particular for Machine Learning based approaches. We will...
The simulation of calorimeter showers is computationally expensive, leading to the development of generative models as an alternative. Many of these models face challenges in balancing generation quality and speed. A key issue damaging the simulation quality is the inaccurate modeling of distribution tails. Normalizing flow (NF) models offer a trade-off between accuracy and speed, making them...
To standardize the evaluation of computational capabilities across various hardware architectures in data centers, we developed a CPU performance benchmarking tool within the HEP-Score framework. The tool uses the JUNO offline software as a realistic workload and produces standardized outputs aligned with HEP-Score requirements. Our tests demonstrate strong linear performance characteristics...
Astronomical satellites serve as critical infrastructure in the field of astrophysics, and data processing is one of the most essential processes for conducting scientific research on cosmic evolution, celestial activities, and dark matter. Recent advancements in satellite sensor resolution and sensitivity have led to petabyte (PB)-scale data volumes, characterized by unprecedented scale and...
The Belle II experiment at the SuperKEKB accelerator in Tsukuba, Japan, searches for physics beyond the Standard Model, with a focus on precise measurements of flavor physics observables. Highly accurate Monte Carlo simulations are essential for this endeavor, as they must correctly model the variations in detector conditions and beam backgrounds that occur during data collection. To meet this...
The HEP communities have developed an increasing interest in High Performance Computing (HPC) centres, as these hold the potential of providing significant computing resources to the current and future experiments. At the Large Hadron Collider (LHC), the ATLAS and CMS experiments are challenged with a scale-up of several factors in computing for the High Luminosity LHC (HL-LHC) Run 4,...
The HEP communities have developed an increasing interest in High Performance Computing (HPC) centres, as these hold the potential of providing significant computing resources to the current and future experiments. At the Large Hadron Collider (LHC), the ATLAS and CMS experiments are challenged with a scale-up of several factors in computing for the High Luminosity LHC (HL-LHC) Run 4,...
Hyperparameter optimization plays a crucial role in achieving high performance and robustness for machine learning models, such as those used in complex classification tasks in High Energy Physics (HEP).
In this study, we investigate the usage of $\texttt{Optuna}$, a modern and scalable optimization tool, in the framework of a realistic signal-versus-background...
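A minimal Optuna study illustrating the search API; the hyperparameters and the stand-in objective are hypothetical, and a real study would train the classifier and return a validation metric instead:

```python
import optuna

def objective(trial):
    # Hypothetical hyperparameters of a signal-versus-background classifier.
    learning_rate = trial.suggest_float("learning_rate", 1e-4, 1e-1, log=True)
    n_layers = trial.suggest_int("n_layers", 1, 4)
    # Stand-in score; a real objective would train and evaluate the model here.
    return (learning_rate - 1e-2) ** 2 + 0.01 * n_layers

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=50)
print(study.best_params)
```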
In large-scale distributed computing systems, workload dispatching and the associated data management are critical factors that determine key metrics such as resource utilization of distributed computing and resilience of scientific workflows. As the Large Hadron Collider (LHC) advances into its high luminosity era, the ATLAS distributed computing infrastructure must improve these metrics to...
While physical systems are often described in high-dimensional spaces, they frequently exhibit hidden low-dimensional structures. A powerful way to exploit this characteristic is through sparsity. In this talk, we explore the role of sparsity in neural networks in two key contexts: (1) in generative models, particularly diffusion models, where we demonstrate how sparsity can accelerate the...
The High Luminosity Large Hadron Collider (HL-LHC) and future big science experiments will generate unprecedented volumes of data, necessitating new approaches to physics analysis infrastructure. We present the SubMIT Physics Analysis Facility, an implementation of the emerging Analysis Facilities (AF) concept at MIT. Our solution combines high-throughput computing capabilities with modern...
The LHCb collaboration is currently using a pioneering data-filtering system in its trigger, based on real-time particle reconstruction using Graphics Processing Units (GPUs). This corresponds to processing 5 TB/s of data and has required a huge amount of hardware and software development. In this context, the corresponding power consumption and sustainability are an imperative matter in...
The JUNO offline software (JUNOSW) is built upon the SNiPER framework. Its multithreaded extension, MT-SNiPER, enables inter-event parallel processing and has successfully facilitated JUNOSW's parallelization. Over the past year, two rounds of JUNO Data Challenge (DC) have been conducted to validate the complete data processing chain. During these DC tasks, the performance of MT-SNiPER was...
The ASTRI (Astrofisica con Specchi a Tecnologia Replicante Italiana) Project was born as a collaborative international effort led by the Italian National Institute for Astrophysics (INAF) to design and realize an end-to-end prototype of the Small-Sized Telescope (SST) of the Cherenkov Telescope Array (CTA) in a dual-mirror configuration (2M). The prototype, named ASTRI-Horn, has been...
Jet tagging, i.e. determining the origin of high-energy hadronic jets, is a key challenge in particle physics. Jets are ubiquitous observables in collider experiments: complex collections of particles that need to be classified. Over the past decade, machine learning-based classifiers have greatly enhanced our jet tagging capabilities, with increasingly sophisticated models driving...
The High Energy Photon Source produces vast amounts of diverse, multi-modal data annually, with IO bottlenecks increasingly limiting scientific computational efficiency. To overcome this challenge, our approach introduces a threefold solution. First, we develop daisy-io, a unified IO interface designed for cross-disciplinary applications, which integrates accelerated data retrieval...
The Jiangmen Underground Neutrino Observatory (JUNO) aims to determine the neutrino mass ordering (NMO) with a 3-sigma confidence level within six years. The experiment is currently in the commissioning phase, focusing on filling the liquid scintillator and evaluating detector performance. During physics data taking, the expected data rate after the global trigger is approximately 40 GB/s,...
The increasing reliance on machine learning (ML) and particularly deep learning (DL) in scientific and industrial applications requires models that are not only accurate, but also reliable under varying conditions. This is especially important for automated machine learning and fault-tolerant systems where there is limited or no human control. In this paper, we present a novel,...
Identifying products of ultrarelativistic collisions delivered by the LHC and RHIC colliders is one of the crucial objectives of experiments such as ALICE and STAR, which are specifically designed for this task. They allow for a precise Particle Identification (PID) over a broad momentum range.
Traditionally, PID methods rely on hand-crafted selections, which compare the recorded signal of...
Graph Neural Networks (GNNs) have been in the focus of machine-learning-based track reconstruction for high-energy physics experiments during the last years. Within ATLAS, the GNN4ITk group has investigated this type of algorithm for track reconstruction at the High-Luminosity LHC (HL-LHC) using the future full-silicon Inner Tracker (ITk).
The Event Filter (EF) is part of the ATLAS Trigger...
Several methods are presented that reconstruct charged particles with lifetimes between 10 ps and 10 ns in real-time. One of the methods considers a combination of their decay products and the partial tracks created by the initial charged particle. The other methods do not require the reconstruction of the decay products of the charged particles, hence enabling the reconstruction of decays...
Reconstructing particle trajectories is a significant challenge in most particle physics experiments and a major consumer of CPU resources. It can typically be divided into three steps: seeding, track finding, and track fitting. Seeding involves identifying potential trajectory candidates, while track finding entails associating detected hits with the corresponding particle. Finally, track...
As the High-Luminosity LHC (HL-LHC) era approaches, significant improvements in reconstruction software are required to keep pace with the increased data rates and detector complexity. A persistent challenge for high-throughput event reconstruction is the estimation of track parameters, which is traditionally performed using iterative Kalman Filter-based algorithms. While GPU-based track...
Searches for long-lived particles (LLPs) have attracted much interest lately due to their high discovery potential in the LHC Run-3. Signatures featuring LLPs with long lifetimes and decaying inside the muon detectors of the CMS experiment at CERN are of particular interest. In this talk, we will describe a novel Level-1 trigger algorithm that significantly improves CMS's signal efficiency for...
The ROOT software package is a widely used data analysis framework in the High Energy Physics (HEP) community. As many other Python packages, ROOT features a powerful performance-oriented core which can be accessed in a Python application thanks to dynamic bindings. These facilitate the usage and integration of ROOT with the broader Python ecosystem.
Despite these capabilities, there is...
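A short example of the dynamic bindings in action, using RDataFrame from Python; the tree, file, and column names are placeholders:

```python
import ROOT

# The C++ interfaces are available directly from Python via the dynamic bindings.
df = ROOT.RDataFrame("Events", "events.root")

hist = (
    df.Filter("nMuon >= 2", "at least two muons")
      .Define("leading_pt", "Muon_pt[0]")
      .Histo1D(("leading_pt", "Leading muon p_{T}", 50, 0.0, 200.0), "leading_pt")
)

canvas = ROOT.TCanvas()
hist.Draw()
canvas.SaveAs("leading_pt.png")
```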
The KEDR experiment is ongoing at the VEPP-4M $e^{+}e^{-}$ collider at Budker INP in Novosibirsk. The collider center of mass energy range covers a wide spectrum from 2 to 11 GeV. Most of the up-to-date statistics were taken at the lower end of the energy range around the charmonia region. Activities at greater energies up to the bottomonia lead to a significant increase of event...
With the consequences of global warming becoming abundantly clear, physics research needs to do its part in becoming more sustainable, including its computing aspects. Many measures in this field expend great effort to keep the impact on users minimal. However, even greater savings can be gained when compromising on these expectations.
In any such approach affecting the user experience, the...
The Belle~II electromagnetic calorimeter (ECL) is not only used for measuring electromagnetic particles but also for identifying and determining the position of hadrons, particularly neutral hadrons. Recent data-taking periods have presented two challenges for the current clustering method: Firstly, the record-breaking luminosities achieved by the SuperKEKB accelerator have increased...
ServiceX, a data extraction and delivery service for HEP experiments, is being used in ATLAS to prepare training data for a long-lived particle search. The training data contains low-level features not available in the ATLAS experiment’s PHYSLITE format - making ServiceX’s ability to read complex event data (e.g. ATLAS’s xAOD format) ideally suited to solving this problem. This poster will...
The speed and fidelity of detector simulations in particle physics pose compelling questions on future LHC analysis and colliders. The sparse high-dimensional data combined with the required precision provide a challenging task for modern generative networks. We present a general framework to train generative networks on any detector geometry with minimal user input. Vision transformers allow...
The ROOT software framework is widely used in HEP for storage, processing, analysis and visualization of large datasets. With the large increase in the usage of ML for experiment workflows, especially in the final steps of the analysis pipeline, the matter of exposing ROOT data ergonomically to ML models becomes ever more pressing. In this contribution we discuss the experimental component...