Conveners
Track 2: Data Analysis - Algorithms and Tools
- Abhijith Gandrakota (Fermi National Accelerator Lab. (US))
- Javier Mauricio Duarte (Univ. of California San Diego (US))
Track 2: Data Analysis - Algorithms and Tools
- Nhan Tran (Fermi National Accelerator Lab. (US))
- Javier Mauricio Duarte (Univ. of California San Diego (US))
Track 2: Data Analysis - Algorithms and Tools
- Joosep Pata (National Institute of Chemical Physics and Biophysics (EE))
- Yihui Ren
Track 2: Data Analysis - Algorithms and Tools
- Yihui Ren
- Haiwang Yu
Track 2: Data Analysis - Algorithms and Tools
- Gage DeZoort (Princeton University (US))
- Aleksandra Ciprijanovic (Fermi National Accelerator Laboratory)
Track 2: Data Analysis - Algorithms and Tools
- Lindsey Gray (Fermi National Accelerator Lab. (US))
- Lukas Alexander Heinrich (Technische Universitat Munchen (DE))
Track 2: Data Analysis - Algorithms and Tools
- Lukas Alexander Heinrich (Technische Universitat Munchen (DE))
- Lindsey Gray (Fermi National Accelerator Lab. (US))
Based on the tracking experience at the LHC, the A Common Tracking Software (ACTS) project aims to provide open-source, experiment- and framework-independent software designed for modern computing architectures. It provides a set of high-level, performant track reconstruction tools that are agnostic to the details of the detection technologies and magnetic field configuration, and...
Charged-particle reconstruction is one of the most computationally heavy components of the full event reconstruction at Large Hadron Collider (LHC) experiments. Looking to the future, projections for the High-Luminosity LHC (HL-LHC) indicate superlinear growth in the computing resources required by single-threaded CPU algorithms, surpassing the computing resources that are expected to be...
In particle physics, machine learning algorithms have traditionally faced a limitation due to the lack of truth labels in real data, restricting training to simulated samples only. This study addresses this challenge by employing self-supervised learning, which enables the utilization of vast amounts of unlabeled real data, thereby facilitating more effective training.
Our project is particularly...
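As a rough illustration of the self-supervised idea (a minimal sketch with placeholder features, augmentations and network sizes, not the setup used in this study), two augmented views of the same unlabeled event can be pulled together in an embedding space with a contrastive objective:

```python
# Illustrative sketch only: a SimCLR-style contrastive objective on unlabeled
# events. Network sizes, augmentations and data are placeholders, not the
# configuration used in this work.
import torch
import torch.nn.functional as F

encoder = torch.nn.Sequential(
    torch.nn.Linear(16, 64), torch.nn.ReLU(), torch.nn.Linear(64, 8)
)

def augment(x):
    # stand-in "detector-like" augmentation: small Gaussian smearing
    return x + 0.05 * torch.randn_like(x)

def contrastive_loss(z1, z2, temperature=0.1):
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.T / temperature      # similarity of view 1 vs view 2
    labels = torch.arange(z1.size(0))     # positive pairs lie on the diagonal
    return F.cross_entropy(logits, labels)

batch = torch.randn(128, 16)              # unlabeled "real data" features
loss = contrastive_loss(encoder(augment(batch)), encoder(augment(batch)))
loss.backward()
```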
The sub-optimal scaling of traditional tracking algorithms based on combinatorial Kalman filters causes performance concerns for future high-pileup experiments like the High Luminosity Large Hadron Collider. Graph Neural Network-based tracking approaches have been shown to significantly improve scaling at similar tracking performance levels. Rather than employing the popular edge...
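For context, the edge-classification baseline referred to above scores candidate hit-to-hit connections and keeps the high-scoring ones; the toy sketch below uses hypothetical hit features and a placeholder MLP, and is not the architecture studied in this contribution:

```python
# Toy edge-classification step for GNN tracking: score candidate hit-to-hit
# connections with a small MLP. Features, graph construction and model are
# placeholders for illustration only.
import torch

hits = torch.randn(200, 3)                   # (r, phi, z)-like hit features
src, dst = torch.randint(0, 200, (2, 1000))  # candidate edges from some graph builder

edge_mlp = torch.nn.Sequential(
    torch.nn.Linear(6, 32), torch.nn.ReLU(), torch.nn.Linear(32, 1)
)

edge_features = torch.cat([hits[src], hits[dst]], dim=1)
edge_scores = torch.sigmoid(edge_mlp(edge_features)).squeeze(-1)

# edges above threshold would be kept and joined into track candidates
keep = edge_scores > 0.5
```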
A new algorithm, called "Downstream", has been developed at LHCb which is able to reconstruct and select very displaced vertices in real time at the first level of the trigger (HLT1). It makes use of the Upstream Tracker (UT) and the Scintillating Fibre (SciFi) tracker of LHCb, and it is executed on GPUs inside the Allen framework. In addition to an optimized strategy, it utilizes a Neural...
The Liquid Argon Time Projection Chamber (LArTPC) is a scalable tracking calorimeter featuring rich event topology information. It provides the core detector technology for many current and next-generation large-scale neutrino experiments, such as DUNE and the SBN program. In neutrino experiments, LArTPCs face numerous challenges in both hardware and software to achieve optimum performance. On...
The Jiangmen Underground Neutrino Observatory (JUNO), located in Southern China, is a multi-purpose neutrino experiment that consists of a 20-kton liquid scintillator detector. The primary goal of the experiment is to determine the neutrino mass ordering (NMO) and measure other neutrino oscillation parameters to sub-percent precision. Atmospheric neutrinos are sensitive to NMO via matter...
Track reconstruction is a crucial task in particle experiments and is traditionally very computationally expensive due to its combinatorial nature. Many recent developments have explored new tracking algorithms in order to improve scalability in preparation for the HL-LHC. In particular, graph neural networks (GNNs) have emerged as a promising approach due to the graph nature of particle...
Tracking is one of the most crucial components of reconstruction in collider experiments. It is known for its high consumption of computing resources, and various innovations have been introduced to address this. Future colliders such as the High-Luminosity Large Hadron Collider (HL-LHC) will face an even greater demand for computing resources. The use of cutting-edge artificial...
The emergence of models pre-trained on simple tasks and then fine-tuned to solve many downstream tasks has become a mainstay for the application of deep learning within a large variety of domains. The models, often referred to as foundation models, aim, through self-supervision, to simplify complex tasks by extracting the most salient features of the data through a careful choice of...
Particle detectors play a pivotal role in the field of high-energy physics. Traditionally, detectors are characterized by their responses to various particle types, gauged through metrics such as energy or momentum resolutions. While these characteristics are instrumental in determining particle properties, they fall short of addressing the initial challenge of reconstructing particles.
We...
In this work we demonstrate that significant gains in performance and data efficiency can be achieved moving beyond the standard paradigm of sequential optimization in High Energy Physics (HEP). We conceptually connect HEP reconstruction and analysis to modern machine learning workflows such as pretraining, finetuning, domain adaptation and high-dimensional embedding spaces and quantify the...
Foundation models have revolutionized natural language processing, demonstrating exceptional capabilities in handling sequential data. Their ability to generalize across tasks and datasets offers promising applications in high energy physics (HEP). However, collider physics data, unlike language, involves both continuous and discrete data types, including four-vectors, particle IDs, charges,...
Accurately reconstructing particles from detector data is a critical challenge in experimental particle physics. The detector's spatial resolution, specifically the calorimeter's granularity, plays a crucial role in determining the quality of the particle reconstruction. It also sets the upper limit for the algorithm's theoretical capabilities. Super-resolution techniques can be explored as a...
Hadronization is a critical step in the simulation of high-energy particle and nuclear physics experiments. As there is no first principles understanding of this process, physically-inspired hadronization models have a large number of parameters that are fit to data. We propose an alternative approach that uses deep generative models, which are a natural replacement for classical techniques,...
High Energy Physics (HEP) experiments rely on scientific simulation to develop reconstruction algorithms. Despite the remarkable fidelity of modern simulation frameworks, residual discrepancies between simulated and real data introduce a challenging domain shift problem. The existence of this issue raises significant concerns regarding the feasibility of implementing Deep Learning (DL) methods...
Statistical anomaly detection empowered by AI is a subject of growing interest at collider experiments, as it provides multidimensional and highly automated solutions for signal-agnostic data quality monitoring, data validation and new physics searches.
AI-based anomaly detection techniques mainly rely on unsupervised or semi-supervised machine learning tasks. One of the most crucial and...
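One common unsupervised building block is an autoencoder whose reconstruction error serves as an anomaly score; the sketch below is purely illustrative, with placeholder data and model, and is not the specific method presented in this contribution:

```python
# Minimal unsupervised anomaly-detection sketch: an autoencoder trained on a
# reference sample, with the per-event reconstruction error as anomaly score.
import torch

autoencoder = torch.nn.Sequential(
    torch.nn.Linear(20, 8), torch.nn.ReLU(),   # encoder
    torch.nn.Linear(8, 20),                    # decoder
)

reference = torch.randn(1000, 20)              # "well-modelled" reference events
optimizer = torch.optim.Adam(autoencoder.parameters(), lr=1e-3)
for _ in range(100):
    optimizer.zero_grad()
    loss = torch.mean((autoencoder(reference) - reference) ** 2)
    loss.backward()
    optimizer.step()

# events that reconstruct poorly are flagged as anomalous
test = torch.randn(100, 20)
anomaly_score = torch.mean((autoencoder(test) - test) ** 2, dim=1)
```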
The Jiangmen Underground Neutrino Observatory (JUNO) is a next-generation large (20 kton) liquid-scintillator neutrino detector, which is designed to determine the neutrino mass ordering from its precise reactor neutrino spectrum measurement. Moreover, high-energy (GeV-level) atmospheric neutrino measurements could also improve its sensitivity to mass ordering via matter effects on...
Supervised learning has been used successfully for jet classification and to predict a range of jet properties, such as mass and energy. Each model learns to encode jet features, resulting in a representation that is tailored to its specific task. But could the common elements underlying such tasks be combined in a single foundation model to extract features generically? To address this...
We describe the principles and performance of the first-level ("L1") hardware track trigger of Belle II, based on neural networks. The networks use as input the results from the standard Belle II trigger, which provides "2D" track candidates in the plane transverse to the electron-positron beams. The networks then provide estimates for the origin of the 2D track candidates in the direction of...
Navigating the demanding landscapes of real-time and offline data processing at the Large Hadron Collider (LHC) requires the deployment of fast and robust machine learning (ML) models for advancements in beyond-the-Standard-Model (BSM) discovery. This presentation explores recent breakthroughs in this realm, focusing on the use of knowledge distillation to imbue efficient model architectures with...
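For reference, the basic distillation mechanism trains a small student to match the temperature-softened outputs of a larger teacher; the sketch below uses placeholder architectures and data and is not tied to the models discussed in this talk:

```python
# Generic knowledge-distillation loss: the student matches the teacher's
# temperature-softened class probabilities. Architectures and data are placeholders.
import torch
import torch.nn.functional as F

teacher = torch.nn.Sequential(torch.nn.Linear(30, 256), torch.nn.ReLU(), torch.nn.Linear(256, 5))
student = torch.nn.Sequential(torch.nn.Linear(30, 16), torch.nn.ReLU(), torch.nn.Linear(16, 5))

def distillation_loss(x, T=4.0):
    with torch.no_grad():
        teacher_probs = F.softmax(teacher(x) / T, dim=1)
    student_logp = F.log_softmax(student(x) / T, dim=1)
    # KL divergence between softened distributions, scaled by T^2 as is customary
    return F.kl_div(student_logp, teacher_probs, reduction="batchmean") * T * T

loss = distillation_loss(torch.randn(64, 30))
loss.backward()
```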
Equivariant models have provided state-of-the-art performance in many ML applications, from image recognition to chemistry and beyond. In particle physics, the relevant symmetries are permutations and the Lorentz group, and the best-performing networks are either custom-built Lorentz-equivariant architectures or more generic large transformer models. A major unanswered question is whether the...
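One way to make Lorentz symmetry concrete (an illustrative sketch, not one of the architectures compared in this contribution) is to feed a network only pairwise Minkowski inner products of the constituent four-momenta, which are invariant by construction:

```python
# Lorentz-invariant pairwise features from four-momenta (E, px, py, pz),
# metric signature (+,-,-,-). Toy jet values are placeholders.
import numpy as np

def minkowski_dot(p, q):
    return p[..., 0] * q[..., 0] - np.sum(p[..., 1:] * q[..., 1:], axis=-1)

jet = np.array([[50.0, 10.0, 20.0, 40.0],
                [30.0,  5.0, 15.0, 25.0],
                [20.0,  2.0,  8.0, 18.0]])

# unchanged under any proper Lorentz transformation applied to the whole jet
pairwise_invariants = minkowski_dot(jet[:, None, :], jet[None, :, :])
print(pairwise_invariants)
```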
In recent years, we have seen a rapid increase in the variety of computational architectures, featuring GPUs from multiple vendors, a trend that will likely continue with the possible rise of new accelerators. The High Energy Physics (HEP) community employs a wide variety of algorithms for accelerators, which are mostly vendor-specific, but there is a compelling demand to expand...
Stable operation of the detector is essential for high-quality data taking in high-energy physics experiments. However, it is not easy to keep the detector running stably throughout the data-taking period in an environment with a high beam-induced background. In the BESIII experiment, severe beam-related background can cause instability of the high voltages in the drift chamber, which is the innermost sub...
Measuring neutrino mass ordering (NMO) poses a fundamental challenge in neutrino physics. To address this, the Jiangmen Underground Neutrino Observatory (JUNO) experiment is scheduled to commence data collection in late 2024, with the ambitious goal of determining the NMO at a 3-sigma confidence level within a span of 6 years. A key factor in achieving this is ensuring a high-quality energy...
The CMS experiment has recently established a new Common Analysis Tools (CAT) group. The CAT group provides a forum for the discussion, dissemination, organization and development of analysis tools, broadly bridging the gap between the CMS data and simulation datasets and the publication-grade plots and results. In this talk we discuss some of the recent developments carried out in the...
In the contemporary landscape of advanced statistical analysis toolkits, ranging from Bayesian inference to machine learning, the seemingly straightforward concept of a histogram often goes unnoticed. However, the power and compactness of partially aggregated, multi-dimensional summary statistics with a fundamental connection to differential and integral calculus make them formidable...
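As a concrete example of such a partially aggregated, multi-dimensional summary, the sketch below uses the boost-histogram library, chosen here purely for illustration:

```python
# Multi-dimensional histogram as a partially aggregated summary statistic;
# data and binning are placeholders.
import numpy as np
import boost_histogram as bh

h = bh.Histogram(
    bh.axis.Regular(50, 0, 200, metadata="pT [GeV]"),
    bh.axis.Regular(30, -2.5, 2.5, metadata="eta"),
)

pt = np.random.exponential(40, size=100_000)
eta = np.random.normal(0, 1, size=100_000)
h.fill(pt, eta)

pt_only = h.project(0)      # integrate out eta: a 1D summary of the same data
counts = pt_only.view()     # bin contents as a NumPy array
edges = pt_only.axes[0].edges
```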
Declarative Analysis Languages (DALs) are a paradigm for high-energy physics analysis that separates the desired results from the implementation details. DALs enable physicists to use the same software to work with different experiments' data formats, without worrying about the low-level details or the software infrastructure available. DALs have gained popularity since the HEP Analysis...
Traditionally, analyses of data from experiments such as LZ and XENONnT have relied on summary statistics of large sets of simulated data, generated using emissions models for particle interactions in liquid xenon such as NEST. As these emissions models are probabilistic in nature, they are a natural candidate to be implemented in a probabilistic programming framework. This would also allow...
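A minimal sketch of what such a probabilistic-programming formulation could look like, using NumPyro and a heavily simplified, hypothetical stand-in for a NEST-like emissions model (not the actual model or framework chosen in this work):

```python
# Hypothetical, simplified emissions model expressed as a probabilistic program:
# latent energy deposit -> expected scintillation photons -> observed S1 signal.
import jax.numpy as jnp
from jax import random
import numpyro
import numpyro.distributions as dist
from numpyro.infer import MCMC, NUTS

def emission_model(observed_s1=None):
    energy = numpyro.sample("energy", dist.Uniform(1.0, 100.0))          # keV (placeholder prior)
    light_yield = numpyro.sample("light_yield", dist.Normal(8.0, 1.0))   # photons/keV (placeholder)
    n_photons = energy * light_yield
    numpyro.sample("s1", dist.Normal(n_photons, jnp.sqrt(n_photons)), obs=observed_s1)

mcmc = MCMC(NUTS(emission_model), num_warmup=500, num_samples=1000)
mcmc.run(random.PRNGKey(0), observed_s1=jnp.array([420.0]))
mcmc.print_summary()
```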
Scientific experiments and computations, particularly in Nuclear Physics (NP) and High Energy Physics (HEP) programs, are generating and accumulating data at an unprecedented rate. Big data presents opportunities for groundbreaking scientific discoveries. However, managing this vast amount of data cost-effectively while facilitating efficient data analysis within a large-scale, multi-tiered...
As part of the Scientific Discovery through Advanced Computing (SciDAC) program, the Quantum Chromodynamics Nuclear Tomography (QuantOM) project aims to analyze data from Deep Inelastic Scattering (DIS) experiments conducted at Jefferson Lab and the upcoming Electron Ion Collider. The DIS data analysis is performed on an event-level by leveraging nuclear theory models and accounting for...
dilax is a software package for statistical inference using likelihood functions of binned data. It fulfils three key concepts: performance, differentiability, and object-oriented statistical model building. dilax is built on JAX, a powerful autodifferentiation Python framework. By making every component in dilax a "PyTree", each component can be jit-compiled (jax.jit), vectorized...
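To illustrate the underlying JAX mechanics (generic JAX code, not the dilax API itself): any parameter container that is a PyTree, such as a NamedTuple, can be passed through jit-compiled, differentiable likelihood code.

```python
# PyTree + jit pattern for a binned Poisson likelihood; illustrative values only.
from typing import NamedTuple
import jax
import jax.numpy as jnp

class Params(NamedTuple):   # NamedTuples are PyTrees, so JAX can trace through them
    mu: jnp.ndarray         # signal strength
    norm: jnp.ndarray       # background normalisation

signal = jnp.array([5.0, 10.0, 3.0])
background = jnp.array([50.0, 40.0, 30.0])
observed = jnp.array([57.0, 48.0, 35.0])

@jax.jit
def nll(params: Params) -> jnp.ndarray:
    expected = params.mu * signal + params.norm * background
    return -jnp.sum(observed * jnp.log(expected) - expected)

params = Params(mu=jnp.array(1.0), norm=jnp.array(1.0))
value, grads = jax.value_and_grad(nll)(params)   # exact gradients via autodiff
```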