Conveners
Track 2: Data Analysis - Algorithms and Tools
- Adriano Di Florio (Politecnico e INFN, Bari)
- Sophie Berkman
- Enrico Guiraud (EP-SFT, CERN)
- Tony Di Pilato (CASUS - Center for Advanced Systems Understanding (DE))
- Claudio Caputo (Universite Catholique de Louvain (UCL) (BE))
- Gregor Kasieczka (Hamburg University (DE))
- Felice Pantaleo (CERN)
- Dalila Salamani (CERN)
- Davide Valsecchi (ETH Zurich (CH))
- Thomas Owen James (Imperial College (GB))
- Patrick Rieck (New York University (US))
- Erica Brondolin (CERN)
- Piyush Raikwar
- Jennifer Ngadiuba (INFN, Milano)
For the last decade, the so-called Fourth Industrial Revolution has been ongoing. It is a profound transformation of industry, in which new technologies such as smart automation, large-scale machine-to-machine communication, and the Internet of Things are largely changing traditional manufacturing and industrial practices. The analysis of the huge amount of data, collected in all modern...
Signal-background classification is a central problem in High-Energy Physics (HEP), which plays a major role in the discovery of new fundamental particles. The recently proposed Parametric Neural Network (pNN) leverages multiple signal mass hypotheses as an additional input feature to effectively replace a whole set of individual neural classifiers, each providing (in principle) the best...
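A minimal sketch of the pNN idea, assuming a PyTorch implementation (layer sizes and the 500 GeV mass value are illustrative, not taken from the contribution): the signal mass hypothesis is concatenated to the event features, so a single classifier serves all mass points.

```python
# Parametric NN sketch: the mass hypothesis is an extra input feature.
import torch
import torch.nn as nn

class ParametricNN(nn.Module):
    def __init__(self, n_features: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features + 1, hidden),  # +1 for the mass hypothesis
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x: torch.Tensor, mass: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_features) event features; mass: (batch, 1) hypothesis.
        # For background events a mass value is typically sampled from the
        # signal grid during training.
        return torch.sigmoid(self.net(torch.cat([x, mass], dim=1)))

model = ParametricNN(n_features=10)
scores = model(torch.randn(32, 10), torch.full((32, 1), 500.0))  # e.g. 500 GeV
```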
The publication of full likelihood functions (LFs) of LHC results is vital for a long-lasting and profitable legacy of the LHC. Although major steps have been taken in this direction, the systematic publication of LFs remains a big challenge in High Energy Physics (HEP), as such distributions are usually quite complex and high-dimensional. Thus, we propose to describe LFs with Normalizing...
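For illustration only (not the proposed tool), a single affine-coupling flow trained by maximum likelihood in PyTorch shows the mechanics of fitting a density such as a low-dimensional likelihood surface; the toy data and network sizes are placeholders, and real applications stack several coupling layers.

```python
# One affine coupling layer of a normalizing flow, trained by maximum likelihood.
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    def __init__(self, dim: int, hidden: int = 64):
        super().__init__()
        self.half = dim // 2
        self.net = nn.Sequential(
            nn.Linear(self.half, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * (dim - self.half)),
        )

    def forward(self, x):
        # Map x -> z and return log|det J| for the change of variables.
        x1, x2 = x[:, :self.half], x[:, self.half:]
        s, t = self.net(x1).chunk(2, dim=1)
        s = torch.tanh(s)                     # keep the scale well-behaved
        z2 = x2 * torch.exp(s) + t
        return torch.cat([x1, z2], dim=1), s.sum(dim=1)

flow = AffineCoupling(dim=2)
base = torch.distributions.MultivariateNormal(torch.zeros(2), torch.eye(2))
opt = torch.optim.Adam(flow.parameters(), lr=1e-2)

x = torch.randn(4096, 2) * torch.tensor([1.0, 0.3])   # stand-in "LF samples"
for _ in range(200):
    z, logdet = flow(x)
    loss = -(base.log_prob(z) + logdet).mean()         # negative log-likelihood
    opt.zero_grad(); loss.backward(); opt.step()
```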
We present a novel computational approach for extracting weak signals, whose exact location and width may be unknown, from complex background distributions with an arbitrary functional form. We focus on datasets that can be naturally presented as binned integer counts, demonstrating our approach on the datasets from the Large Hadron Collider. Our approach is based on Gaussian Process (GP)...
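A rough sketch of this idea using scikit-learn's GaussianProcessRegressor on a toy falling spectrum; the kernel, length scale and injected signal are invented for illustration, and the contribution's treatment of Poisson counts is more careful than the Gaussian approximation used here.

```python
# Fit a smooth background to binned counts with a GP; a weak signal shows
# up as a coherent excess in the residuals.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(0)
edges = np.linspace(100.0, 200.0, 51)            # toy invariant-mass bins
centers = 0.5 * (edges[:-1] + edges[1:])
background = 1000.0 * np.exp(-centers / 50.0)    # arbitrary falling spectrum
signal = 30.0 * np.exp(-0.5 * ((centers - 150.0) / 2.0) ** 2)
counts = rng.poisson(background + signal)

gp = GaussianProcessRegressor(
    kernel=ConstantKernel(1e3) * RBF(length_scale=20.0),
    alpha=np.maximum(counts, 1.0),               # ~ per-bin Poisson variance
    normalize_y=True,
)
gp.fit(centers.reshape(-1, 1), counts)
prediction = gp.predict(centers.reshape(-1, 1))
residuals = counts - prediction                  # excess hints at a signal
```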
We present an end-to-end reconstruction algorithm to build particle candidates from detector hits in next-generation granular calorimeters similar to that foreseen for the high-luminosity upgrade of the CMS detector. The algorithm exploits a distance-weighted graph neural network, trained with object condensation, a graph segmentation technique. Through a single-shot approach, the...
CLUE (CLUsters of Energy) is a fast, fully parallelizable clustering algorithm developed to optimize this crucial step in the event reconstruction chain of future high-granularity calorimeters. The main drawback of the unprecedentedly high segmentation of this kind of detector is a huge computational load that, in the case of CMS, must be reduced to fit the harsh requirements of the...
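A compact NumPy illustration of the density-based logic behind CLUE (local density, nearest-higher links, seeds, followers); the official implementation is parallelized and differs in details such as spatial tiling and outlier handling, and all thresholds below are placeholders.

```python
# CLUE-like clustering: density + nearest-higher-density links.
import numpy as np

def clue_like_clustering(xy, energy, dc=1.0, delta_seed=2.0, rho_seed=5.0):
    n = len(xy)
    dist = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    rho = (energy[None, :] * (dist < dc)).sum(axis=1)        # local density

    nearest_higher = np.full(n, -1)
    delta = np.full(n, np.inf)
    for i in range(n):
        higher = np.where(rho > rho[i])[0]
        if higher.size:
            j = higher[np.argmin(dist[i, higher])]
            nearest_higher[i], delta[i] = j, dist[i, j]

    labels = np.full(n, -1)
    seeds = np.where((rho > rho_seed) & (delta > delta_seed))[0]
    labels[seeds] = np.arange(seeds.size)
    # Propagate cluster indices by walking down in density.
    for i in np.argsort(-rho):
        if labels[i] == -1 and nearest_higher[i] != -1:
            labels[i] = labels[nearest_higher[i]]
    return labels

rng = np.random.default_rng(4)
hits = np.vstack([rng.normal([0, 0], 0.3, (50, 2)),
                  rng.normal([5, 5], 0.3, (50, 2))])
print(clue_like_clustering(hits, np.ones(len(hits))))
```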
We propose a novel neural architecture that enforces an upper bound on the Lipschitz constant of the neural network (by constraining the norm of its gradient with respect to the inputs). This architecture was useful in developing new algorithms for the LHCb trigger which have robustness guarantees as well as powerful inductive biases leveraging the neural network’s ability to be monotonic in...
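A minimal sketch, assuming PyTorch, of one way to obtain the two ingredients described above: each layer's spectral norm is rescaled so that the product of layer norms bounds the network's Lipschitz constant, and a linear term in the chosen features enforces monotonicity. This is an illustration, not the LHCb implementation.

```python
# Lipschitz-bounded network with a residual term that guarantees monotonicity.
import torch
import torch.nn as nn

class LipschitzMonotonicNet(nn.Module):
    def __init__(self, n_features, monotone_idx, lip_const=1.0, hidden=32):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.Linear(n_features, hidden),
            nn.Linear(hidden, hidden),
            nn.Linear(hidden, 1),
        ])
        self.act = nn.ReLU()                     # 1-Lipschitz activation
        self.monotone_idx = monotone_idx
        self.lip_const = lip_const
        # Per-layer norm budget so the product of norms equals lip_const.
        self.layer_bound = lip_const ** (1.0 / len(self.layers))

    def _constrained_weight(self, layer):
        norm = torch.linalg.matrix_norm(layer.weight, ord=2)   # spectral norm
        scale = torch.clamp(self.layer_bound / (norm + 1e-12), max=1.0)
        return layer.weight * scale

    def forward(self, x):
        h = x
        for i, layer in enumerate(self.layers):
            h = nn.functional.linear(h, self._constrained_weight(layer), layer.bias)
            if i < len(self.layers) - 1:
                h = self.act(h)
        # The constrained part has gradient >= -lip_const in every input, so
        # adding lip_const * (sum of monotonic inputs) makes the output
        # non-decreasing in those features.
        return h + self.lip_const * x[:, self.monotone_idx].sum(dim=1, keepdim=True)

net = LipschitzMonotonicNet(n_features=5, monotone_idx=[0])
out = net(torch.randn(8, 5))
```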
Jet tagging is a critical yet challenging classification task in particle physics. While deep learning has transformed jet tagging and significantly improved performance, the lack of a large-scale public dataset impedes further enhancement. In this work, we present JetClass, a new comprehensive dataset for jet tagging. The JetClass dataset consists of 100 M jets, about two orders of magnitude...
While modern architectures designed via geometric deep learning achieve high accuracies thanks to Lorentz group invariance, this comes at a high computational cost. Moreover, such frameworks are restricted to a particular classification scheme and lack interpretability.
To tackle these issues, we present BIP, an efficient and computationally cheap framework to build rotational,...
Track fitting and track hit classification are closely related tasks, hence the two approaches could benefit from each other. For example, if we know the underlying parameters of a track, then the hits associated with it can be easily identified. Conversely, if we know the hits of a track, then we can obtain its underlying parameters by fitting them. Most existing works take the second scheme...
Graph Neural Networks (GNN) have recently attained competitive particle track reconstruction performance compared to traditional approaches such as combinatorial Kalman filters. In this work, we implement a version of Hierarchical Graph Neural Networks (HGNN) for track reconstruction, which creates the hierarchy dynamically. The HGNN creates “supernodes” by pooling nodes into clusters, and...
Particle track reconstruction poses a key computing challenge for future collider experiments. Quantum computing carries the potential for exponential speedups and the rapid progress in quantum hardware might make it possible to address the problem of particle tracking in the near future. The solution of the tracking problem can be encoded in the ground state of a Quadratic Unconstrained...
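To make the encoding concrete, the sketch below builds a toy QUBO over a handful of binary segment-selection variables and minimizes it by brute force; the Q matrix here is random for illustration, whereas in the tracking case its entries encode segment compatibility, and a quantum annealer or variational solver would replace the exhaustive search.

```python
# Toy QUBO: find x in {0,1}^n minimizing E(x) = x^T Q x.
import itertools
import numpy as np

n = 6                                    # number of candidate segments
rng = np.random.default_rng(1)
Q = rng.normal(size=(n, n))
Q = 0.5 * (Q + Q.T)                      # symmetric QUBO matrix (illustrative)

def qubo_energy(x, Q):
    return x @ Q @ x                     # diagonal terms act as linear biases

best = min(itertools.product([0, 1], repeat=n),
           key=lambda bits: qubo_energy(np.array(bits), Q))
print("minimum-energy segment selection:", best)
```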
As part of the Run 3 upgrade, the LHCb experiment has switched to a two-stage event trigger, fully implemented in software. The first stage of this trigger, running in real time at the collision rate of 30 MHz, is entirely implemented on commercial off-the-shelf GPUs and performs a partial reconstruction of the events.
We developed a novel strategy for this reconstruction, starting with two...
The use of hardware acceleration, particularly of GPGPUs, is one promising strategy for coping with the computing demands in the upcoming high-luminosity era of the LHC and beyond. Track reconstruction, in particular, suffers from exploding combinatorics and thus could greatly profit from the massively parallel nature of GPGPUs and other accelerators. However, classical pattern recognition...
The continuous growth in model complexity in high-energy physics (HEP) collider experiments demands increasingly time-consuming model fits. We show first results on the application of conditional invertible networks (cINNs) to this challenge. Specifically, we construct and train a cINN to learn the mapping from signal strength modifiers to observables and its inverse. The resulting network...
Constraining cosmological parameters, such as the amount of dark matter and dark energy, to high precision requires very large quantities of data. Modern survey experiments like DES, LSST, and JWST, are acquiring these data sets. However, the volumes and complexities of these data – variety, systematics, etc. – show that traditional analysis methods are insufficient to exhaust the information...
The IRIS-HEP Analysis Grand Challenge (AGC) is designed to be a realistic environment for investigating how analysis methods scale to the demands of the HL-LHC. The analysis task is based on publicly available Open Data and allows for comparing usability and performance of different approaches and implementations. It includes all relevant workflow aspects from data delivery to statistical...
The Federation is a new machine learning technique for handling large amounts of data in a typical high-energy physics analysis. It utilizes Uniform Manifold Approximation and Projection (UMAP) to create an initial low-dimensional representation of a given data set, which is clustered by using Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN). These clusters...
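The two-step pipeline reads roughly as below, assuming the umap-learn and hdbscan Python packages; the input array and all parameter values are placeholders rather than the Federation's actual configuration.

```python
# Dimensionality reduction with UMAP followed by HDBSCAN clustering.
import numpy as np
import umap
import hdbscan

X = np.random.default_rng(2).normal(size=(5000, 20))   # placeholder event features

embedding = umap.UMAP(n_components=2, n_neighbors=30,
                      min_dist=0.0).fit_transform(X)
labels = hdbscan.HDBSCAN(min_cluster_size=50).fit_predict(embedding)
# labels == -1 marks points that HDBSCAN considers noise.
```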
Searches for new physics set exclusion limits in parameter spaces of typically up to 2 dimensions. However, the relevant theory parameter space is usually of a higher dimension but only a subspace is covered due to the computing time requirements of signal process simulations. An Active Learning approach is presented to address this limitation. Compared to the usual grid sampling, it reduces...
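A minimal active-learning loop in the same spirit (an illustration, not the presented tool): a classifier models "excluded vs. allowed" over the theory parameter space, and new simulation points are requested where it is least certain, i.e. near the current decision boundary; the toy "simulation" and all settings are invented.

```python
# Uncertainty-driven acquisition of new simulation points.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(3)

def run_simulation(points):
    # Stand-in for the expensive signal simulation + limit setting:
    # a point is "excluded" if it lies inside the unit circle.
    return (points[:, 0] ** 2 + points[:, 1] ** 2 < 1.0).astype(int)

# Initial coarse sample (two fixed points guarantee both classes appear).
X = np.vstack([[0.0, 0.0], [1.9, 1.9], rng.uniform(-2, 2, size=(20, 2))])
y = run_simulation(X)

pool = rng.uniform(-2, 2, size=(5000, 2))     # candidate parameter points
for _ in range(10):                           # acquisition iterations
    clf = RandomForestClassifier(n_estimators=200).fit(X, y)
    proba = clf.predict_proba(pool)[:, 1]
    pick = np.argsort(np.abs(proba - 0.5))[:10]   # closest to the boundary
    X = np.vstack([X, pool[pick]])
    y = np.concatenate([y, run_simulation(pool[pick])])
    pool = np.delete(pool, pick, axis=0)
```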
The Hubble Tension presents a crisis for the canonical LCDM model of modern cosmology: it may originate in systematics in data processing pipelines or it may come from new physics related to dark matter and dark energy. The aforementioned crisis can be addressed by studies of time-delayed light curves of gravitationally lensed quasars, which have the capacity to constrain the Hubble constant...
PAUS is a 40-narrow-band imaging survey using the PAUCam instrument installed at the William Herschel Telescope (WHT). Since the survey started in 2015, this instrument has acquired a unique dataset, performing a relatively deep and wide survey with, at the same time, excellent redshift accuracy. The survey is a compromise in performance between a deep spectroscopic survey and a wide-field...
RooFit is a toolkit for statistical modeling and fitting used by most experiments in particle physics. Just as data sets from next-generation experiments grow, processing requirements for physics analysis become more computationally demanding, necessitating performance optimizations for RooFit. One possibility to speed-up minimization and add stability is the use of automatic differentiation...
The growing amount of data generated by the LHC requires a shift in how HEP analysis tasks are approached. Usually, the workflow involves opening a dataset, selecting events, and computing relevant physics quantities to aggregate into histograms and summary statistics. The required processing power is often so high that the work needs to be distributed over multiple cores and multiple nodes....
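One widely used declarative approach in this direction is ROOT's RDataFrame; the single-node sketch below uses a generated toy dataset with an invented "pt" column and cut, while the contribution itself concerns scaling such workflows to many cores and nodes.

```python
# Declarative open-select-histogram workflow with ROOT's RDataFrame.
import ROOT

# ROOT.EnableImplicitMT() would parallelize the event loop over local cores;
# distributed backends aim to extend the same programming model to clusters.
df = ROOT.RDataFrame(100000).Define("pt", "gRandom->Exp(30.)")  # toy pt column
h = (df.Filter("pt > 20.", "pt cut")
       .Histo1D(("pt", "Selected pT;pT [GeV];Events", 100, 0., 200.), "pt"))
print("selected entries:", h.GetValue().GetEntries())  # triggers the lazy loop
```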
The Jiangmen Underground Neutrino Observatory (JUNO) experiment is designed to measure the neutrino mass order (NMO) using a 20-kton liquid scintillator detector to solve one of the biggest remaining puzzles in neutrino physics. Regarding the sensitivity of JUNO’s NMO measurement, besides the precise measurement of reactor neutrinos, the independent measurement of the atmospheric neutrino...
The sPHENIX experiment at RHIC requires substantial computing power for its complex reconstruction algorithms. One class of these algorithms is tasked with processing signals collected from the sPHENIX calorimeter subsystems, in order to extract signal features such as the amplitude, timing of the peak and the pedestal. These values, calculated for each channel, form the basis of event...
Nowadays, medical images play a central role in medical diagnosis, and computed tomography, nuclear magnetic resonance, ultrasound and other imaging technologies have become a powerful means of in vitro imaging. Extracting lesion information from these images enables doctors to observe and diagnose lesions more effectively, so as to improve diagnostic accuracy. Therefore,...
Earth Observation (EO) has experienced promising progress in the modern era thanks to an impressive amount of research on establishing state-of-the-art Machine Learning (ML) techniques that learn from large datasets. Meanwhile, the scientific community has also extended the boundary of ML to quantum systems, opening up a new research area, so-called Quantum Machine Learning (QML), to integrate...
In high energy physics experiments, the calorimeter is a key detector measuring the energy of particles. These particles interact with the material of the calorimeter, creating cascades of secondary particles, the so-called showers. Describing the development of these cascades relies on precise simulation methods, which are inherently slow and constitute a challenge for HEP experiments....
The prospect of possibly exponential speed-up of quantum computing compared to classical computing marks it as a promising method when searching for alternative future High Energy Physics (HEP) simulation approaches. HEP simulations like at the LHC at CERN are extraordinarily complex and, therefore, require immense amounts of computing hardware resources and computing time. For some HEP...
The Beijing Spectrometer III (BESIII) [1] is a particle physics experiment at the Beijing Electron–Positron Collider II (BEPC II) [2] which aims to study physics in the tau-charm region with high precision. BESIII has by now collected an unprecedented amount of data, significantly reducing the statistical uncertainty. Therefore, the systematic uncertainty is key to obtaining more precise...
In contemporary high energy physics (HEP) experiments the analysis of vast amounts of data represents a major challenge. In order to overcome this challenge various machine learning (ML) methods are employed. However, in addition to the choice of the ML algorithm a multitude of algorithm-specific parameters, referred to as hyperparameters, need to be specified in practical applications of ML...
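As a concrete example of such a hyperparameter scan (illustrative only; the contribution's method and search space may differ), the sketch below uses Optuna to tune a gradient-boosted classifier on a toy dataset.

```python
# Hyperparameter optimization of a toy classifier with Optuna.
import optuna
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

def objective(trial):
    clf = GradientBoostingClassifier(
        n_estimators=trial.suggest_int("n_estimators", 50, 400),
        learning_rate=trial.suggest_float("learning_rate", 1e-3, 0.3, log=True),
        max_depth=trial.suggest_int("max_depth", 2, 6),
        random_state=0,
    )
    return cross_val_score(clf, X, y, cv=3, scoring="roc_auc").mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=25)
print(study.best_params)
```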
Deep Learning algorithms are widely used among the experimental high energy physics communities and have proved to be extremely useful in addressing a variety of tasks. One field of application for which Deep Neural Networks can give a significant improvement is event selection at trigger level in collider experiments. In particular, trigger systems benefit from the implementation of Deep...
The CMS ECAL has achieved an impressive performance during the LHC Run1 and Run2. In both runs, the ultimate performance has been reached after a lengthy calibration procedure required to correct ageing-induced changes in the response of the channels. The CMS ECAL will continue its operation far beyond the ongoing LHC Run3: its barrel section will be upgraded for the LHC Phase-2 and it will be...
We have developed and implemented a machine learning based system to calibrate and control the GlueX Central Drift Chamber at Jefferson Lab, VA, in near real-time. The system monitors environmental and experimental conditions during data taking and uses those as inputs to a Gaussian process (GP) with learned prior. The GP predicts calibration constants in order to recommend a high voltage (HV)...
The online Data Quality Monitoring (DQM) system of the CMS electromagnetic calorimeter (ECAL) is a vital operations tool that allows ECAL experts to quickly identify, localize, and diagnose a broad range of detector issues that would otherwise hinder physics-quality data taking. Although the existing ECAL DQM system has been continuously updated to respond to new problems, it remains one step...
Cryogenic phonon detectors are used by direct detection dark matter experiments to achieve sensitivity to light dark matter particle interactions. Such detectors consist of a target crystal equipped with a superconducting thermometer. The temperature of the thermometer and the bias current in its readout circuit need careful optimization to achieve optimal sensitivity of the detector. This...
CORSIKA 8 is a Monte Carlo simulation framework to model ultra-high-energy secondary particle cascades in astroparticle physics. This presentation is devoted to the advances in the parallelization of CORSIKA 8, which is being developed in modern C++ and is designed to run on modern multi-threaded processors and accelerators.
Aspects such as out-of-order particle shower...
Modern cosmology surveys are producing data at rates that are soon to surpass our capacity for exhaustive analysis – in particular for the case of strong gravitational lenses. While the Dark Energy Survey may discover thousands of galaxy-scale strong lenses, the upcoming Legacy Survey of Space and Time (LSST) will find hundreds of thousands more. These large numbers of objects will make strong...