MODE (for Machine-learning Optimized Design of Experiments) is a collaboration of physicists and computer scientists who target the use of differentiable programming in design optimization of detectors for particle physics applications, extending from fundamental research at accelerators, in space, and in nuclear physics and neutrino facilities, to industrial applications employing the technology of radiation detection.
Our aim is to develop modular, customizable, scalable, and fully differentiable pipelines for the end-to-end optimization of articulated objective functions that model in full the true goals of experimental particle physics endeavours, to ensure optimal detector performance, analysis potential, and cost-effectiveness.
This workshop is part of the activities of the MODE Collaboration and of the SIVERT project.
Location:
The 2023 workshop will take place at Princeton University in Princeton, NJ, USA, following the 2nd workshop in Crete in 2022 and 1st workshop in Louvain-La-Neuve in 2021.
Organizing Committee:
Scientific Advisory Committee:
Princeton Local Organizers:
Funding agencies:
This workshop is partially supported by the joint ECFA-NuPECC-APPEC Activities (JENAA).
This workshop is partially supported by National Science Foundation grant OAC-1836650 (IRIS-HEP).
Accurate detector simulations are key components of any measurement or search for new physics. Because such simulations are inherently stochastic, ML-based generative models are a natural fit for fast, differentiable simulation. We present two such graph- and attention-based models for generating LHC-like data using sparse and efficient point cloud representations, with state-of-the-art results. We measure a three-orders-of-magnitude improvement in latency compared to LHC full simulations, and also discuss recent work on evaluation metrics for validating such ML-based fast simulations.
Highly granular pixel detectors allow for increasingly precise measurements of charged particle tracks, both in space and time. A reduction in pixel size by a factor of four in next-generation detectors will lead to unprecedented data rates, exceeding those foreseen at the High Luminosity Large Hadron Collider. Despite this increase in data volume, smart data reduction within the pixelated region of the detector will enable physics information to be extracted from the pixel detector with high efficiency and low latency, and has the potential to provide precise vertex information at the LHC bunch crossing frequency of 40 MHz (Level-1 trigger) for the first time. Using the shape of charge clusters deposited in arrays of small pixels, the physical properties of the traversing particle can be extracted by locally customized neural networks. Data from the sensor will be processed with a custom readout integrated circuit designed in 28 nm CMOS technology, capable of operating at 40 MHz and in extreme radiation environments. This talk will present a promising co-design strategy for on-chip data reduction that links the development of pixel sensors, ASICs, and algorithms in a fundamental way.
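As a rough illustration of the kind of locally customized model described above, the sketch below (cluster size, feature names, and architecture are assumptions, not the collaboration's design) maps a small pixel-cluster charge pattern to an estimate of the track's incidence angle with a tiny network whose gradients can be used for training on simulated clusters.

```python
# Hypothetical sketch only: a tiny network mapping a flattened pixel-cluster
# charge pattern to a track-angle estimate.
import jax
import jax.numpy as jnp

def init_params(key, n_in=16 * 16, n_hidden=32):
    k1, k2 = jax.random.split(key)
    return {
        "w1": 0.01 * jax.random.normal(k1, (n_in, n_hidden)),
        "b1": jnp.zeros(n_hidden),
        "w2": 0.01 * jax.random.normal(k2, (n_hidden, 1)),
        "b2": jnp.zeros(1),
    }

def predict_angle(params, cluster):
    # cluster: (16, 16) array of pixel charges
    x = cluster.reshape(-1)
    h = jax.nn.relu(x @ params["w1"] + params["b1"])
    return (h @ params["w2"] + params["b2"])[0]

def loss(params, cluster, true_angle):
    return (predict_angle(params, cluster) - true_angle) ** 2

grad_fn = jax.grad(loss)  # gradients w.r.t. the parameters, for training on simulated clusters
```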
During the past decade, the incorporation of machine learning techniques, particularly deep learning, has led to innovation within the field of (astro)particle physics. Given the success of deep learning methods, machine learning applications in direct dark matter experiments have seen increased attention. In this talk, we discuss some such applications in the current generation of direct dark matter experiments.
Detection of neutrinos at ultra-high energies (UHE, $E > 10^{17}$ eV) would open a new window to the most violent phenomena in our universe. Radio detection remains the most promising technique at these energies. However, owing to the expected small flux of UHE neutrinos, the detection rate will be small, with just a handful of events per year, even for large future facilities like the IceCube-Gen2 neutrino observatory at the South Pole.
In this contribution, we will discuss how to substantially enhance the science capabilities of UHE neutrino detectors by increasing the detection rate of neutrinos and improving the quality of each detected event, using recent advances in deep learning and differentiable programming. First, we will present neural networks that replace the threshold-based trigger foreseen for future detectors and increase the detection rate of UHE neutrinos by up to a factor of two. Second, we will outline preliminary results toward an end-to-end optimization of the detector layout using differentiable programming and deep learning. In particular, we will present results for a normalizing-flow-based energy and direction reconstruction which, for the first time, enables event-by-event non-Gaussian uncertainty predictions.
We estimate that these improvements can as much as triple the science potential of the IceCube-Gen2 radio detector.
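To make the trigger contrast concrete, here is a minimal sketch, with assumed waveform shapes and parameter names rather than the actual IceCube-Gen2 trigger code, of a fixed amplitude threshold next to a small neural-network trigger of the kind described above.

```python
# Illustrative sketch only (assumed shapes and names): a classic amplitude
# threshold versus a tiny neural-network trigger acting on a digitized waveform.
import jax
import jax.numpy as jnp

def threshold_trigger(waveform, threshold):
    # Fire if any sample exceeds the fixed amplitude threshold.
    return jnp.max(jnp.abs(waveform)) > threshold

def nn_trigger(params, waveform):
    # Small fully connected network scoring the full waveform; a trainable
    # decision boundary can recover weak signals that a fixed threshold misses.
    h = jax.nn.relu(waveform @ params["w1"] + params["b1"])
    score = jax.nn.sigmoid(h @ params["w2"] + params["b2"])
    return score > 0.5
```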
LISA (Laser Interferometer Space Antenna) is a mission to detect gravitational waves from space, in the low-frequency band, which is not accessible from the ground but has a very rich science potential. LISA is a large-class mission of the European Space Agency (ESA), with a significant participation of NASA, that was selected in 2017, after the success of the LISA Pathfinder mission, and is expected to be launched around 2034. In this talk, after summarizing the main characteristics of the mission and its science, I will describe some of the most important challenges the mission has to face for successful mission operations and science exploitation.
The computational assessment of a proposed detector design usually involves Monte-Carlo simulations of how particles interact with the detector. For example, the Bergen Proton CT (pCT) collaboration uses the Geant4-based program GATE for the development of its digital tracking calorimeter. It would be interesting to see a differentiated simulation being used as part of a differentiable assessment pipeline in the context of MODE, instead of simplified simulators or surrogate models. One obstacle on the way to this goal is the technical complexity associated with differentiating big and complicated software projects like Geant4 with classical source-code-based AD tools.
To reduce this complexity, we have built Derivgrind, a novel AD tool applicable to compiled programs. Under a few assumptions on how a program performs real arithmetic, users of Derivgrind only need to edit a small number of lines in the program's source code to indicate the input and output variables. In this talk, we showcase the application of Derivgrind to a downsized GATE/Geant4 setup adapted from the Bergen pCT collaboration, producing correct forward- and reverse-mode partial derivatives.
Automatic differentiation (AD) is a practical way for computing derivatives of functions that are expressed as programs. AD has been recognized as one of the key pillars of the current machine learning (ML) revolution and has key applications in domains such as finance, computational fluid dynamics, atmospheric sciences, and engineering optimization.
This talk presents a solution for implementing reverse-mode AD in Futhark: a high-level array language aimed at efficient GPU execution, in which programs are written as a nested composition of sequential loops and (explicitly) parallel constructs (e.g., map, reduce, scan, scatter).
In reverse-mode AD the original program is first executed to save all intermediate program values on a tape. The tape is subsequently used by the return sweep that computes, in reverse program order, the adjoint of each variable, i.e., the derivative of the result with respect to that variable.
The talk is intended to provide a gentle introduction to AD, and then to highlight the two key ideas that lay the foundation of our solution:
First, parallel constructs can be differentiated at a high level, by specialized rules that are often more efficient in terms of both space and time than approaches that differentiate low level code.
Second, the tape can be eliminated by re-executing the original code of a scope whenever the return sweep enters that scope. This is important because (i) it is challenging to optimize the (spatial) locality of tape accesses in the context of GPU execution, (ii) the re-execution overhead can be optimized by known compiler transformations, e.g., flattening, and (iii) re-execution enables an important space-time trade-off that can be easily exploited by the user.
Finally, we present an experimental evaluation of ten relevant AI benchmarks that demonstrates that our approach is competitive with both state-of-the-art research solutions (Enzyme, Tapenade) and with popular tensor frameworks (PyTorch, JAX) in terms of both sequential-CPU and parallel-GPU execution.
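The same two ideas can be sketched outside Futhark; the snippet below uses JAX as a stand-in (an analogy, not the Futhark implementation): reverse-mode AD applied to a map/reduce composition, with jax.checkpoint rematerializing intermediate values during the return sweep instead of storing them on a tape.

```python
# Not Futhark itself: a JAX sketch of the same two ideas discussed above.
import jax
import jax.numpy as jnp

def inner(x):
    # A "map" (elementwise transform) followed by a "reduce" (sum).
    return jnp.sum(jnp.sin(x) ** 2)

# Recompute inner's intermediates during the backward pass instead of storing
# them, analogous to re-executing a scope instead of saving a tape for it.
inner_remat = jax.checkpoint(inner)

def objective(x):
    return jnp.log(1.0 + inner_remat(x))

x = jnp.linspace(0.0, 1.0, 1_000_000)
adjoints = jax.grad(objective)(x)  # reverse-mode derivative w.r.t. every element of x
```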
Palmer House - 1 Bayard Ln, Princeton, NJ
Differentiable programming could open even more doors in HEP analysis and computing to Artificial Intelligence/Machine Learning. The most common current uses of AI/ML in HEP are deep learning networks, which provide sophisticated ways of separating signal from background, classifying physics, etc. This is only one part of a full analysis: normally skims are made to reduce dataset sizes by applying selection cuts, further selection cuts are applied, perhaps new quantities are calculated, and all of that is fed to a deep learning network. Only the deep learning network stage is optimized using the gradient descent techniques of AI/ML. Differentiable programming offers a way to optimize the full chain, including the selection cuts that occur during skimming. This contribution investigates applying selection cuts in front of a simple neural network, using differentiable programming techniques to optimize the complete chain on toy data. Several well-known problems must be solved: for example, selection cuts are not differentiable, and the interaction of a selection cut and a network during training is not well understood. This investigation was motivated by trying to automate reduced dataset skims and sizes during analysis: HL-LHC analyses have potentially multi-TB dataset sizes, and an automated way of reducing those dataset sizes and understanding the trade-offs would help the analyzer judge the balance between time, resource usage, and physics accuracy. This contribution explores the various techniques for applying a selection cut that are compatible with differentiable programming, and how to work around the issues that arise when such a cut is bolted onto a neural network.
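One commonly used work-around, shown here as a minimal sketch on toy quantities (the names and figure of merit are illustrative, not the contribution's actual method), is to relax the hard cut into a steep sigmoid so that the threshold itself receives a gradient and can be optimized together with the downstream network.

```python
# Toy sketch: a differentiable relaxation of a selection cut.
import jax
import jax.numpy as jnp

def soft_cut_weight(x, threshold, steepness=50.0):
    # ~1 for events passing the cut, ~0 otherwise, but differentiable in `threshold`.
    return jax.nn.sigmoid(steepness * (x - threshold))

def objective(threshold, scores, labels, feature):
    # Weight each event by its soft cut decision, then compute a toy
    # significance-like figure of merit from the downstream network scores.
    w = soft_cut_weight(feature, threshold)
    sig = jnp.sum(w * scores * labels)
    bkg = jnp.sum(w * scores * (1.0 - labels))
    return -sig / jnp.sqrt(bkg + 1e-6)

grad_threshold = jax.grad(objective)  # gradient of the figure of merit w.r.t. the cut value
```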
This work presents several uses of machine learning techniques, ranging from regression neural networks to convolutional neural networks and generative adversarial networks, applied to scattering muography in the context of industrial problems and applications. A review of the different techniques will be given, together with some prospects on future developments being carried out. A discussion of the advantages and drawbacks of these techniques will also be provided.
The GENETIS project aims to optimize detector designs for science outcomes. The interdisciplinary team brings particular expertise in radio applications. This student-driven project, started in 2018, has so far been optimizing antennas for the Askaryan Radio Array (ARA) and PUEO experiments to maximize the number of detections of astrophysical neutrinos via Askaryan radio emission from interactions in Antarctic ice. To date we have used genetic algorithms for these optimizations, which improve designs using principles based on biological evolution. The framework developed by GENETIS is now capable of evolving the designs of other detectors, whether or not they are based on radio techniques. I will introduce the GENETIS project, present first results, and discuss our plans for the future.
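For readers unfamiliar with the technique, the following generic genetic-algorithm loop (hypothetical fitness function and design encoding, not the GENETIS framework itself) shows how a population of antenna design vectors can be evolved by selection, crossover, and mutation.

```python
# Generic genetic-algorithm sketch with a placeholder fitness function.
import numpy as np

rng = np.random.default_rng(0)

def fitness(design):
    # Placeholder: in a detector-design context this would be a simulated
    # figure of merit, e.g. an expected neutrino detection rate.
    return -np.sum((design - 0.7) ** 2)

population = rng.uniform(0.0, 1.0, size=(50, 8))   # 50 candidate design vectors
for generation in range(100):
    scores = np.array([fitness(d) for d in population])
    parents = population[np.argsort(scores)[-10:]]  # keep the 10 fittest designs
    children = []
    for _ in range(len(population)):
        a, b = parents[rng.integers(10, size=2)]
        mask = rng.random(a.shape) < 0.5            # uniform crossover
        child = np.where(mask, a, b)
        child += rng.normal(0.0, 0.05, size=child.shape)  # random mutation
        children.append(child)
    population = np.array(children)
```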
The SWGO experiment aims at measuring ultra-high-energy gamma-ray showers with an array of water Cherenkov tanks deployed at high altitude in the southern hemisphere. A measurement of the photon flux entails the separation of hadronic backgrounds and a precise energy and position reconstruction. In this presentation we propose a method for optimizing the placement of the detector tanks on the ground, using differentiable programming techniques. A parallel study of tank design integrated in the pipeline will produce an end-to-end model suitable for working out the global configuration guaranteeing the maximum scientific output of the experiment.
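A minimal sketch of the idea, assuming a toy differentiable figure of merit in place of the full simulation-based objective, might look as follows: tank positions are treated as free parameters and updated by gradient descent.

```python
# Illustrative only (toy objective, assumed names): gradient-based optimization
# of tank positions on the ground.
import jax
import jax.numpy as jnp

def coverage(positions, shower_cores):
    # Toy differentiable proxy: reward tanks close to simulated shower cores
    # while penalizing tanks that cluster on top of each other.
    d_core = jnp.linalg.norm(positions[:, None, :] - shower_cores[None, :, :], axis=-1)
    d_tank = jnp.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=-1)
    return jnp.sum(jnp.exp(-d_core / 50.0)) - jnp.sum(jnp.exp(-d_tank / 5.0))

positions = jax.random.uniform(jax.random.PRNGKey(0), (100, 2), minval=0.0, maxval=500.0)
cores = jax.random.uniform(jax.random.PRNGKey(1), (200, 2), minval=0.0, maxval=500.0)

grad_fn = jax.grad(lambda p: -coverage(p, cores))
for _ in range(200):
    positions = positions - 1.0 * grad_fn(positions)  # plain gradient descent on the layout
```

In a realistic pipeline the toy `coverage` function would be replaced by a differentiable surrogate of the experiment's science figure of merit.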
In this talk, we present our efforts in supporting Automatic Differentiation (AD) in RooFit, a toolkit for statistical modeling and fitting, part of ROOT, that is used by many HEP/NP experiments. The new AD backend improves both the performance and numeric stability of likelihood minimizations, for which we will provide several examples in this contribution. Our approach is to extend RooFit with a tool that generates overhead-free C++ code for a full likelihood function built from RooFit functional models. Gradients are then generated from this C++ code using Clad, a compiler-based source-code-transformation AD tool. After presenting promising results from a proof-of-concept with this pipeline applied to a HistFactory model at the ACAT 2022 conference, we showcased more general benchmarks on the full minimization pipeline at CHEP 2023. In this workshop, we present how AD can be applied to production workflows in the field of HEP/NP. We also demonstrate that AD is the prime choice for workflows with many parameters, yielding lower minimization times and faster overall fit convergence thanks to fewer fit iterations and improved accuracy of the calculated gradients.
Clad enables automatic differentiation (AD) for C++. It is built on the LLVM compiler infrastructure and works as a plugin for the Clang compiler. Clad is based on source code transformation: given the C++ source code of a mathematical function, it can automatically generate C++ code for computing derivatives of the function. Clad supports a large set of C++ features, including control flow statements and function calls. It supports reverse-mode AD (a.k.a. backpropagation) as well as forward-mode AD. It also facilitates the computation of the Hessian and Jacobian matrices of arbitrary functions.
In this talk we describe the programming model that Clad enables. We explain the benefits of using transformation-based automatic differentiation in high-performance static languages such as C++. We show examples of how to use the tool at scale.
Long-lived hadrons have different cross sections with nuclear matter, and they give rise to different reactions when they interact in dense media. Until now, calorimeters have not been designed to exploit these differences for particle identification; yet that information would be highly beneficial in detectors at future facilities.
In this presentation we will explore the observable features of showers initiated by different hadron species in a very high-granularity, homogeneous GEANT4-simulated calorimeter, to determine the level at which such particle ID information is extractable.
In this work, we use machine learning to optimize the design of a hadronic calorimeter for the upcoming Electron-Ion Collider, to be built at Brookhaven National Laboratory on Long Island over the next decade. We use a full GEANT4 simulation of the calorimeter to train surrogate models that are conditioned on the design parameters. In particular, we use a deep neural network trained to predict the truth energy of incoming particles, conditioned on the longitudinal segmentation of the calorimeter. We can then perform gradient-based optimization to find the optimal set of detector parameters. We show how the detector parameters can be optimized in terms of the predicted, or reconstructed, energy resolution, and compare the performance of our model to standard reconstruction techniques.
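Schematically, and with a hypothetical surrogate standing in for the trained model from this work, the optimization loop can be sketched as follows: a network conditioned on the detector parameters predicts the reconstructed energy, and the gradient of the resulting resolution with respect to those parameters drives the design update.

```python
# Schematic sketch (hypothetical surrogate and parameter names).
import jax
import jax.numpy as jnp

def surrogate(weights, shower_features, detector_params):
    # Stand-in for a network trained on full GEANT4 simulation, conditioned
    # on detector parameters such as the longitudinal segmentation.
    x = jnp.concatenate([shower_features, detector_params])
    h = jax.nn.relu(x @ weights["w1"] + weights["b1"])
    return h @ weights["w2"] + weights["b2"]  # predicted energy

def resolution(detector_params, weights, showers, true_energies):
    preds = jax.vmap(lambda s: surrogate(weights, s, detector_params))(showers)
    rel = (preds.squeeze() - true_energies) / true_energies
    return jnp.std(rel)  # relative energy resolution as the design objective

grad_params = jax.grad(resolution)  # d(resolution)/d(detector parameters), used for the design update
```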
Louis A. Simpson International Building - Atrium
The development and application of Muon Tomography techniques requires the production of considerable amounts of simulation data, usually generated with complex and slow particle simulation software. In this talk, we explore the use of Generative Adversarial Networks (GANs) as a way of generating simulation data for muon tomography applications in a faster and less computationally expensive way.
A deep neural network for the Muography Observatory System was developed to distinguish high energy muons (> 3 GeV) from low energy ones. A Geant4-based Monte Carlo simulation was written to provide training samples for the neural network. The simulation was validated by measurements taken at the Sakurajima volcano. In order to understand how the neural network works, a game-theoretic approach was used to explain the output of our machine learning model. SHAP values made it possible to understand which input parameters are the most important for the network on a case-by-case basis. The performance of the machine learning algorithm was compared to the classical tracking solution.
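As a minimal sketch of the attribution step, using the shap package with a stand-in model and made-up feature dimensions (not the analysis code), per-event Shapley values indicate which inputs drove each high- versus low-energy decision.

```python
# Illustrative use of SHAP with a placeholder classifier.
import numpy as np
import shap

def model_predict(x):
    # Stand-in for the trained muon-energy classifier; returns P(E > 3 GeV).
    return 1.0 / (1.0 + np.exp(-x.sum(axis=1)))

background = np.random.normal(size=(100, 8))   # reference sample of detector features
events = np.random.normal(size=(10, 8))        # events to explain

explainer = shap.KernelExplainer(model_predict, background)
shap_values = explainer.shap_values(events)    # one attribution per feature, per event
```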
The Geant4 particle transport simulation toolkit, widely used in high energy and nuclear physics, biomedical, space science, and other applications, will be introduced briefly. The main concepts, design choices, and interfaces that enable the simulation of the passage of different particles through complex geometrical setups, while modelling their interactions with rather diverse characteristics, will be highlighted. The goal of the presentation is to explore possible strategies for differentiability as well as to establish a direct connection between the Geant4 developers and the differentiable programming experts engaged with the HEP community.
Simulating high-resolution detector responses is a storage-costly and computationally intensive process that has long been challenging in particle physics. Despite the ability of deep generative models to make this process more cost-efficient, ultra-high-resolution detector simulation still proves to be difficult as it contains correlated and fine-grained mutual information within an event. To overcome these limitations, we propose the Intra-Event Aware GAN (IEA-GAN), a novel fusion of Self-Supervised Learning and Generative Adversarial Networks. IEA-GAN presents a Relational Reasoning Module that approximates the concept of an "event" in detector simulation, allowing for the generation of correlated layer-dependent contextualized images for high-resolution detector responses with a proper relational inductive bias. IEA-GAN also introduces a new intra-event aware loss and a Uniformity loss, resulting in significant enhancements to image fidelity and diversity. We demonstrate IEA-GAN's application in generating sensor-dependent images for the high-granularity Pixel Vertex Detector (PXD), with more than 7.5M information channels and a non-trivial geometry, at the Belle II Experiment. Applications of this work include controllable simulation-based inference and event generation, high-granularity detector simulation such as at the HL-LHC (High Luminosity LHC), and fine-grained density estimation and sampling. To the best of our knowledge, IEA-GAN is the first algorithm for faithful ultra-high-resolution detector simulation with event-based reasoning.
The Deep Underground Neutrino Experiment (DUNE) will use an intense neutrino beam created in Illinois and sent through the Earth to a large liquid argon detector in South Dakota. The neutrino beam, part of the Long Baseline Neutrino Facility (LBNF), will consist of a 120 GeV proton beam which will impinge on a long graphite target. Mesons produced in the target will be focused by three magnetic horns and will decay to neutrinos in a 200 m long decay pipe. The design of the target/horn system was optimized using a genetic algorithm. This optimization will be discussed, as well as other ongoing and future design optimizations within the DUNE collaboration.
Liquid argon time projection chambers (LArTPCs) play a crucial role in current and upcoming particle detection experiments, offering exceptional tracking and calorimetric capabilities.
To enhance the accuracy of detector simulations and enable realistic physics analyses, the particle physics community has focused on refining simulators through dedicated calibration measurements. However, the entanglement of the various detector models poses a challenge to achieving optimal accuracy. In this presentation, a novel approach will be introduced: a differentiable LArTPC simulator that allows for gradient-based calibration of multiple detector parameters simultaneously. With this method, physics information can be extracted directly from calibration fits, with a comprehensive treatment of all correlations between parameters that were previously hard to access. While the code is configurable to adapt to multiple geometries, the studies presented here use the DUNE ND as a first case study. The process of developing a differentiable simulator through the transformation of a standard simulation tool into a differentiable framework will be presented, discussing its advantages and limitations and addressing the obstacles encountered in preserving physics quality while extracting meaningful gradient information.
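A highly simplified sketch of such a gradient-based calibration, with a toy two-parameter response model standing in for the differentiable LArTPC simulator, could look like this:

```python
# Schematic only (toy "simulator" and parameter names): fitting several
# detector parameters at once by descending the mismatch between a
# differentiable simulation and calibration data.
import jax
import jax.numpy as jnp

def differentiable_sim(params, drift_distances):
    # Toy stand-in for the detector response: charge after electron-lifetime
    # attenuation and a constant recombination-like scale factor.
    lifetime, scale = params
    return scale * jnp.exp(-drift_distances / lifetime)

def calibration_loss(params, drift_distances, measured_charge):
    predicted = differentiable_sim(params, drift_distances)
    return jnp.mean((predicted - measured_charge) ** 2)

params = jnp.array([2.0, 1.0])            # initial guesses for (lifetime, scale)
drift = jnp.linspace(0.1, 3.0, 500)
data = 0.8 * jnp.exp(-drift / 2.6)        # pseudo calibration data

grad_loss = jax.grad(calibration_loss)
for _ in range(500):
    params = params - 0.1 * grad_loss(params, drift, data)  # joint gradient-based fit
```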