An IRIS-HEP Blueprint Workshop

Differentiable Analysis Blueprint

Timezone: Europe/Berlin
4/3-006 - TH Conference Room (CERN)

Organisers: Lino Oscar Gerlach (Princeton University (US)), Mohamed Aly (Princeton University (US))
Description

This IRIS-HEP Blueprint workshop brings together the HEP community to review the current use of automatic differentiation (AD) in analysis workflows, identify concrete use cases, and discuss future directions for its development and adoption. The scope extends beyond machine learning models to include full analysis pipelines, statistical inference, systematic uncertainty treatment, and experiment-scale software frameworks.

The agenda combines short overview talks with focused discussion sessions to examine the structure of differentiable analysis workflows, highlight conceptual and technical challenges, and identify areas where common abstractions, interfaces, and standards would be beneficial across experiments.

The outcome of the workshop will be a blueprint document summarizing the discussions and outlining near-term priorities and longer-term directions for automatic differentiation in HEP, with the aim of improving coordination between existing efforts and supporting realistic use in LHC analyses.

This event is being organised by the Institute for Research and Innovation in Software (IRIS-HEP) with support from National Science Foundation Cooperative Agreement OAC-2323298.
  • Thursday 5 March
    • 1
      Welcome & Introduction to Workshop
      Speakers: Lino Oscar Gerlach (Princeton University (US)), Mohamed Aly (Princeton University (US))
    • Foundations and Scope: Talks
      • 2
        Automatic Differentiation in HEP and Differentiable Systematics [30' + 15']
        Speaker: Lukas Alexander Heinrich (Technische Universität München (DE))
      • 3
        Differentiating Discrete Operations [10' + 5']

        Discrete decisions arise in simulators and analysis pipelines across disciplines such as biophysics, robotics, and HEP. Because these operations are inherently non-differentiable, the machine learning community has developed a range of methods to estimate gradients through them. In this talk, I outline why a statistical perspective on gradient estimation is essential in this setting and give a brief overview of existing approaches for handling discrete decisions.

        Speaker: Annalena Kofler (Technical University Munich)
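As a minimal illustration of the statistical perspective the abstract mentions, the sketch below implements a score-function (REINFORCE) gradient estimator for a single Bernoulli decision. The probability, yields, and sample size are invented for this toy and are not taken from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: a downstream quantity depends on a hard binary decision
# b ~ Bernoulli(p), e.g. "does the event enter a given category?".
p = 0.4
f1, f0 = 3.0, 1.0           # objective value when b = 1 / when b = 0
N = 200_000

b = rng.random(N) < p       # sample the discrete decisions
fb = np.where(b, f1, f0)

# Score-function (REINFORCE) estimator:
#   d/dp E[f(b)] = E[ f(b) * d/dp log p(b) ],
# with d/dp log p(b) = 1/p for b = 1 and -1/(1 - p) for b = 0.
score = np.where(b, 1.0 / p, -1.0 / (1.0 - p))
grad_est = float(np.mean(fb * score))

# For a Bernoulli the exact gradient is f(1) - f(0) = 2.0, so grad_est
# should agree within Monte Carlo uncertainty.
```

The estimator is unbiased but has a variance that grows as p approaches 0 or 1, which is one reason a statistical treatment of these estimators matters in practice.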
    • Foundations and Scope: Discussion
      • 4
        Discussion
    • 15:50
      Break
    • Conceptual Challenges: Talks
      • 5
        Everything that can go wrong [10' + 5']

        A journey through the pitfalls I discovered while building an autodiff workflow for an ATLAS search.

        Speaker: Frederic Renner (Deutsches Elektronen-Synchrotron (DE))
      • 6
        Adoption of Automatic Differentiation in HEP [10' + 5']

        How and where can AD help in HEP? What do we need in order to take advantage of it, and how does it compare to the approaches used in previous decades? This talk presents a brief look at these questions.

        Speaker: Alexander Held (University of Wisconsin Madison (US))
    • Conceptual Challenges: Discussion
      • 7
        Discussion
    • 17:20
      Break
    • Technical Landscape: Talks
      • 8
        Tooling for Differentiable HEP [10' + 5']
        Speaker: Lino Oscar Gerlach (Princeton University (US))
      • 9
        Differentiable Binning Optimization [10' + 5']

        Categorizing events using discriminant observables is central to many high-energy physics analyses. Yet, bin boundaries are often chosen by hand. A simple, popular choice is to apply argmax projections of multi-class scores and equidistant binning of one-dimensional discriminants.

        This talk presents binning optimization for signal significance directly in multi-dimensional discriminants using a differentiable approach. We use a Gaussian Mixture Model to define flexible bin boundary shapes for multi-class scores, while in one dimension (binary classification), we move bin boundaries directly. The performance is evaluated on a toy binary classification example and on a three-class problem with two signal processes and one background.

        Speaker: Nitish Kasaraguppe (RWTH Aachen (DE))
      • 10
        Awkward Arrays and JAX [10' + 5']

        High-energy physics (HEP) relies on nested, variable-length (“ragged”) data structures that do not align naturally with the static, rectangular tensor abstractions assumed by most GPU compiler stacks. While Awkward Array provides a NumPy-like interface for such data, integrating it into the JAX ecosystem exposes an architectural mismatch between dynamic, offset-based structures and JAX’s static-shape XLA compilation model.

        This presentation evaluates the current state of the Awkward–JAX backend and examines why JAX’s tracing model remains a significant hurdle. The cumulative overhead of tracing, PyTree transformations, compilation latency, and kernel dispatch—the effective “XLA tax”—often negates expected GPU speedups for realistic jagged workloads.

        The talk revisits an alternative autodiff architecture based on eager, complex-step differentiation. By leveraging Awkward’s complex-valued kernels and avoiding external tracing systems, this approach could provide near machine-precision forward-mode derivatives that are Numba-compatible and independent of static shape constraints. The central question is whether such an internal, eager approach is a viable long-term path for autodiff in HEP, or whether the architectural mismatch with JAX represents a fundamental barrier for dynamic data.

        Speaker: Ianna Osborne (Princeton University)
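The complex-step technique the abstract refers to can be demonstrated in a few lines of plain NumPy; the function below is an arbitrary analytic example, not an Awkward kernel.

```python
import numpy as np

def f(x):
    # Any analytic function implemented with complex-capable operations.
    return np.sin(x) * np.exp(-x**2)

def complex_step_derivative(f, x, h=1e-20):
    # Evaluate f at x + i*h; Im(f)/h is df/dx to near machine precision,
    # with no subtractive cancellation (unlike finite differences).
    return np.imag(f(x + 1j * h)) / h

x = 0.5
exact = np.cos(x) * np.exp(-x**2) - 2 * x * np.sin(x) * np.exp(-x**2)
approx = complex_step_derivative(f, x)
```

Because the step h never appears in a difference of nearly equal numbers, it can be taken absurdly small, which is what gives the forward-mode derivatives their near machine precision.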
    • Technical Landscape: Discussion
      • 11
        Discussion
    • 12
      Closing and Summary
      Speakers: Lino Oscar Gerlach (Princeton University (US)), Mohamed Aly (Princeton University (US))
  • Friday 6 March
    • 13
      Introduction to the day
      Speakers: Lino Oscar Gerlach (Princeton University (US)), Mohamed Aly (Princeton University (US))
    • Technical Landscape: Talks II
      • 14
        Differentiable Statistical Ecosystem with JAX [10' + 5']

        For end-to-end optimization, the backpropagation of gradients through a HEP analysis begins at its last step: the statistical measurement. Fully differentiable statistical tools are therefore essential for computing gradients with respect to the final physics measurement. This contribution gives an overview of how such tools can be built with the JAX ecosystem.

        Speaker: Peter Fackeldey (Princeton University (US))
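As a minimal sketch of a differentiable statistical measurement in JAX (a one-bin counting experiment, not the actual tools the talk covers), the yields and observed count below are invented:

```python
import jax
import jax.numpy as jnp

s, b = 5.0, 50.0      # expected signal and background yields (toy values)
n_obs = 58.0          # observed events

def nll(mu):
    # Poisson negative log-likelihood for a one-bin counting experiment,
    # with the constant log(n!) term dropped; lam = mu*s + b.
    lam = mu * s + b
    return lam - n_obs * jnp.log(lam)

grad_nll = jax.grad(nll)

# The maximum-likelihood estimate is mu_hat = (n_obs - b) / s,
# where the gradient of the NLL vanishes.
mu_hat = (n_obs - b) / s
```

With the likelihood expressed this way, `jax.grad` can be chained through every upstream analysis step, which is exactly the entry point for end-to-end optimization.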
      • 15
        Differentiable Programming in C++ and ROOT with Clad [25' + 5']

        In this presentation, we make the case for differentiable programming in C++ for High Energy Physics. We first introduce source-code-transformation-based automatic differentiation (AD) with Clad, a Clang compiler plugin. We then present success stories of its use for statistical analysis in ROOT, including differentiating through statistical likelihoods in RooFit and through neural-network inference with TMVA SOFIE. Finally, we report on toy studies of end-to-end differentiable analysis pipelines and discuss how such studies could guide differentiable-algorithm development and help identify the “killer app” for differentiable programming in HEP.

        Speakers: Jonas Rembser (CERN), Vassil Vasilev (Princeton University (US))
    • Technical Landscape: Discussion
      • 16
        Discussion
    • 15:25
      Break
    • Beyond LHC: Talks
      • 17
        Differentiable Detector Simulation with GEANT [15' + 5']

        Applying automatic differentiation (AD) to particle simulations such as Geant4 opens the possibility of gradient-based optimization for detector design and parameter tuning in high-energy physics. We extend our previous work on differentiable Geant simulations by incorporating multiple Coulomb scattering into the physics model, moving closer to realistic detector modeling. The inclusion of multiple scattering introduces substantial challenges for differentiation, due to increased stochasticity. We study these effects in detail and demonstrate stable derivatives of a Geant simulation with full EM physics, performing gradient-based optimization of a realistic sampling calorimeter. In this talk, we will highlight results so far in differentiable Geant, discuss lessons learned, and provide an outlook for further studies.

        Speaker: Jeffrey Krupa (SLAC)
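A toy illustration of how gradients can be taken through a stochastic simulation step, here via the reparameterization trick rather than Geant4's actual machinery; the "thickness" model and all numbers are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
L = 2.0          # lever arm to a downstream plane (toy)
t = 0.25         # toy "material thickness" parameter
N = 100_000

# Reparameterization: draw eps ~ N(0,1) once and express the scattering
# angle as a deterministic, differentiable function theta(t, eps).
eps = rng.standard_normal(N)

def displacement_sq(t, eps):
    theta = np.sqrt(t) * eps          # toy scattering-angle model
    return (L * theta) ** 2           # squared lateral displacement

x2_mean = displacement_sq(t, eps).mean()

# Pathwise derivative d/dt of each sample, holding eps fixed:
#   d/dt (L * sqrt(t) * eps)^2 = L^2 * eps^2
pathwise = (L ** 2) * eps ** 2
grad_est = float(pathwise.mean())     # estimates d/dt E[x^2]
exact = L ** 2                        # since E[x^2] = L^2 * t
```

Fixing the noise and differentiating the deterministic map is what makes derivatives of a stochastic simulation stable; the increased stochasticity from multiple scattering makes exactly this step harder in the real Geant setting.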
      • 18
        End-to-End Differentiable Analysis in IceCube [20' + 10']

        Measurements of the astrophysical neutrino flux with the IceCube Neutrino Observatory traditionally rely on binned forward-folding likelihood analyses. These methods require Monte Carlo simulations to predict event distributions. Limited Monte Carlo statistics restrict the dimensionality of the binning and therefore the amount of exploitable information.
        This talk presents a fully differentiable analysis framework that enables end-to-end optimization of summary statistics, combining arbitrarily many input variables to improve the sensitivity of the analysis. As a demonstration, the method is applied to the measurement of the neutrino flux from the Galactic Plane.

        Speaker: Oliver Janik (FAU Erlangen-Nürnberg (DE))
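One common building block for such end-to-end differentiable analyses is a soft ("binned KDE") histogram, in which hard bin counts are replaced by smooth kernel masses. The sketch below (not IceCube code; data, binning, and bandwidth are invented) differentiates a bin count with respect to an upstream shift parameter.

```python
import jax
import jax.numpy as jnp

key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (2000,))          # toy reconstructed variable
edges = jnp.linspace(-3.0, 3.0, 7)           # 6 bins

def soft_hist(x, shift, bandwidth=0.1):
    # Each event contributes the probability mass of a narrow Gaussian
    # inside each bin, so the counts are smooth functions of upstream
    # parameters such as `shift`.
    cdf = jax.scipy.stats.norm.cdf(edges[None, :],
                                   loc=x[:, None] + shift,
                                   scale=bandwidth)
    return (cdf[:, 1:] - cdf[:, :-1]).sum(axis=0)

counts = soft_hist(x, 0.0)

# Gradient of the first bin's count w.r.t. the shift parameter: shifting
# the data to the right drains the leftmost bin, so this is negative.
g = jax.grad(lambda s: soft_hist(x, s)[0])(0.0)
```

Because the bin contents stay differentiable, the same machinery lets gradients flow from a binned forward-folding likelihood back into the summary statistic being optimized.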
      • 19
        Automatic Differentiation Beyond HEP [20' + 10']

        Differentiable programming is advancing scientific computing by enabling gradients to flow through complex numerical models. In spaceflight mechanics, a field governed by nonlinear dynamics, uncertainty, and strict operational constraints, this approach opens new avenues for optimization, state estimation, uncertainty quantification, and decision-making.
        In this talk, I will present our recent research applying differentiable programming to astrodynamics. We combine low- and high-order automatic differentiation (AD) across multiple contexts: from physics-based modelling and continuous refinements using NeuralODEs, to propagating uncertainties via truncated Taylor polynomials. Low-order AD computes gradients efficiently for machine learning tasks, physics-based modelling, and NeuralODE refinements. High-order derivatives, obtained via variational equations, provide coefficients for state transition tensors (STTs) and event transition tensors (ETTs), enabling accurate representation of solution flows and events. These high-order tools allow non-Gaussian uncertainty propagation and analytical approximations of high-order statistical moments with orders-of-magnitude fewer computations than traditional Monte Carlo simulations.

        I will illustrate these techniques with applications in spaceflight mechanics: low-order AD for thermosphere density modelling, irregular-silhouette modelling, and differentiable orbit propagators, as well as high-order AD for uncertainty quantification in mission analysis and in guidance, navigation, and control. These approaches demonstrate the potential of differentiable programming for complex, high-dimensional physical systems.

        Speaker: Giacomo Acciarini (European Space Agency (ESA))
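The state transition tensors mentioned above can be illustrated with nested forward-mode AD in JAX; the one-step map below is a toy stand-in for an integrated trajectory, with invented dynamics.

```python
import jax
import jax.numpy as jnp

def flow(x0):
    # Toy one-step "flow map": where a state (position, velocity) ends up
    # after one nonlinear update. A real application would integrate the
    # equations of motion instead.
    x, v = x0
    return jnp.array([x + 0.1 * v, v - 0.1 * x ** 2])

x0 = jnp.array([1.0, 0.5])
stm = jax.jacfwd(flow)(x0)               # state transition matrix (1st order)
stt = jax.jacfwd(jax.jacfwd(flow))(x0)   # 2nd-order state transition tensor
```

Contracting `stt` with an initial covariance is the basic operation behind the non-Gaussian uncertainty propagation described in the abstract; higher orders follow by nesting further `jacfwd` calls (or using Taylor-mode AD).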
    • Beyond LHC: Discussion
    • 20
      Closing, Call for Action and Future Steps
      Speakers: Lino Oscar Gerlach (Princeton University (US)), Mohamed Aly (Princeton University (US))