An IRIS-HEP Blueprint Workshop

Simulation-Based Inference Blueprint

13/2-005, CERN (Europe/Berlin)
Organisers: Jay Ajitbhai Sandesara (University of Wisconsin Madison (US)), Nick Smith (Fermi National Accelerator Lab. (US))
Description

The 1st Simulation-Based Inference Blueprint workshop will take place on 26 and 27 February 2026 at CERN. The workshop aims to summarize the current status of statistical techniques and software tooling for Simulation-Based Inference (SBI) at the LHC, and to identify both short-term and long-term goals for projects in this area. A key objective is to bring together the LHC community to map out the main stages of an SBI analysis and to establish shared abstractions, requirements, and broadly applicable standards and validations for the statistical model used in SBI across experiments.

The primary outcome of the workshop will be a blueprint write-up documenting the discussions, capturing the community-agreed inter-experimental standards and conventions, and outlining a roadmap for near-term priorities as well as longer-term strategies for SBI in high-energy physics. We also hope the workshop will facilitate new collaborations in SBI toolkit development by connecting existing efforts and helping coordinate work toward common goals.

Registrations are now open. Note that in-person registration closes on 15 February.

This event is being organised by the Institute for Research and Innovation in Software (IRIS-HEP) with support from National Science Foundation Cooperative Agreement OAC-2323298.
  • Thursday 26 February
    • 1
      Welcome and Introduction
      Speakers: Jay Ajitbhai Sandesara (University of Wisconsin Madison (US)), Nick Smith (Fermi National Accelerator Lab. (US))
    • 2
      Simulation-Based Inference — An Introduction
      Speaker: Gilles Louppe
    • 3
      Discussion
    • Simulation-Based Inference at the LHC
    • 12:00
      Lunch break
    • Statistical Modeling (Systematic Uncertainties)
    • 15:00
      Coffee break
    • Neural Networks For Likelihood (Ratio) Estimation
      • 11
        Direct Likelihood Learning and Uncertainty Profiling with Normalizing Flows

        Recent advances in Simulation-Based Inference (SBI) often rely on training classifiers to approximate likelihood ratios. However, direct density estimation using Normalizing Flows offers distinct advantages, particularly in the flexibility of the learned statistical model. In this presentation, we explore the use of Normalizing Flows to learn the likelihood function directly to infer physics parameters of interest. Crucially, we address the integration of systematic uncertainties, which is often the bottleneck in precision analyses. We introduce a Factorizable Normalizing Flow architecture that directly learns the conditional dependence of the data on nuisance parameters. This allows us to model multiple systematic effects simultaneously without a combinatorial explosion in training cost, enabling efficient profiling without the need for interpolation between alternative likelihood-ratio models. (A schematic formulation of flow-based likelihood learning is sketched after this entry.)

        Speaker: Davide Valsecchi (ETH Zurich (CH))
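        For orientation (a schematic formulation added by the editor, not necessarily the speaker's exact architecture): a conditional normalizing flow $f_{\mu,\nu}$ provides the per-event density via the change-of-variables formula, and nuisance parameters $\nu$ are profiled in the resulting unbinned likelihood ratio,

        $$\hat{p}(x \mid \mu, \nu) = p_z\!\left(f^{-1}_{\mu,\nu}(x)\right)\left|\det \frac{\partial f^{-1}_{\mu,\nu}(x)}{\partial x}\right|, \qquad \lambda(\mu) = \frac{\sup_{\nu}\prod_i \hat{p}(x_i \mid \mu, \nu)}{\sup_{\mu',\nu'}\prod_i \hat{p}(x_i \mid \mu', \nu')}.$$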
      • 12
        Neural (Quasi-)Probabilistic Likelihood Ratio Estimation
        Speakers: Matthew Drnevich (New York University (US)), Stephen Jiggins (Deutsches Elektronen-Synchrotron (DE))
      • 13
        Discussion
  • Friday 27 February
    • Toolkits for Simulation-Based Inference
      • 14
        Toolkit for Simulation-Based Inference (IRIS-HEP)
        Speaker: Jay Ajitbhai Sandesara (University of Wisconsin Madison (US))
      • 15
        Q&A
      • 16
        NEEDLE: Workflow Orchestration for Large-Scale NSBI Deployment

        Neural Simulation-Based Inference (NSBI) is a fast-moving field with many ongoing efforts to share these promising methods with the wider HEP community. Whilst the specific NSBI methods that will eventually find adoption in actual analyses are not yet known, it is clear that most approaches face a set of common challenges. Foremost, the reliance on many large neural networks trained on large datasets to provide bias-free estimates leads to a considerable computational demand compared to traditional binned approaches.
        First, the NEEDLE project aims to provide a flexible framework that takes care of the common boilerplate shared by NSBI tools. This includes standardized model management with PyTorch and Lightning, flexible task-based orchestration with law, and experiment tracking. In addition, ready-to-use data ingestion modules written with dask allow for out-of-core (larger-than-memory) computations for both ROOT and Parquet data formats. These modules can be used together in a complete workflow or individually to accommodate existing code. The goal is to facilitate the deployment of NSBI methods without imposing strong constraints on the specific implementations. (An illustrative ingestion sketch follows this entry.)
        Second, alongside the orchestration infrastructure, a companion benchmarking library is being developed to systematically evaluate generative models on toy datasets. Its goal is to help establish best practices for the selection and configuration of density estimators. The investigation of powerful and expressive generative models is motivated by their increasing adoption for NSBI in fields such as cosmology and astroparticle physics.
        In this contribution, we present an overview of how NEEDLE will provide a standard and flexible framework for NSBI deployment on HPC with minimal constraints, together with a toolbox of new generative models.

        Speakers: Kylian Schmidt (KIT - Karlsruhe Institute of Technology (DE)), Levi Evans (Deutsches Elektronen-Synchrotron (DE))
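        As a rough illustration of the kind of ingestion boilerplate described above (a hypothetical sketch, not NEEDLE's actual API; file and column names are made up), a dask-based loader can stream a larger-than-memory Parquet dataset into a PyTorch training loop partition by partition:

        import dask.dataframe as dd
        import torch

        def iter_batches(path, columns, batch_rows=65_536):
            """Stream a larger-than-memory Parquet dataset, one partition at a time."""
            ddf = dd.read_parquet(path, columns=columns)   # lazy, out-of-core
            for part in ddf.to_delayed():                  # one delayed object per partition
                frame = part.compute()                     # materialise a single partition
                tensor = torch.tensor(frame.to_numpy(), dtype=torch.float32)
                for start in range(0, len(tensor), batch_rows):
                    yield tensor[start:start + batch_rows]

        # Minimal training loop consuming the streamed batches (placeholder model and loss).
        model = torch.nn.Sequential(torch.nn.Linear(4, 64), torch.nn.ReLU(), torch.nn.Linear(64, 1))
        optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
        for batch in iter_batches("events.parquet", columns=["pt1", "pt2", "eta1", "eta2"]):
            optimizer.zero_grad()
            loss = model(batch).pow(2).mean()              # placeholder objective
            loss.backward()
            optimizer.step()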
      • 17
        Q&A
      • 18
        SBI Toolkit Modification and Robust Uncertainty Quantification

        We are exploring the application of the IRIS-HEP Simulation-Based Inference (SBI) toolkit to precision Higgs measurements. In this talk, we discuss methodological developments aimed at improving the robustness of SBI workflows in realistic LHC settings.

        On the tooling side, we explore physics-informed inductive biases in neural architectures, energy-conserving optimization schemes as alternatives to Adam, pre-training strategies for improving computing efficiency, and application of SBI to precision cross-section measurements (e.g. OmniFold). On the modeling side, we discuss strategies to mitigate Monte Carlo statistical uncertainties through the wifi ensembling approach.

        The goal is to identify practical improvements that strengthen SBI applications in precision LHC physics and to foster collaboration between methodological and experimental communities.

        Speaker: Jingjing Pan (KIT - Karlsruhe Institute of Technology (DE))
      • 19
        Broader Discussion
    • 12:00
      Lunch break
    • SBI with semi-parameterized Density Ratios
      • 20
        Building summaries with event2vec

        In this work, we introduce machine learning techniques for training sensitive vector representations, or vector summaries, of collider events. The vector summaries of the individual events in a dataset can be directly summed and analyzed further. For EFT searches, our approach provides a powerful and convenient middle ground between traditional histogram-based analyses and SBI analyses based on learned parameterized likelihood ratios. We demonstrate the techniques with examples and discuss the potential advantages over alternative analysis strategies. (The additive-summary idea is sketched after this entry.)

        Speakers: Nick Smith (Fermi National Accelerator Lab. (US)), Prasanth Shyamsundar (Fermi National Accelerator Laboratory)
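        One reading of the additive-summary idea (a schematic interpretation, not necessarily the speakers' exact construction): a learned per-event embedding $\phi$ is summed over the dataset, and inference proceeds on the dataset-level summary,

        $$\mathbf{S}(\mathcal{D}) = \sum_{i=1}^{N} \phi(x_i), \qquad \hat{\theta} = \arg\max_{\theta}\, \log p\big(\mathbf{S}(\mathcal{D}) \mid \theta\big),$$

        so that, like histogram bin counts, the summary is additive over events while retaining learned sensitivity to the parameters of interest.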
      • 21
        Parametrized Optimal Observable Approach to NSBI

        The parametrized optimal observable approach is a binned approximation to the full NSBI formalism introduced in [Rep. Prog. Phys. 88 (2025) 067801]. We present the method, highlighting its advantages and limitations. We will show a practical implementation of the parametrized optimal observable formalism in RooFit and demonstrate how it can be used to construct Asimov datasets and to perform a Neyman construction. We will discuss the challenges faced when trying to implement these ideas with the analysis presented in [Rep. Prog. Phys. 88 (2025) 057803] and ways we found to circumvent them.

        Speaker: Matthew Kenneth Maroun (University of Massachusetts (US))
      • 22
        Tooling issues for binned nSBI analyses with pyhf and how to overcome them

        Neural estimates of likelihood ratios provide a powerful approach to extending sensitivity across wide regions of phase space, but their integration into full HEP analyses presents significant technical challenges. The computational cost of unbinned neural simulation-based inference (nSBI) can be reduced by performing binned fits using optimal observables, whilst still retaining the benefits of a parameterised observable. However, even for this approach, commonly used statistical tools such as QuickStats and pyhf introduce practical limitations. In this work, we identify the key technical hurdles encountered in binned nSBI analyses and demonstrate solutions that enable robust and fast-turnaround statistical inference. (A baseline binned pyhf fit is sketched after this entry for context.)

        Speaker: Malin Elisabeth Horstmann (Technische Universitat Munchen (DE))
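        For context, the baseline binned workflow that such analyses build on can be reproduced in a few lines of pyhf; the yields below are made-up placeholders, and this sketch does not address the nSBI-specific limitations the talk discusses:

        import pyhf

        # Toy three-bin model: yields per bin of a (hypothetical) neural optimal observable.
        model = pyhf.simplemodels.uncorrelated_background(
            signal=[5.0, 10.0, 7.0],
            bkg=[50.0, 60.0, 40.0],
            bkg_uncertainty=[5.0, 6.0, 4.0],
        )
        observations = [53.0, 72.0, 49.0]
        data = observations + model.config.auxdata

        # Maximum-likelihood fit of the signal strength and nuisance parameters.
        bestfit = pyhf.infer.mle.fit(data, model)
        print("fitted signal strength:", bestfit[model.config.poi_index])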
      • 23
        Discussion
    • 15:00
      Coffee break
    • 24
      Scaling Neural Simulation-Based Inference on High-Performance Computing for LHC Analysis

      This talk will present preliminary work on scaling machine learning training and hyperparameter optimization (HPO) on high-performance computers. We leveraged the PyTorch Lightning framework for distributed training and Ray Tune for HPO. In addition, the ML training framework automatically monitors the model's physics performance by evaluating Neural Simulation-Based Inference (NSBI) calibration and closure metrics using goodness-of-fit measures such as $\chi^2$ and Wasserstein distances. (A minimal example of such closure metrics follows this entry.)
      We plan to add NSBI-oriented functionality to the framework, including wifi ensembling, ensembling, and inference within the HPO loop. The framework currently focuses on ML training, but we plan to add downstream tasks such as unbinned fits.

      Speakers: Walter Hopkins (Argonne National Laboratory (US)), Xiangyang Ju (Lawrence Berkeley National Lab. (US))
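      As a rough illustration of the kind of closure check mentioned above (a toy sketch with synthetic Gaussian samples; the real framework's metrics and samples differ), one can compare a reference sample reweighted by a likelihood ratio against a sample drawn at the target hypothesis using a Wasserstein distance and a binned $\chi^2$:

      import numpy as np
      from scipy.stats import chisquare, wasserstein_distance

      rng = np.random.default_rng(0)
      nominal = rng.normal(0.0, 1.0, 100_000)      # stand-in for the reference sample
      target = rng.normal(0.2, 1.0, 100_000)       # stand-in for the target-hypothesis sample
      weights = np.exp(0.2 * nominal - 0.02)       # exact per-event likelihood ratio for this toy

      # Wasserstein distance between the reweighted reference and the target sample.
      w_dist = wasserstein_distance(nominal, target, u_weights=weights)

      # Binned chi2 between the two histograms (reweighted yields normalised to the target).
      bins = np.linspace(-3.0, 3.0, 31)
      h_rw, _ = np.histogram(nominal, bins=bins, weights=weights)
      h_tg, _ = np.histogram(target, bins=bins)
      chi2, pval = chisquare(h_rw * h_tg.sum() / h_rw.sum(), f_exp=h_tg)
      print(f"Wasserstein = {w_dist:.4f}, chi2 = {chi2:.1f}, p = {pval:.3f}")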
    • 25
      SBI for Effective Field Theories
      Speaker: Nick Smith (Fermi National Accelerator Lab. (US))
    • 26
      Closeout
      Speakers: Jay Ajitbhai Sandesara (University of Wisconsin Madison (US)), Nick Smith (Fermi National Accelerator Lab. (US))