France-Berkeley PHYSTAT Conference on Unfolding

Anja Butter (Centre National de la Recherche Scientifique (FR)), Ben Nachman (Lawrence Berkeley National Lab. (US)), Lydia Brenner (Nikhef National institute for subatomic physics (NL))

A central task in differential cross section measurements in particle-, nuclear-, and astrophysics is unfolding: the removal of detector distortions, also called deblurring or deconvolution.  Unfolding is a challenging inverse, simulation-based inference task.  

The goal of this conference is to bring together method developers and practitioners to discuss the state of the art in unfolding.  One key aspect of the conference will be machine learning-based unfolding methods, which have enabled new possibilities (e.g. unbinned and high-dimensional measurements).

The conference will be held at the LPNHE in Paris from June 10 to 13, 2024. Please note that the campus must be entered through the entrance at the exit of the Jussieu metro station.

Amphithéâtre Georges Charpak · 75005 Paris, France

A Zoom connection will also be available for remote participation.

Organizing Committee:

Olaf Behnke
Lydia Brenner
Anja Butter
Louis Lyons
Bogdan Malaescu
Ben Nachman

Acknowledgements: We are grateful to the France-Berkeley Fund for sponsorship and to PHYSTAT for logistical support.


Participation on-line (no fee)
Registration for in-person participation (€100 fee)
  • Alessandro Tarabini
  • Alex Sopio
  • Andrea Bulla
  • Andrea Giammanco
  • Andres Daniel Perez
  • Andrew Fowlie
  • Anja Butter
  • Axel Niclot
  • Baptiste Ravina
  • Ben Nachman
  • Bertrand Laforge
  • Biao Wang
  • Bogdan Malaescu
  • Caio Cesar Daumann
  • Carlos Mana
  • Carsten Burgard
  • Daniel Britzger
  • David Kavtaradze
  • David Walter
  • Dimitri Bourilkov
  • Elzbieta Richter-Was
  • Fernando Torales Acosta
  • Gianluca Bianco
  • Gulshan Kumar
  • Humberto Reyes-Gonzalez
  • Igor Volobouev
  • Jad Mathieu Sardain
  • Javier Mariño Villadamigo
  • Jona Ackerschott
  • Judita Mamuzic
  • Kevin Thomas Greif
  • Krish Desai
  • Kyle Cormier
  • Laboni Das
  • Laura Brittany Havener
  • Laurent Lellouch
  • Louis Lyons
  • Lucas Kang
  • Lydia Brenner
  • László Balázs
  • Maja Mackowiak-Pawlowska
  • Marcelo Gameiro Munhoz
  • Maren Stratmann
  • Matteo Defranchis
  • Mayda Velasco
  • Michael Dolce
  • Michael Schmelling
  • Mikael Kuusela
  • Molly Park
  • Nan Lu
  • Nathan Hütsch
  • Nicodemos Andreou
  • Nikolay Gagunashvili
  • Olaf Behnke
  • Oleksandr Zenaiev
  • Philippe Gras
  • Rafal Maselek
  • Rahul Balasubramanian
  • Ricardo Barrué
  • Sarah Williams
  • Shilpi Jain
  • Sima Bashiri Kahjoq
  • Simone Gasperini
  • Sneh Shuchi
  • Soumyadip Barman
  • Stefan Katsarov
  • Syed Anwar Ul Hasan
  • Tilman Plehn
  • Tim Adye
  • Tom Cavaliere
  • Toni Mlinarevic
  • Yingjie Wei
  • +30
    • 7
      Binned ML methods overview (20'+20')
      Speaker: Jingjing Pan (Yale University (US))
    • 8
      Performance / benchmarking with regularisation choice (20'+20')
      Speaker: Lydia Brenner (Nikhef National institute for subatomic physics (NL))
    • 10:20 AM
    • 9
      Simplified Template Cross Sections (STXS) (20+20)
      Speaker: Rahul Balasubramanian (Centre National de la Recherche Scientifique (FR))
    • 10
      Likelihood-based unfolding with the CMS Higgs combination tool (20+20)

      The CMS Higgs combination tool is the software package used for statistical analyses by the CMS Collaboration. The package, originally designed to perform searches for a Higgs boson and the combined analysis of those searches, has evolved to become the statistical analysis tool presently used in the majority of measurements and searches performed by the CMS Collaboration. Since Combine has access to the full likelihood function, it can also be used to perform a likelihood-based unfolding. This approach has become the standard unfolding procedure for many analyses within the collaboration.

      Speaker: Alessandro Tarabini (ETH Zurich (CH))
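As a reading aid for the abstract above, a generic binned likelihood of the kind commonly used for likelihood-based unfolding can be written as follows. This is a sketch under stated assumptions, not the formulation used by Combine or presented in the talk: the symbols $n_j$ (observed count in reconstructed bin $j$), $\mu_i$ (truth-level yield in bin $i$), $R_{ji}$ (response matrix), $b_j$ (background) and $\theta$ (nuisance parameters with constraint terms $p_k$) are all notation introduced here.

```latex
\begin{align}
  \nu_j(\mu, \theta) &= \sum_i R_{ji}(\theta)\, \mu_i + b_j(\theta), \\
  L(\mu, \theta) &= \prod_j \frac{\nu_j(\mu,\theta)^{n_j}}{n_j!}\, e^{-\nu_j(\mu,\theta)}
                    \;\times\; \prod_k p_k(\theta_k).
\end{align}
```

In this picture the unfolded spectrum is the set of $\mu_i$ that maximizes $L$, with uncertainties obtained by profiling over $\theta$.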
    • 12:10 PM
    • 11
      Unfolding is not unsmearing (20+20)

      In particle physics, unfolding methods are employed when the basis used to represent an estimate of the truth is not the basis with statistically independent expansion coefficients. For the discrete unfolding problem the latter is given by the eigenvectors of the Fisher information matrix, which measures the amount of information carried by the data about the truth. In typical cases it is ill-conditioned, with the consequence that the measurements constrain only a small number of the expansion coefficients. This allows for highly efficient data reduction, but only for a biased estimate of the truth. Unfolding methods differ in how they bias the result. A way to quantify this is the posterior response matrix.

      Speaker: Michael Schmelling (Max Planck Society (DE))
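To make the role of the Fisher information matrix in the abstract above concrete, the following sketch writes it out for the simplest discrete case. It assumes independent Poisson counts $n_j$ with expectations $\nu_j = \sum_i R_{ji}\mu_i$ and no background; the notation is introduced here and is not taken from the talk.

```latex
\begin{align}
  I_{ik} &= \sum_j \frac{R_{ji}\, R_{jk}}{\nu_j},
  \qquad \nu_j = \sum_i R_{ji}\, \mu_i, \\
  I &= \sum_a \lambda_a\, v_a v_a^{T},
  \qquad c_a = v_a^{T} \hat{\mu},
  \qquad \operatorname{Var}(c_a) \approx \frac{1}{\lambda_a}.
\end{align}
```

Under these assumptions the coefficients $c_a$ of the unregularized maximum-likelihood estimate $\hat{\mu}$ are asymptotically uncorrelated. An ill-conditioned $I$ has many small eigenvalues $\lambda_a$, so the corresponding coefficients carry very large variances and are effectively unconstrained by the data.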
    • 12
      Unfolding in direct BSM search-sensitive regions of phase space (20'+20')
      Speaker: Sarah Louise Williams (University of Cambridge (GB))
    • 2:50 PM
    • 13
      Using unfolded data in global QCD analyses (20+20)
      Speaker: Oleksandr Zenaiev (Hamburg University)
    • 14
      Unfolding in the context of a heavy ion analysis (20+20)

      In this study, the process of unfolding is studied in the context of a heavy ion photon-tagged jet analysis. The SVD and D'Agostini unfolding algorithms are compared, and an application of using the MSE to choose the regularization strength is shown. Additionally, the investigation looks into the bias associated with unfolding in relation to prior choice. The performance is evaluated with different theoretical models and the bottom line test.

      Speaker: Molly Park (Massachusetts Inst. of Technology (US))
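For readers unfamiliar with the D'Agostini algorithm mentioned in the abstract above, the iterative update can be sketched in a few lines of plain Python. This is a minimal illustration, not the analysis code from the talk: the function name, the two-bin response matrix and the flat prior are assumptions chosen for the example, and every truth bin is assumed to have non-zero efficiency.

```python
def dagostini_unfold(response, data, prior, n_iter=4):
    """Iteratively unfold binned data (illustrative sketch).

    response[j][i] is the probability that an event generated in truth bin i
    is reconstructed in reco bin j, data[j] are observed counts, and prior[i]
    is the starting guess for the truth spectrum.
    """
    n_reco, n_true = len(response), len(response[0])
    # Efficiency of each truth bin: probability to be reconstructed anywhere.
    eff = [sum(response[j][i] for j in range(n_reco)) for i in range(n_true)]
    estimate = list(prior)
    for _ in range(n_iter):
        # Fold the current estimate through the detector response.
        folded = [sum(response[j][i] * estimate[i] for i in range(n_true))
                  for j in range(n_reco)]
        updated = []
        for i in range(n_true):
            # Reweight by the data-to-folded ratio of every reco bin fed by bin i.
            weight = sum(response[j][i] * data[j] / folded[j]
                         for j in range(n_reco) if folded[j] > 0)
            updated.append(estimate[i] * weight / eff[i])
        estimate = updated
    return estimate


# Hypothetical two-bin example with noise-free data.
response = [[0.8, 0.3],
            [0.2, 0.7]]
truth = [100.0, 50.0]
data = [sum(response[j][i] * truth[i] for i in range(2)) for j in range(2)]
flat_prior = [75.0, 75.0]

few = dagostini_unfold(response, data, flat_prior, n_iter=2)
many = dagostini_unfold(response, data, flat_prior, n_iter=200)
```

The number of iterations acts as the regularization strength: with few iterations the result stays closer to the prior, while many iterations approach the unregularized maximum-likelihood solution.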
    • 15
      Response Matrix Estimation in Unfolding Differential Cross Sections (20+20)

      In the unfolding problem, the response matrix is the forward operator which models the detector response. In practice, the response matrix is not known analytically. Instead, it needs to be estimated using Monte Carlo simulation, which introduces statistical uncertainty into the unfolding procedure. This raises the question of how to estimate the response matrix in a sensible way. In most analyses at the LHC, this is done by binning the events and counting the number of events that migrate from each truth bin to each reconstructed bin. However, this approach can suffer from undersmoothing, especially with a small sample size. To address this issue, we propose a two-step approach to response matrix estimation. First, we estimate the response kernel on the unbinned space. Second, we propagate the estimated response kernel into an integral equation to obtain an estimate for the response matrix.

      Speaker: Richard Zhu (Carnegie Mellon University)
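The conventional bin-and-count estimate that the abstract above takes as its starting point can be sketched as follows. This illustrates only that baseline, not the two-step kernel approach proposed in the talk. The function names, the bin edges and the convention of marking an unreconstructed event with `None` are assumptions made for the example.

```python
import bisect


def bin_index(x, edges):
    """Return the bin index of x for ascending edges, or None if out of range."""
    if x < edges[0] or x >= edges[-1]:
        return None
    return bisect.bisect_right(edges, x) - 1


def estimate_response(pairs, true_edges, reco_edges):
    """Estimate response[j][i] = P(reco bin j | truth bin i) by counting.

    pairs is an iterable of (truth_value, reco_value) from simulation;
    reco_value is None when the event was not reconstructed.
    """
    n_true = len(true_edges) - 1
    n_reco = len(reco_edges) - 1
    counts = [[0] * n_true for _ in range(n_reco)]
    generated = [0] * n_true
    for truth_value, reco_value in pairs:
        i = bin_index(truth_value, true_edges)
        if i is None:
            continue  # generated outside the measured truth range
        generated[i] += 1
        j = None if reco_value is None else bin_index(reco_value, reco_edges)
        if j is not None:
            counts[j][i] += 1
    # Normalize each truth column by the number of generated events,
    # so column sums equal the reconstruction efficiency.
    return [[counts[j][i] / generated[i] if generated[i] else 0.0
             for i in range(n_true)] for j in range(n_reco)]


# Hypothetical simulated events: (truth, reco), with one lost event.
simulated = [(0.5, 0.4), (0.5, 1.2), (1.5, 1.6), (1.5, None)]
edges = [0.0, 1.0, 2.0]
matrix = estimate_response(simulated, edges, edges)
```

With only a handful of simulated events per bin, each matrix element is a ratio of small counts, which is where the statistical fluctuations discussed in the abstract enter.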
    • 16
      Dealing with Uncertainties (20+20)
      Speaker: Kyle Cormier (University of Zurich (CH))
    • 17
      Unfolding in the context of g-2 (20+20)
      Speaker: Laurent Lellouch (CNRS and Aix-Marseille U.)
    • 10:20 AM
    • 18
      Profile likelihood unfolding with large number of bins (20+20)

      In this talk, we will discuss previous measurements using binned maximum likelihood unfolding, focusing on analyses with large numbers of bins and nuisance parameters. We will outline the technical implementation and highlight the challenges and limitations. Furthermore, we showcase the newly developed approach of "linearized binned likelihood unfolding", a modified formalism that scales better and allows unfolding with even larger numbers of bins and nuisance parameters.

      Speaker: David Walter (CERN)
    • 19
      Moment Unfolding (20+20)
      Speaker: Krish Desai
    • 12:10 PM
    • Tutorials
      Conveners: Dr Carsten Burgard (Technische Universitaet Dortmund (DE)), Fernando Torales Acosta, Javier Mariño Villadamigo, Lydia Brenner (Nikhef National institute for subatomic physics (NL)), Nathan Hütsch, Vincent Alexander Croft (Nikhef National institute for subatomic physics (NL))
    • 6:50 PM
      Conference dinner - Ciel de Paris

      Restaurant Le Ciel de Paris
      Tour Maine Montparnasse
      56th floor
      33, avenue du Maine
      75015 Paris

      Access to the restaurant is via the "Le Ciel de Paris" elevator

    • 20
      QUnfold: Quantum Annealing for Distributions Unfolding in High-Energy Physics (20+20)

      In High-Energy Physics (HEP) experiments, each measurement apparatus exhibits a unique signature in terms of detection efficiency, resolution, and geometric acceptance. The overall effect is that the distribution of each observable measured in a given physical process could be smeared and biased. Unfolding is the statistical technique employed to correct for this distortion and restore the original distribution. This process is essential to make effective comparisons between the outcomes obtained from different experiments and the theoretical predictions.
      The emerging technology of Quantum Computing represents an enticing opportunity to enhance the unfolding performance and potentially yield more accurate results.
      This work introduces QUnfold, a simple Python module designed to address the unfolding challenge by harnessing the capabilities of quantum annealing. In particular, the regularized log-likelihood minimization formulation of the unfolding problem is translated to a Quadratic Unconstrained Binary Optimization (QUBO) problem, solvable by using quantum annealing systems. The algorithm is validated on a simulated sample of particle collision data generated by combining the MadGraph Monte Carlo event generator and the Delphes simulation software to model the detector response. A variety of fundamental kinematic distributions are unfolded and the results are compared with conventional unfolding algorithms commonly adopted in precision measurements at the Large Hadron Collider (LHC) at CERN.
      The implementation of the quantum unfolding model relies on the D-Wave Ocean software and the algorithm is run by heuristic classical solvers as well as the physical D-Wave Advantage quantum annealer boasting 5000+ qubits.

      Speakers: Dr Gianluca Bianco (Universita e INFN, Bologna (IT)), Simone Gasperini (Universita e INFN, Bologna (IT))
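The translation of an unfolding problem into a QUBO described in the abstract above can be illustrated with a toy example. The sketch below is not QUnfold's implementation and does not use the D-Wave Ocean API. It assumes a plain least-squares objective with the regularization term omitted and a fixed-width binary encoding of each truth bin, and it solves the resulting QUBO by brute-force enumeration instead of annealing. All function names are hypothetical.

```python
from itertools import product


def build_qubo(response, data, n_bits):
    """Build Q so that bits^T Q bits equals |R x - d|^2 minus the constant d.d.

    Each truth bin is encoded as x_i = sum_k 2**k * b_(i, k).
    """
    n_reco, n_true = len(response), len(response[0])
    n_vars = n_true * n_bits
    # Binary variable v encodes bit (v % n_bits) of truth bin (v // n_bits).
    design = [[response[j][v // n_bits] * 2 ** (v % n_bits)
               for v in range(n_vars)] for j in range(n_reco)]
    qubo = [[sum(design[j][u] * design[j][v] for j in range(n_reco))
             for v in range(n_vars)] for u in range(n_vars)]
    for u in range(n_vars):
        # b * b == b for binary b, so the linear term sits on the diagonal.
        qubo[u][u] -= 2 * sum(design[j][u] * data[j] for j in range(n_reco))
    return qubo


def energy(qubo, bits):
    """Evaluate the QUBO objective for one bit assignment."""
    n = len(bits)
    return sum(qubo[u][v] * bits[u] * bits[v]
               for u in range(n) for v in range(n))


def solve_by_enumeration(qubo):
    """Classical stand-in for the annealer: try every bit string."""
    return min(product((0, 1), repeat=len(qubo)),
               key=lambda bits: energy(qubo, bits))


def decode(bits, n_bits):
    """Convert the flat bit string back into integer truth-bin contents."""
    return [sum(bits[start + k] * 2 ** k for k in range(n_bits))
            for start in range(0, len(bits), n_bits)]


# Hypothetical two-bin problem: noise-free folding of the truth [3, 1].
response = [[0.75, 0.25],
            [0.25, 0.75]]
data = [2.5, 1.5]
qubo = build_qubo(response, data, n_bits=2)
unfolded = decode(solve_by_enumeration(qubo), n_bits=2)  # [3, 1]
```

Enumeration scales as 2^n in the number of binary variables, so it is only usable for toy problems like this one. That scaling is the motivation for handing the same matrix to an annealer or a heuristic solver.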
    • 21
      Unfolding using Denoising Diffusion (20+20)

      Unfolding detector distortions in experimental data is critical for enabling precision measurements in high-energy physics (HEP). However, traditional unfolding methods face challenges in scalability, flexibility, and dependence on simulations. We introduce a novel unfolding approach using conditional denoising diffusion probabilistic models (cDDPM). By modeling the conditional probability density between detector-level observations and truth-level particle properties from various physics processes, the cDDPM unfolding performance generalizes across varied simulated processes and kinematic distributions without retraining. We demonstrate proof-of-concept on toy models and evaluate on simulated Large Hadron Collider jets across different physics processes.

      Speaker: Camila Pazos (Tufts University (US))
    • 10:50 AM
    • 22
      Full Event Particle-Level Unfolding with Variable Length Latent Variational Diffusion (20+20)

      Collisions at the Large Hadron Collider (LHC) provide information about the values of parameters in theories of fundamental physics. Extracting measurements of these parameters requires accounting for effects introduced by the particle detector used to observe the collisions. The typical approach is to use a high-fidelity simulation of the detector to generate synthetic datasets that can then be compared directly with experimental data. However, these simulations are often proprietary and computationally expensive. An alternative approach, unfolding, statistically adjusts the experimental data for detector effects. Traditional unfolding algorithms require binning data in a small set of pre-selected dimensions. Recent methods using generative machine learning models have shown promise for performing un-binned unfolding in high dimensions, allowing later computation of many observables. However, all current generative approaches are limited to unfolding a fixed set of observables, making them unable to perform full-event unfolding in the variable dimensional environment of collider data. A novel modification to the variational latent diffusion model (VLD) approach to generative unfolding is presented, which allows for unfolding of high- and variable-dimensional feature spaces. The performance of this method is evaluated in the context of semi-leptonic $t\bar{t}$ production at the LHC. Additionally, the dependence of the unfolding on the training data prior is assessed by evaluating the model on datasets with alternative priors.

      Speakers: Alexander Shmakov (University of California Irvine (US)), Kevin Thomas Greif (University of California Irvine (US))
    • 12:00 PM
    • 23
      A 24-dimensional Cross-section Measurement with ATLAS (15+15)
      Speaker: Mariel Pettee (Lawrence Berkeley National Lab. (US))
    • 24
      Multidimensional cross-section measurements with H1 (15+15)
      Speaker: Fernando Torales Acosta (Lawrence Berkeley National Lab. (US))
    • 25
      Open discussion on challenges and possible solutions in unfolding
      Speaker: Bogdan Malaescu (LPNHE-Paris CNRS/IN2P3 (FR))
    • 26
      Closing of the meeting and summary
      Speaker: Ben Nachman (Lawrence Berkeley National Lab. (US))