Hammers & Nails 2023 - Swiss Edition

Congressi Stefano Franscini (CSF)

Monte Verità, Ascona, Switzerland

Frontiers in Machine Learning in Cosmology, Astro & Particle Physics

October 29 – November 3, 2023  |  Conference center Congressi Stefano Franscini (CSF) in Monte Verità, Ascona, Switzerland

The 2023 Swiss Edition of Hammers & Nails follows the success of the 2017, 2019 and 2022 Hammers & Nails workshops at the Weizmann Institute of Science, Israel.

Cosmology, astro, and particle physics constantly push the boundary of human knowledge further into the Unknown, fueled by open mysteries such as dark matter, dark energy and quantum gravity. This means simulating the Universe, simulating trillions of LHC particle collisions, searching for feeble anomalous signals in a deluge of data, and inferring the underlying theory of nature from data that has been convolved with complex detector responses. The machine learning hammer has already proven itself useful in deciphering our conversation with the Unknown.

What is holding us back, and where is cutting-edge machine learning expected to triumph over conventional methods? This workshop will be an essential moment to explore open questions, foster new collaborations and shape the direction of ML design and application in these domains.
An overarching theme is unsupervised and generative models, which have excelled recently given the success of transformers, diffusion and foundation models. Other success stories include simulation-based inference, optimal transport, active learning and anomaly detection. The community has also taken inspiration from the rise of machine learning in many other domains, such as molecular dynamics.

The trademark of Hammers & Nails is an informal atmosphere with open-ended lectures spanning academia and industry, and a stage for early-career scientists, with time for free discussion and collaboration.

Participation is by invitation. Limited admission is available through submission of an abstract and a brainstorming idea, with a focus on early-career scientists.

Confirmed invited speakers and panelists:

  • Thea Aarrestad (ETH Zürich)
  • Piotr Bojanowski (Meta AI)
  • Adji Bousso Dieng (Princeton)
  • Michael Bronstein (University of Oxford | Twitter)
  • Anja Butter (University of Heidelberg | LPNHE)
  • Taco Cohen (Qualcomm AI Research)
  • Kyle Cranmer (University of Wisconsin-Madison)
  • Michael Elad (Technion)
  • Eilam Gross (Weizmann Institute of Science)
  • Atılım Güneş Baydin (University of Oxford)
  • Lukas Heinrich (Technical University of Munich)
  • Shirley Ho  (Center for Computational Astrophysics at Flatiron Institute)
  • Michael Kagan (SLAC)
  • François Lanusse (CNRS)
  • Ann Lee (Carnegie Mellon University)
  • Laurence Levasseur (University of Montréal | Mila)
  • Qianxiao Li (National University of Singapore)
  • Jakob Macke (Tübingen University)
  • Alexander G. D. G. Matthews (Google DeepMind)
  • Jennifer Ngadiuba (Fermilab)
  • Kostya Novoselov (National University of Singapore | Nobel laureate)
  • Barnabas Poczos (Carnegie Mellon University)
  • Johnny Raine (University of Geneva)
  • Jesse Thaler (MIT)
  • Andrey Ustyuzhanin (Higher School of Economics)


Scientific Organizing Committee:

  • Tobias Golling (University of Geneva)
  • Danilo Rezende (Google DeepMind)
  • Robert Feldmann (University of Zurich)
  • Slava Voloshynovskiy (University of Geneva)
  • Eilam Gross (Weizmann Institute of Science)
  • Kyle Cranmer (University of Wisconsin-Madison)
  • Ann Lee (Carnegie Mellon University)
  • Maurizio Pierini (CERN)
  • Shirley Ho (Center for Computational Astrophysics at Flatiron Institute)
  • Tilman Plehn (University of Heidelberg)
  • Elena Gavagnin (Zurich University of Applied Sciences)
  • Peter Battaglia (Google DeepMind)

More information to follow soon.


  • Alex Matthews
  • Andrey Ustyuzhanin
  • Anja Butter
  • Barnabas Poczos
  • Eilam Gross
  • Elena Gavagnin
  • François Lanusse
  • Jennifer Ngadiuba
  • Jesse Thaler
  • John Raine
  • Laurence Levasseur
  • Li Qianxiao
  • Louis Lyons
  • Michael Elad
  • Michael Kagan
  • Robert Feldmann
  • Shirley Ho
  • Svyatoslav (Slava) Voloshynovskyy
  • Taco Cohen
  • Thea Aarrestad
  • Tobias Golling
    • 5:30 PM 6:30 PM
      Registration & Reception 1h
    • 7:00 PM 8:00 PM
      Dinner 1h
    • 8:00 AM 9:00 AM
      Breakfast 1h
    • 9:00 AM 9:30 AM
      Introduction & welcome 30m
    • 9:30 AM 12:15 PM
      Invited speakers
      • 9:30 AM
        Highlights of machine learning in particle physics for computer scientists 1h
        Speaker: Johnny Raine (Universite de Geneve (CH))
      • 10:30 AM
        Coffee break 30m
      • 11:00 AM
        Diversity-Informed Machine Learning And Its Applications in Molecular Modeling 1h 15m
        Speaker: Adji Bousso Dieng
    • 12:15 PM 2:00 PM
      Lunch break 1h 45m
    • 2:00 PM 3:15 PM
      Invited speakers
      • 2:00 PM
        Geometric Algebra Transformers: A Universal Architecture of Geometric Data 1h 15m
        Speaker: Taco Cohen
    • 3:15 PM 4:00 PM
      Coffee break 45m
    • 4:00 PM 6:15 PM
      Young Scientist Forum
      • 4:00 PM
        How will AI enable autonomous particle accelerators? 15m

        The need for greater flexibility, faster turnaround times, reduced energy consumption and reduced operational cost at maximum physics output, together with the sheer size of potential future accelerators such as the FCC, calls for new particle accelerator operational models with automation at the center. AI/ML is already playing a significant role in the accelerator domain, with numerous applications in design, diagnostics and control. This contribution will define the building blocks for autonomous accelerators, as discussed in the CERN accelerator sector initiative called the Efficiency Think Tank, and outline where AI would need to be applied to reach near-full automation. Equipment design considerations, control system requirements and the necessary software frameworks will be summarized. Finally, the remaining questions and challenges will be outlined.

        Speaker: Verena Kain (CERN)
      • 4:15 PM
        End-To-End Latent Variational Diffusion Models for Unfolding LHC Events. 8m

        High-energy collisions at the Large Hadron Collider (LHC) provide valuable insights into open questions in particle physics. However, detector effects must be corrected before measurements can be compared to certain theoretical predictions or measurements from other detectors. Methods to solve this inverse problem of mapping detector observations to theoretical quantities of the underlying collision, referred to as unfolding, are essential parts of many physics analyses at the LHC. We investigate and compare various generative deep learning methods for unfolding at parton level. We introduce a novel unified architecture, termed latent variational diffusion models, which combines the latent learning of cutting-edge generative art approaches with an end-to-end variational framework. We demonstrate the effectiveness of this approach for reconstructing global distributions of theoretical kinematic quantities, as well as for ensuring the adherence of the learned posterior distributions to known physics constraints. Our unified approach improves the reconstruction of parton-level kinematics as measured by several distribution-free metrics.

        Speaker: Alexander Shmakov (University of California Irvine (US))
      • 4:23 PM
        PC-Droid: Jet generation with diffusion 8m

        Building on the success of PC-JeDi we introduce PC-Droid, a substantially improved diffusion model for the generation of jet particle clouds. By leveraging a new diffusion formulation, studying more recent integration solvers, and training on all jet types simultaneously, we are able to achieve state-of-the-art performance for all types of jets across all evaluation metrics. We study the trade-off between generation speed and quality by comparing two attention based architectures, as well as the potential of consistency distillation to reduce the number of diffusion steps. Both the faster architecture and consistency models demonstrate performance surpassing many competing models, with generation time up to two orders of magnitude faster than PC-JeDi and three orders of magnitude faster than Delphes.

        Speaker: Matthew Leigh (University of Geneva)
      • 4:31 PM
        Masked particle modelling 8m

        The BERT pretraining paradigm has proven to be highly effective in many domains including natural language processing, image processing and biology. To apply the BERT paradigm the data needs to be described as a set of tokens, and each token needs to be labelled. To date the BERT paradigm has not been explored in the context of HEP. The samples that form the data used in HEP can be described as a set of particles (tokens) where each particle is represented as a continuous vector. We explore different approaches for discretising/labelling particles such that the BERT pretraining can be performed and demonstrate the utility of the resulting pretrained models on common downstream HEP tasks.

        Speaker: Samuel Byrne Klein (Universite de Geneve (CH))
      • 4:39 PM
        Drapes: Diffusion for weak supervision 8m

        We employ the diffusion framework to generate background-enriched templates for use in a downstream anomaly detection task (generally with CWoLa). We show how Drapes encompasses all modes of template generation common in the literature, and demonstrate state-of-the-art performance on the public LHCO R&D dataset.

        Speaker: Debajyoti Sengupta (Universite de Geneve (CH))
      • 4:47 PM
        Self-supervised learning of jets using a realistic detector simulation 8m

        Self-supervised learning (SSL) is a technique to obtain descriptive representations of data in a pretext task based on unlabeled input. Despite being well established in fields such as natural language processing and computer vision, SSL applications in high energy physics (HEP) have only just begun to be explored. Further research into SSL in the context of HEP is especially motivated given the potential to leverage enormous datasets collected by LHC experiments for training without labels. We demonstrate an SSL model of jet representations and its ability to express both global information and jet substructure. Furthermore, we investigate how SSL representations derived from low-level detector features can be used to search for exotic or anomalous jets in a largely unsupervised way. Going beyond the few existing studies in this direction, we conduct our studies using a realistic, state-of-the-art calorimeter simulation, such that our results are representative of possible future applications at collider experiments.

        Speakers: Dmitrii Kobylianskii (Weizmann Institute of Science (IL)), Etienne Dreyer (Weizmann Institute of Science (IL)), Nathalie Soybelman (Weizmann Institute of Science (IL)), Nilotpal Kakati (Weizmann Institute of Science (IL)), Patrick Rieck (New York University (US))
      • 4:55 PM
        De-noising Graph Super-Resolution with Diffusion Models and Transformers 8m

        Accurate reconstruction of particles from detector data forms the core problem in experimental particle physics. The spatial resolution of the detector, in particular the calorimeter granularity, is both influential in determining the quality of the reconstruction, and largely sets the upper limit for the algorithm's theoretical capabilities. To address these limitations, super-resolution techniques can offer a promising approach by enhancing low-resolution detector data to achieve higher resolution.

        In addition to image generation, diffusion models have demonstrated effectiveness in super-resolution tasks. Given its sparsity and non-homogeneity, calorimeter data can be most faithfully represented using graphs. Therefore, this study introduces a novel approach to graph super-resolution using diffusion and a transformer-based de-noising network. This work represents the first instance of applying graph super-resolution with diffusion. The low-resolution image, which corresponds to recorded detector data, is also subject to noise from various sources. As an added benefit, the proposed model aims to remove these noise artifacts, further contributing to improved particle reconstruction.

        Speaker: Nilotpal Kakati (Weizmann Institute of Science (IL))
      • 5:03 PM
        Field-Level Inference with Microcanonical Langevin Monte Carlo 8m

        Extracting optimal information from upcoming cosmological surveys is a pressing task, for which a promising path to success is performing field-level inference with differentiable forward modeling. A key computational challenge in this approach is that it requires sampling a high-dimensional parameter space. In this talk I will present a new promising method to sample such large parameter spaces, which improves upon the traditional Hamiltonian Monte Carlo, to both reconstruct the initial conditions of the Universe and obtain cosmological constraints.
        (Based on https://arxiv.org/abs/2307.09504 and further new results.)

        Speaker: Adrian Bayer (Princeton University / Simons Foundation)
      • 5:11 PM
        Accelerating graph-based tracking with symbolic regression 8m

        In high-energy physics experiments, tracking, the reconstruction of particle trajectories from hits in the inner detector, is a computationally intensive task due to the large combinatorics of detector signals. Recent efforts have proven that ML techniques can be successfully applied to the tracking problem, extending and improving the conventional methods based on feature engineering. However, the inference of complex networks can be too slow to be used in the trigger system. Quantising the network and deploying it on an FPGA is feasible but highly non-trivial. An efficient alternative can employ symbolic regression (SR), which has already proved its performance in replacing a dense neural network for jet classification. We propose a novel approach that uses SR to replace a graph-based neural network. Using a simplified toy example, we substitute each network block with a symbolic function, preserving the graph structure of the data and enabling message passing. This approach significantly speeds up inference on a CPU without sacrificing much accuracy.

        Speaker: Nathalie Soybelman (Weizmann Institute of Science (IL))
      • 5:19 PM
        Novel Approaches for Fast Simulation in HEP using Diffusion and Graph-to-Graph Translation 8m

        The simulation of particle physics data is a fundamental but computationally intensive ingredient for physics analysis at the Large Hadron Collider. In traditional fast simulation schemes, a surrogate calorimeter model is the basis for a set of reconstructed particles. We demonstrate the feasibility of generating the reconstructed objects in one step, replacing both the calorimeter simulation and reconstruction step. Our previous model that employed slot attention achieved promising results on a simplified synthetic dataset. In this work, we propose two novel approaches to improve this task and evaluate them on a more realistic dataset. In the first approach, we augment the slot-attention mechanism with a state-of-the-art diffusion model, where we start with a noisy graph and perform gradual noise reduction by solving Stochastic Differential Equations with gradient approximation and obtain the reconstructed particles. The second approach incorporates iterative graph refinement, where we directly transform the set of truth particles into the set of reconstructed particles. These approaches are able to go beyond our previous baseline performance in terms of both accuracy and resolution of the predicted particle properties.

        Speaker: Dmitrii Kobylianskii (Weizmann Institute of Science (IL))
      • 5:27 PM
        The MadNIS Reloaded – Boosting MG5aMC with Neural Networks 8m

        Theory predictions for the LHC require precise numerical phase-space integration and generation of unweighted events. We combine machine-learned multi-channel weights with a normalizing flow for importance sampling, to improve classical methods for numerical integration. By integrating buffered training for potentially expensive integrands, VEGAS initialization, symmetry-aware channels, and stratified training, we elevate the performance in both efficiency and accuracy. We empirically validate these enhancements through rigorous tests on diverse LHC processes, including VBS and W+jets.

        Speaker: Theo Heimel (Heidelberg University)
      • 5:35 PM
        Galaxies and Graphs 8m

        Graph Neural Networks are the premier method for learning the physics of a given system, since abstracting physical systems as graphs fits naturally with common descriptions of those systems. I will show how the fundamental processes that shape galaxies and dark matter halos can be learned efficiently by embedding galaxies and halos on either temporal or spatial graphs. Learning the temporal co-evolution of galaxies and their dark matter halos allows us to connect one of the most successful modern astrophysical theories, Lambda-CDM, with the poorly understood processes that shape galaxies, opening new pathways both for understanding and for speeding up simulations by ~6 orders of magnitude. Learning the spatial correlations between galaxies and halos also offers important insights into galaxy evolution, and lends itself more easily to comparisons with observations.

        Since GNNs work well with Symbolic Regression, I will also show how low-dimensional, analytic laws of galaxy formation can be derived from these models.

        Speaker: Christian Kragh Jespersen (Princeton University)
      • 5:43 PM
        Simulation-based Self-supervised Learning (S3L) 8m

        Self-Supervised Learning (SSL) is at the core of training modern large ML models, providing a scheme for learning powerful representations in base models that can be used in a variety of downstream tasks. However, SSL training strategies must be adapted to the type of training data, thus driving the question: what are powerful SSL strategies for collider physics data? In the talk, we present a novel simulation-based SSL (S3L) strategy wherein we develop a method of “re-simulation” to drive data augmentation for contrastive learning. We show how an S3L-trained base model can learn powerful representations that can be used for downstream discrimination tasks and can help mitigate uncertainties.

        Speaker: Benedikt Maier (KIT - Karlsruhe Institute of Technology (DE))
      • 5:51 PM
        ATLAS fast calorimeter simulation; the necessity, success and future possibilities. 8m

        This presentation offers an overview of fast simulation in the ATLAS calorimeter. To begin, the demand for detector simulation and the challenge this creates are presented to motivate the need for faster detector simulation. The machine learning tools that are currently used to achieve better computational performance are then described. These tools have been very successful, both in terms of computational performance and in their accurate mimicry of the full ATLAS simulation. This success is demonstrated using the validation of the fast simulation in Run 2 and the preliminary validation for Run 3. Finally, there is a discussion of what the future might hold for fast simulation. Here we consider how the models used might be upgraded or changed, and what new challenges we might anticipate at higher luminosities and with increasingly complex reconstruction tools.

        Speaker: Henry Day-Hall (Czech Technical University in Prague (CZ))
      • 5:59 PM
        Decorrelation using Optimal Transport 8m

        We present a novel decorrelation method using Convex Neural Optimal Transport Solvers (Cnots) that decorrelates a continuous feature space against protected attributes with optimal transport. We demonstrate how well it performs in the context of jet classification in high energy physics, where classifier scores are desired to be decorrelated from the mass of a jet.

        Speaker: Malte Algren (Universite de Geneve (CH))
      • 6:07 PM
        The Interplay of Machine Learning–based Resonant Anomaly Detection Methods 8m

        Machine learning–based anomaly detection (AD) methods are promising tools for extending the coverage of searches for physics beyond the Standard Model (BSM). One class of AD methods that has received significant attention is resonant anomaly detection, where the BSM is assumed to be localized in at least one known variable. While there have been many methods proposed to identify such a BSM signal that make use of simulated or detected data in different ways, there has not yet been a study of the methods' complementarity. To this end, we address two questions. First, in the absence of any signal, do different methods pick the same events as signal-like? If not, then we can significantly reduce the false-positive rate by comparing different methods on the same dataset. Second, if there is a signal, are different methods fully correlated? Even if their maximum performance is the same, since we do not know how much signal is present, it may be beneficial to combine approaches. Using the Large Hadron Collider (LHC) Olympics dataset, we provide quantitative answers to these questions. We find that there are significant gains possible by combining multiple methods, which will strengthen the search program at the LHC and beyond.

        Speaker: Radha Mastandrea (University of California, Berkeley)
    • 6:30 PM 8:00 PM
      Young Scientist Forum: Poster session (incl. pizza, beer & insert cards)
    • 8:00 PM 9:00 PM
      Invited speakers
      • 8:00 PM
        Physics-inspired learning on graphs 1h
        Speaker: Michael Bronstein
    • 8:00 AM 9:00 AM
      Breakfast 1h
    • 9:00 AM 12:00 PM
      Invited speakers
      • 9:00 AM
        AI & material science 1h
        Speaker: Kostya Novoselov
      • 10:00 AM
        Coffee break 30m
      • 10:30 AM
        From AI in material science to HEP and back 45m
        Speaker: Andrey Ustyuzhanin
      • 11:15 AM
        Reduction and Closure of Dynamical Systems using Deep Learning 45m
        Speaker: Qianxiao Li
    • 12:00 PM 1:30 PM
      Lunch 1h 30m
    • 1:30 PM 4:30 PM
      Invited speakers
      • 1:30 PM
        Learning Image Representations Without Manual Annotations and Related Applications 1h 15m
        Speaker: Piotr Bojanowski
      • 2:45 PM
        Coffee break 30m
      • 3:15 PM
        Strong Lensing Data Analysis in the Era of Large Sky Surveys 1h 15m
        Speaker: Laurence Levasseur
    • 4:30 PM 7:00 PM
    • 7:00 PM 8:30 PM
      Dinner 1h 30m
    • 8:30 PM 9:30 PM
      Invited speakers
      • 8:30 PM
        Foundation models for science 1h
        Speaker: Shirley Ho
    • 8:00 AM 9:00 AM
      Breakfast 1h
    • 9:00 AM 12:00 PM
      Invited speakers
      • 9:00 AM
        Differentiable and Probabilistic Programming in Scientific Simulators 1h 15m
        Speaker: Atılım Güneş Baydin
      • 10:15 AM
        Coffee break 25m
      • 10:40 AM
        The Vision of End-to-End ML models in HEP 40m
        Speaker: Lukas Alexander Heinrich (Technische Universitat Munchen (DE))
      • 11:20 AM
        Toward Building Large HEP Models with Self-Supervised Learning 40m
        Speaker: Michael Kagan (SLAC National Accelerator Laboratory (US))
    • 12:00 PM 1:30 PM
      Lunch break 1h 30m
    • 1:30 PM 2:30 PM
      Industry-academia forum
      • 1:30 PM
        Industry-academia panel 1h
        Speakers: Adji Bousso Dieng, Jennifer Ngadiuba, Jesse Thaler, Michael Bronstein, Taco Cohen
    • 2:30 PM 7:00 PM
      Invited speakers
      • 2:30 PM
        Normalizing flows, diffusion and annealing importance sampling 1h 15m
        Speaker: Alexander Matthews
      • 3:45 PM
        Coffee break 30m
      • 4:15 PM
        Calibrated uncertainty quantification in simulator-based inference 1h 15m
        Speaker: Ann Lee
      • 5:30 PM
        Image Denoising - Not What You Think 1h
        Speaker: Michael Elad
    • 7:00 PM 8:30 PM
      Dinner 1h 30m
    • 8:30 PM 9:30 PM
      Invited speakers
      • 8:30 PM
        Simulation-based inference and the places it takes us 1h
        Speaker: Jakob Macke
    • 8:00 AM 9:00 AM
      Breakfast 1h
    • 9:00 AM 12:30 PM
      Invited speakers
      • 9:00 AM
        News from generative networks in forward and inverse simulations 1h 15m
        Speaker: Anja Butter
      • 10:15 AM
        Coffee break 30m
      • 10:45 AM
        Bigger data, shorter time: Real-time inference on specialised hardware for scientific discovery 1h 15m
        Speaker: Thea Aarrestad
      • 12:00 PM
        Aspects of Deep Learning in Particle Flow 30m
        Speaker: Eilam Gross
    • 12:30 PM 2:00 PM
      Lunch break 1h 30m
    • 2:00 PM 10:00 PM
      Excursion & social dinner 8h
    • 8:00 AM 9:00 AM
      Breakfast 1h
    • 9:00 AM 12:00 PM
      Invited speakers
      • 9:00 AM
        Open problems in generative models 1h 15m
        Speaker: Barnabas Poczos
      • 10:15 AM
        Coffee break 30m
      • 10:45 AM
        Generative Models - The Key to Manipulating Implicit Distribution for Bayesian Inference 1h 15m
        Speaker: François Lanusse
    • 12:00 PM 1:00 PM
      Conference synthesis 1h
      Speaker: Kyle Cranmer
    • 1:00 PM 2:30 PM
      Lunch break 1h 30m
    • 2:30 PM 6:00 PM
      Organized follow-up discussions 3h 30m