Hammers & Nails 2023 - Swiss Edition

Europe/Zurich
Congressi Stefano Franscini (CSF)

Monte Verità, Ascona, Switzerland
Description

Frontiers in Machine Learning in Cosmology, Astro & Particle Physics

October 29 – November 3, 2023  |  Conference center Congressi Stefano Franscini (CSF) in Monte Verità, Ascona, Switzerland
 

The Swiss Edition of Hammers & Nails in 2023 follows the success of the 2017, 2019 and 2022 Hammers & Nails workshops at the Weizmann Institute of Science, Israel.

Cosmology, astro, and particle physics are constantly pushing forward the boundary of human knowledge further into the previously Unknown, fueled by open mysteries such as dark matter, dark energy or quantum gravity: simulating the Universe, simulating trillions of LHC particle collisions, searching for feeble anomalous signals in a deluge of data or inferring the underlying theory of nature by use of data which has been convolved with complex detector responses.  The machine learning hammer has already proven itself useful to decipher our conversation with the Unknown.

What is holding us back and where is cutting-edge machine learning expected to triumph over conventional methods?  This workshop will be an essential moment to explore open questions, foster new collaborations and shape the direction of ML design and application in these domains.  
An overarching theme is unsupervised and generative models, which have excelled recently thanks to the success of transformers, diffusion models and foundation models. Other success stories include simulation-based inference, optimal transport, active learning and anomaly detection. The community has also taken inspiration from the rise of machine learning in many other domains, such as molecular dynamics.

The trademark of Hammers & Nails is an informal atmosphere with open-ended lectures spanning academia and industry, and a stage for early-career scientists, with time for free discussion and collaboration.

Participation is by invitation. Limited admission is available through submission of an abstract and a brainstorming idea, with a focus on early-career scientists.

Confirmed invited speakers and panelists:

  • Thea Aarrestad (ETH Zürich)
  • Piotr Bojanowski (Meta AI)
  • Michael Bronstein (University of Oxford | Twitter)
  • Taco Cohen (Qualcomm AI Research)
  • Kyle Cranmer (University of Wisconsin-Madison)
  • Michael Elad (Technion)
  • Eilam Gross (Weizmann Institute of Science)
  • Loukas Gouskos (CERN)
  • Lukas Heinrich (Technical University of Munich)
  • Shirley Ho (Center for Computational Astrophysics at Flatiron Institute)
  • Michael Kagan (SLAC)
  • Francois Lanusse (CNRS)
  • Ann Lee (Carnegie Mellon University)
  • Laurence Levasseur (University of Montréal | Mila)
  • Qianxiao Li (National University of Singapore)
  • Jakob Macke (Tübingen University)
  • Alexander G. D. G. Matthews (Google DeepMind)
  • Jennifer Ngadiuba (Fermilab)
  • Kostya Novoselov (University of Singapore | Nobel laureate)
  • Mariel Pettee (Berkeley Lab)
  • Barnabas Poczos (Carnegie Mellon University)
  • Jesse Thaler (MIT)
  • Andrey Ustyuzhanin (Higher School of Economics)
  • Ramon Winterhalder (UC Louvain)

 

Scientific Organizing Committee:

  • Tobias Golling (University of Geneva)
  • Danilo Rezende (Google DeepMind)
  • Robert Feldmann (University of Zurich)
  • Slava Voloshynovskiy (University of Geneva)
  • Eilam Gross (Weizmann Institute of Science)
  • Kyle Cranmer (University of Wisconsin-Madison)
  • Ann Lee (Carnegie Mellon University)
  • Maurizio Pierini (CERN)
  • Shirley Ho (Center for Computational Astrophysics at Flatiron Institute)
  • Tilman Plehn (University of Heidelberg)
  • Elena Gavagnin (Zurich University of Applied Sciences)
  • Peter Battaglia (Google DeepMind)

 

https://www.epj.org/

Participants
  • Adrian Bayer
  • Alex Matthews
  • Alexander Shmakov
  • Andrey Ustyuzhanin
  • Anja Butter
  • Barnabas Poczos
  • Benedikt Maier
  • Benjamin Remy
  • Bruno Régaldo-Saint Blancard
  • Christian Kragh Jespersen
  • David Heurtel-Depeiges
  • Debajyoti Sengupta
  • Dmitrii Kobylianskii
  • Eilam Gross
  • Elena Gavagnin
  • Etienne Dreyer
  • Franco Terranova
  • François Lanusse
  • Garrett Merz
  • Guillaume Quétant
  • Ivan Oleksiyuk
  • Jakob Macke
  • Jeffrey Krupa
  • Jennifer Ngadiuba
  • Jesse Thaler
  • Justine Zeghal
  • Kinga Anna Wozniak
  • Laurence Levasseur
  • Li Qianxiao
  • Louis Lyons
  • Loukas Gouskos
  • Lucrezia Rambelli
  • Lukas Golino
  • Malte Algren
  • Mariel Pettee
  • Mariia Drozdova
  • Matthew Leigh
  • Michael Elad
  • Michael Kagan
  • Moritz Scham
  • Nathalie Soybelman
  • Nilotpal Kakati
  • Philipp Denzel
  • Pratik Jawahar
  • Radha Mastandrea
  • Ramon Winterhalder
  • Robert Feldmann
  • Samuel Klein
  • Shirley Ho
  • Simon Schnake
  • Svyatoslav (Slava) Voloshynovskyy
  • Taco Cohen
  • Thea Aarrestad
  • Theo Heimel
  • Tilman Plehn
  • Tobias Golling
  • Tomke Schroer
  • Verena Kain
  • Vitaliy Kinakh
    • 6:00 PM 7:00 PM
      Registration & Reception 1h
    • 7:00 PM 8:30 PM
      Dinner 1h 30m
    • 8:00 AM 9:00 AM
      Breakfast 1h
    • 9:00 AM 9:30 AM
      Introduction & welcome 30m
      Speaker: Tobias Golling (Universite de Geneve (CH))
    • 9:30 AM 12:15 PM
      Invited speakers
      • 9:30 AM
        Highlights of machine learning in particle physics for computer scientists 1h
        Speaker: Loukas Gouskos (CERN)
      • 10:30 AM
        Coffee break 30m
      • 11:00 AM
        Interdisciplinary Machine Learning for Fundamental Physics 1h 15m
        Speaker: Mariel Pettee (Lawrence Berkeley National Lab. (US))
    • 12:15 PM 2:00 PM
      Lunch break 1h 45m
    • 2:00 PM 3:15 PM
      Invited speakers
      • 2:00 PM
        Geometric Algebra Transformers: A Universal Architecture of Geometric Data 1h 15m
        Speaker: Taco Cohen
    • 3:15 PM 4:00 PM
      Coffee break 45m
    • 4:00 PM 6:00 PM
      Young Scientist Forum
      • 4:00 PM
        End-To-End Latent Variational Diffusion Models for Unfolding LHC Events. 10m

        High-energy collisions at the Large Hadron Collider (LHC) provide valuable insights into open questions in particle physics. However, detector effects must be corrected before measurements can be compared to certain theoretical predictions or measurements from other detectors. Methods to solve this inverse problem of mapping detector observations to theoretical quantities of the underlying collision, referred to as unfolding, are essential parts of many physics analyses at the LHC. We investigate and compare various generative deep learning methods for unfolding at parton level. We introduce a novel unified architecture, termed latent variational diffusion models, which combines the latent learning of cutting-edge generative art approaches with an end-to-end variational framework. We demonstrate the effectiveness of this approach for reconstructing global distributions of theoretical kinematic quantities, as well as for ensuring the adherence of the learned posterior distributions to known physics constraints. Our unified approach improves the reconstruction of parton-level kinematics as measured by several distribution-free metrics.

        Speaker: Alexander Shmakov (University of California Irvine (US))
      • 4:10 PM
        PC-Droid: Jet generation with diffusion 10m

        Building on the success of PC-JeDi we introduce PC-Droid, a substantially improved diffusion model for the generation of jet particle clouds. By leveraging a new diffusion formulation, studying more recent integration solvers, and training on all jet types simultaneously, we are able to achieve state-of-the-art performance for all types of jets across all evaluation metrics. We study the trade-off between generation speed and quality by comparing two attention-based architectures, as well as the potential of consistency distillation to reduce the number of diffusion steps. Both the faster architecture and consistency models demonstrate performance surpassing many competing models, with generation time up to two orders of magnitude faster than PC-JeDi and three orders of magnitude faster than Delphes.

        Speaker: Mr Matthew Leigh (University of Geneva)
      • 4:20 PM
        Drapes: Diffusion for weak supervision 10m

        We employ the diffusion framework to generate background-enriched templates for use in a downstream anomaly detection task (generally with CWoLa). We show how Drapes encompasses all modes of template generation common in the literature, and demonstrate state-of-the-art performance on the public LHCO R&D dataset.

        Speaker: Debajyoti Sengupta (Universite de Geneve (CH))
      • 4:30 PM
        Self-supervised learning of jets using a realistic detector simulation 10m

        Self-supervised learning (SSL) is a technique to obtain descriptive representations of data in a pretext task based on unlabeled input. Despite being well established in fields such as natural language processing and computer vision, SSL applications in high energy physics (HEP) have only just begun to be explored. Further research into SSL in the context of HEP is especially motivated given the potential to leverage enormous datasets collected by LHC experiments for training without labels. We demonstrate an SSL model of jet representations and its ability to express both global information and jet substructure. Furthermore, we investigate how SSL representations derived from low-level detector features can be used to search for exotic or anomalous jets in a largely unsupervised way. Going beyond the few existing studies in this direction, we conduct our studies using a realistic, state-of-the-art calorimeter simulation, such that our results are representative of possible future applications at collider experiments.

        Speakers: Dmitrii Kobylianskii (Weizmann Institute of Science (IL)), Etienne Dreyer (Weizmann Institute of Science (IL)), Nathalie Soybelman (Weizmann Institute of Science (IL)), Nilotpal Kakati (Weizmann Institute of Science (IL)), Patrick Rieck (New York University (US))
      • 4:40 PM
        De-noising Graph Super-Resolution with Diffusion Models and Transformers 10m

        Accurate reconstruction of particles from detector data forms the core problem in experimental particle physics. The spatial resolution of the detector, in particular the calorimeter granularity, is both influential in determining the quality of the reconstruction, and largely sets the upper limit for the algorithm's theoretical capabilities. To address these limitations, super-resolution techniques can offer a promising approach by enhancing low-resolution detector data to achieve higher resolution.

        In addition to image generation, diffusion models have demonstrated effectiveness in super-resolution tasks. Given its sparsity and non-homogeneity, calorimeter data can be most faithfully represented using graphs. Therefore, this study introduces a novel approach to graph super-resolution using diffusion and a transformer-based de-noising network. This work represents the first instance of applying graph super-resolution with diffusion. The low-resolution image, which corresponds to recorded detector data, is also subject to noise from various sources. As an added benefit, the proposed model aims to remove these noise artifacts, further contributing to improved particle reconstruction.

        Speaker: Nilotpal Kakati (Weizmann Institute of Science (IL))
      • 4:50 PM
        Field-Level Inference with Microcanonical Langevin Monte Carlo 10m

        Extracting optimal information from upcoming cosmological surveys is a pressing task, for which a promising path to success is performing field-level inference with differentiable forward modeling. A key computational challenge in this approach is that it requires sampling a high-dimensional parameter space. In this talk I will present a new promising method to sample such large parameter spaces, which improves upon the traditional Hamiltonian Monte Carlo, to both reconstruct the initial conditions of the Universe and obtain cosmological constraints.
        (Based on https://arxiv.org/abs/2307.09504 and further new results.)

        Speaker: Adrian Bayer (Princeton University / Simons Foundation)
      • 5:00 PM
        Novel Approaches for Fast Simulation in HEP using Diffusion and Graph-to-Graph Translation 10m

        The simulation of particle physics data is a fundamental but computationally intensive ingredient for physics analysis at the Large Hadron Collider. In traditional fast simulation schemes, a surrogate calorimeter model is the basis for a set of reconstructed particles. We demonstrate the feasibility of generating the reconstructed objects in one step, replacing both the calorimeter simulation and reconstruction step. Our previous model that employed slot attention achieved promising results on a simplified synthetic dataset. In this work, we propose two novel approaches to improve this task and evaluate them on a more realistic dataset. In the first approach, we augment the slot-attention mechanism with a state-of-the-art diffusion model, where we start with a noisy graph and perform gradual noise reduction by solving Stochastic Differential Equations with gradient approximation and obtain the reconstructed particles. The second approach incorporates iterative graph refinement, where we directly transform the set of truth particles into the set of reconstructed particles. These approaches are able to go beyond our previous baseline performance in terms of both accuracy and resolution of the predicted particle properties.

        Speaker: Dmitrii Kobylianskii (Weizmann Institute of Science (IL))
      • 5:10 PM
        Precision-Machine Learning for the Matrix Element Method 10m

        The matrix element method is the LHC inference method of choice for limited statistics. We present a dedicated machine learning framework, based on efficient phase-space integration, a learned acceptance and transfer function. It is based on a choice of INN and diffusion networks, and a transformer to solve jet combinatorics. We showcase this setup for the CP-phase of the top Yukawa coupling in associated Higgs and single-top production.

        Speaker: Theo Heimel (Heidelberg University)
      • 5:20 PM
        Galaxies and Graphs 10m

        Graph Neural Networks are the premier method for learning the physics of a given system, since abstracting physical systems as graphs fits naturally with common descriptions of those systems. I will show how the fundamental processes that shape galaxies and dark matter halos can be learned efficiently by embedding galaxies and halos on either temporal or spatial graphs. Learning the temporal co-evolution of galaxies and their dark matter halos allows us to connect one of the most successful modern astrophysical theories, Lambda-CDM, with the poorly understood processes that shape galaxies, opening new pathways for both understanding, and speeding up simulations by ~6 orders of magnitude. Learning the spatial correlations between galaxies and halos also offers important insights into galaxy evolution, and lends itself more easily to comparisons with observations.

        Since GNNs work well with Symbolic Regression, I will also show how low-dimensional, analytic laws of galaxy formation can be derived from these models.

        Speaker: Christian Kragh Jespersen (Princeton University)
      • 5:30 PM
        Simulation-based Self-supervised Learning (S3L) 10m

        Self-Supervised Learning (SSL) is at the core of training modern large ML models, providing a scheme for learning powerful representations in base models that can be used in a variety of downstream tasks. However, SSL training strategies must be adapted to the type of training data, thus driving the question: what are powerful SSL strategies for collider physics data? In the talk, we present a novel simulation-based SSL (S3L) strategy wherein we develop a method of “re-simulation” to drive data augmentation for contrastive learning. We show how an S3L-trained base model can learn powerful representations that can be used for downstream discrimination tasks and can help mitigate uncertainties.

        Speakers: Benedikt Maier (KIT - Karlsruhe Institute of Technology (DE)), Jeffrey Krupa (Massachusetts Institute of Technology)
      • 5:40 PM
        Decorrelation using Optimal Transport 10m

        We present a novel decorrelation method using Convex Neural Optimal Transport Solvers (Cnots) that decorrelates a continuous feature space against protected attributes with optimal transport. We demonstrate its performance in the context of jet classification in high energy physics, where classifier scores should be decorrelated from the mass of a jet.

        Speaker: Malte Algren (Universite de Geneve (CH))
      • 5:50 PM
        The Interplay of Machine Learning–based Resonant Anomaly Detection Methods 10m

        Machine learning–based anomaly detection (AD) methods are promising tools for extending the coverage of searches for physics beyond the Standard Model (BSM). One class of AD methods that has received significant attention is resonant anomaly detection, where the BSM is assumed to be localized in at least one known variable. While there have been many methods proposed to identify such a BSM signal that make use of simulated or detected data in different ways, there has not yet been a study of the methods' complementarity. To this end, we address two questions. First, in the absence of any signal, do different methods pick the same events as signal-like? If not, then we can significantly reduce the false-positive rate by comparing different methods on the same dataset. Second, if there is a signal, are different methods fully correlated? Even if their maximum performance is the same, since we do not know how much signal is present, it may be beneficial to combine approaches. Using the Large Hadron Collider (LHC) Olympics dataset, we provide quantitative answers to these questions. We find that there are significant gains possible by combining multiple methods, which will strengthen the search program at the LHC and beyond.

        Speaker: Radha Mastandrea (University of California, Berkeley)
    • 6:30 PM 8:00 PM
      Young Scientist Forum: Poster session (incl. pizza, beer & insert cards)
      • 6:35 PM
        Are Differentiable Simulators Beneficial for Cosmological Simulation-Based Inference? 20m

        In many physical problems, the marginal likelihood of the data given physical parameters is intractable, making the inference of these parameters from data challenging. Simulation-Based Inference has emerged as a rigorous solution for solving this inference problem requiring only a black-box simulator. An ongoing research direction is to improve sample efficiency, which is especially relevant in Cosmology where simulations are very costly.
        We present our work on leveraging automatic differentiability of simulators to help reduce the number of simulations, an idea introduced as Gold Mining (Brehmer et al. 2019). We have in particular developed Neural Posterior Estimation methods that can benefit from accessing the simulator’s gradients and find that this approach does help on some problems (e.g. Lotka-Volterra). We then further investigate the practical gains of Gold Mining on a simplified Cosmological inference problem emulating a weak lensing analysis of LSST-Y10 data. We show how the amount of additional information provided by the simulator’s gradients is problem-dependent and, alas, not always significant. In our cosmological problem, we find that while having access to a differentiable simulator does have some benefits (e.g. HMC sampling of the joint log-likelihood), it does not help SBI methods since gradient information is dominated by noise.

        Speaker: Justine Zeghal
      • 6:35 PM
        Cluster Scanning 20m

        We propose a new model-independent method for new physics searches called cluster scanning (CS). It uses the k-means algorithm to cluster events in the space of low-level event or jet observables, and separates potentially anomalous clusters, which form the anomaly-rich region, from the rest, which form the anomaly-poor region. The spectra of the invariant mass in these two regions are then used to determine whether a resonant signal is present. We apply this approach in a pseudo-analysis using the LHC Olympics R&D dataset and demonstrate the performance gains over methods based on the global n-parameter function fits commonly used in bump hunting. Emphasis is put on the speed and simplicity of the method.

        Speaker: Mr Ivan Oleksiyuk (UNIGE)
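The cluster scanning abstract above describes k-means clustering followed by a split into anomaly-rich and anomaly-poor regions whose mass spectra are compared. A minimal NumPy sketch of that flow on toy data might look as follows. The toy dataset, the mass window, the helper names `assign`, `kmeans` and `split_regions`, and in particular the rule used to pick anomaly-rich clusters are assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
import numpy as np

def assign(x, centers):
    """Label each row of x with the index of its nearest center."""
    dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dist.argmin(axis=1)

def kmeans(x, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per row of x."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = assign(x, centers)
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:  # leave an empty cluster's center where it is
                centers[j] = members.mean(axis=0)
    return assign(x, centers)

def split_regions(labels, mass, window, k, n_rich):
    """Hypothetical selection rule (an assumption, not from the abstract):
    rank clusters by the fraction of their events inside the mass window
    and call the top n_rich clusters 'anomaly-rich'."""
    in_window = (mass >= window[0]) & (mass < window[1])
    frac = np.zeros(k)
    for j in range(k):
        members = labels == j
        if members.any():
            frac[j] = in_window[members].mean()
    rich_clusters = np.argsort(frac)[-n_rich:]
    return np.isin(labels, rich_clusters)

# Toy data (assumed): smooth background plus a small resonant signal.
rng = np.random.default_rng(1)
bkg_x = rng.normal(0.0, 1.0, size=(5000, 2))
bkg_m = rng.exponential(2.0, size=5000) + 1.0
sig_x = rng.normal(3.0, 0.3, size=(200, 2))
sig_m = rng.normal(3.5, 0.1, size=200)
x = np.vstack([bkg_x, sig_x])
mass = np.concatenate([bkg_m, sig_m])

labels = kmeans(x, k=10)
is_rich = split_regions(labels, mass, window=(3.2, 3.8), k=10, n_rich=2)

# Invariant-mass spectra in the two regions.
bins = np.linspace(1.0, 8.0, 29)
rich_hist, _ = np.histogram(mass[is_rich], bins=bins)
poor_hist, _ = np.histogram(mass[~is_rich], bins=bins)
```

The two histograms play the role of the invariant-mass spectra in the two regions; a real analysis would then test the anomaly-rich spectrum for a localized excess relative to the anomaly-poor one.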
      • 6:35 PM
        Cosmic Perspectives: A Comparative Study of Image-to-Image Translation Methods with an Emphasis on Geometrical Alignment 20m

        Image-to-image translation is an important problem across various fields, including Cosmology and Astrophysics, where it can help unravel the mysteries of the universe. While many empirical approaches have been proposed to address this problem, they often lack a solid theoretical basis that could generalize them.

        In this work, we explore the image-to-image translation challenge, focusing on predicting future satellite Webb-images using existing Hubble-images. We benchmark several translation methods including Pix2Pix, CycleGAN, and the DDPM-model based Palette.

        We introduce 'Turbo,' a novel image-to-image translation framework that combines features from both paired and unpaired approaches. Turbo generalizes these methodologies by emphasizing the critical role of synchronization between image pairs in translation tasks.

        We propose a framework that leverages the stochasticity of the DDPM to measure uncertainty in image-to-image translation. This framework adds a layer of robustness and applicability, especially in the context of astronomical image-to-image translation.

        Our comparative analysis utilizes a comprehensive suite of metrics, including MSE, SSIM, PSNR, LPIPS, and FID, comparing the effectiveness and efficiency of our proposed methods.

        This study is a step forward in image-to-image translation, combining theory and practical uses. We improve computer vision methods and also advance the use of deep learning in Astrophysics.

        Speaker: Vitaliy Kinakh (University of Geneva)
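Of the metrics listed in the abstract above, MSE and PSNR have simple closed forms. The sketch below shows them in NumPy using the standard definitions; the function names and the `data_range` parameter are illustrative choices, not taken from the authors' code. SSIM, LPIPS and FID need windowed statistics or pretrained networks and are left out.

```python
import numpy as np

def mse(pred, target):
    """Mean squared error between two images of the same shape."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    return float(np.mean((pred - target) ** 2))

def psnr(pred, target, data_range=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(data_range^2 / MSE)."""
    err = mse(pred, target)
    if err == 0.0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(data_range ** 2 / err))

# Tiny example: a constant offset of 0.1 on a [0, 1] image.
target_img = np.zeros((4, 4))
pred_img = target_img + 0.1
example_mse = mse(pred_img, target_img)
example_psnr = psnr(pred_img, target_img, data_range=1.0)
```

Note that `data_range` must match the scaling of the images being compared; astronomical images with a large dynamic range are often rescaled before metrics like these are applied.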
      • 6:35 PM
        DeepTreeGAN: Fast Generation of High Dimensional Point Clouds 20m

        In High Energy Physics, detailed and time-consuming simulations are used for particle interactions with detectors. To bypass these simulations with a generative model, it needs to be able to generate large point clouds in a short time while correctly modeling complex dependencies between the particles.
        For non-sparse problems on a regular grid, such a model would usually use (De-)Convolution layers to up/down-scale the number of voxels.
        In this work, we present novel methods to up/down-scale point clouds. For the up-scaling, we propose the use of a feed-forward network to project each point to multiple. For the down-scaling, we propose a Message Passing Layer that connects a variable number of input points to a fixed number of trainable points.
        These operations allow us to construct a Graph GAN that is able to generate such point clouds in a tree-based manner. Particle showers are inherently tree-based processes, as each particle is produced by decays or detector interaction of a particle of the previous generation. We demonstrate the model's performance on the public JetNet and CaloChallenge datasets.

        Speaker: Mr Moritz Scham (Deutsches Elektronen-Synchrotron (DE))
      • 6:35 PM
        Diffusion-Based Separation of CMB and Dust Emission: Enabling Cosmological Inference 20m

        The quest for primordial B-modes in cosmic microwave background (CMB) observations requires a refined model of the Galactic dust foreground. We investigate diffusion-based models of the dust foreground and their interest for both component separation and cosmological inference. First, under the assumption of a Gaussian CMB with known cosmology, we show that diffusion models can be trained on examples of dust emission maps in such a way that their sampling process directly coincides with posterior sampling in the context of component separation. We illustrate this on additive mixtures of dust emission maps produced from a magnetohydrodynamic simulation, and simulated CMB maps for a standard cosmology. Usual summary statistics (pixel distribution, power spectrum, Minkowski functionals) of the components are well recovered by this process. Second, in a context where the CMB cosmology is unknown, we train a diffusion model enabling posterior sampling conditioned on arbitrary cosmologies. We finally describe two independent methods leveraging this model for cosmological inference. If proven successful in future work, these methods would allow cosmological inference from a mixture of dust emission and CMB assuming any kind of dust prior.

        Speakers: Mr David Heurtel-Depeiges (Flatiron Institute (CCM)), Dr Ruben Ohana (Flatiron Institute (CCM)), Bruno Régaldo-Saint Blancard (Flatiron Institute (CCM))
      • 6:35 PM
        Domain adaption between SKA radio mocks and cosmological simulations 20m

        In this work, we investigate state-of-the-art deep-learning techniques for domain transfer applied to astrophysical images of simulated galaxies. Our main objective is to infer astrophysical properties, including galactic dark matter distribution, from observational data, such as radio interferometry from the upcoming Square-Kilometer Array (SKA) Observatory.

        To achieve this, we leverage large-scale cosmological hydrodynamics simulations, like the IllustrisTNG suite which generates thousands of galaxy models comprising gas, stars, and dark matter based on first principles. We compile and project these simulations into a multi-class image dataset, which serves as training data for various machine learning models.

        Generative models, such as conditional GANs, denoising diffusion, and flow-based models, have demonstrated successful learning of high-level features in natural images. However, their effectiveness when trained on astrophysical data with a significantly larger dynamic range remains largely untested.

        Here, we report on our ongoing efforts to train and evaluate these deep learning models. Our findings will contribute to a better understanding of their performance in the context of domain transfer, the inference of astrophysical galaxy properties, and, ultimately, of the formation and evolution of galaxies.

        Speaker: Dr Philipp Denzel (Centre for Artificial Intelligence, ZHAW)
      • 6:35 PM
        Finding strong lens by combining DenseLens and segmentation 20m

        Detecting strong lenses in a large dataset such as Euclid is very challenging because the dataset is highly imbalanced. Existing CNN models produce a large number of false positives; for example, one strong lens candidate can be accompanied by hundreds of false positives in the final sample. To overcome this challenge, we have developed a novel ML pipeline called DenseLens, which consists of three components: a classification ensemble, a regression ensemble and a segmentation model. The classification ensemble is an ensemble of DenseNet CNNs that provides predictions in the range [0,1], and the regression ensemble rank-orders strong lenses based on Information Content, i.e., the higher the rank, the more visually convincing the candidate's features. Finally, we use the segmentation model to predict the source pixels of the rank-ordered images, and we use this additional information to classify whether the candidate is a strong lens or not. We applied this novel approach of combining different ML models to Kilo Degree Survey (KiDS) data and reduced the false positives by a large factor.

        Speaker: Bharath Chowdhary Nagam (Kapteyn Astronomical Institute)
      • 6:35 PM
        Learning the Reionization History from High-z Quasar Damping Wings with Simulation-based Inference 20m

        The damping wing signature of high-redshift quasars in the intergalactic medium (IGM) provides a unique way of probing the history of reionization. Next-generation surveys will collect a multitude of spectra that call for powerful statistical methods to constrain the underlying astrophysical parameters such as the global IGM neutral fraction as tightly as possible. Inferring these parameters from the observed spectra is challenging because non-Gaussian processes like IGM transmission causing the damping wing imprint make it impossible to write down the correct likelihood of the spectra.
        We will present a simulation-based HMC inference scheme based on realistic forward-modelling of high-redshift quasar spectra including IGM transmission and heteroscedastic observational noise. To this end, we train a normalizing flow as neural likelihood estimator as well as a binary classifier as likelihood ratio estimator and incorporate them into our fully differentiable JAX-based inference pipeline.
        We provide a reionization constraint forecast for Euclid by applying our procedure to a set of mock observational spectra resembling the distribution of Euclid quasars and realistic spectral noise. By inferring the IGM neutral fraction as a function of redshift, we show that our method can robustly constrain its evolution up to ~5% at all redshifts between 6 and 11.

        Speaker: Timo Kist (Leiden Observatory)
      • 6:35 PM
        Machine Learning based Compression of Scientific Data - the HEP Perspective 20m

        One common issue in both research and industry is the growing data volumes and thereby the ever-increasing need for more data storage. With experiments taking more complex data at higher rates, the data recorded is quickly outgrowing the storage capabilities [1]. Since the data formats used are already highly compressed, storage constraints would require more drastic measures such as more exclusive event selection where a large portion of the data is discarded, or lossy compression, where data can be compressed beyond traditional lossless techniques as a result of some loss in resolution.

        As a potential solution to tailored lossy compression, we present Baler - an interdisciplinary, open-source, open-access tool for machine learning-based data compression. The tool uses autoencoders trained to compress and decompress data based on learned correlations. Interesting caveats are presented between offline and online compression, with studies on ways to efficiently overfit the data in the former. We show that, for common observables in high energy physics, where the precision loss is tolerable, the high compression ratio allows for more data to be stored yielding greater statistical power.

        [1] - https://cerncourier.com/a/time-to-adapt-for-big-data/

        Speaker: Pratik Jawahar (University of Manchester (UK - ATLAS))
      • 6:35 PM
        Pay Attention to Mean Fields for Point Cloud Generation 20m

        The generation of collider data using machine learning has emerged as a prominent research topic in particle physics due to the increasing computational challenges associated with traditional Monte Carlo simulation methods, particularly for future colliders with higher luminosity. The representation of collider data as particle clouds is advantageous. The underlying physics provides knowledge about many complex correlations present in particle clouds. Since these can be calculated analytically, they are used to test whether a generative model accurately approximates and samples the underlying probability density, which is itself a challenging task to solve. Additionally, variable particle cloud sizes further exacerbate these difficulties, necessitating more sophisticated models. In this work, we propose a novel model that utilizes an attention-based aggregation mechanism to address these challenges. The model is trained in an adversarial training paradigm, ensuring that both the generator and critic exhibit permutation equivariance/invariance with respect to their input. A novel feature matching loss in the critic is introduced to stabilize the training. The proposed model performs competitively with the state of the art on the JetNet150 dataset whilst having significantly fewer parameters than other state-of-the-art models. The model is then also applied to the CaloChallenge datasets and the results are discussed.

        Speaker: Benno Kach (Deutsches Elektronen-Synchrotron (DE))
      • 6:35 PM
        Radio-astronomical Image Reconstruction with Conditional Denoising Diffusion Model 20m

        Reconstructing accurate sky models from dirty radio images is crucial for advancing high-redshift galaxy evolution studies, especially when using ALMA. Existing pipelines often employ CLEAN algorithms followed by source detection methods. However, these pipelines struggle in low-SNR scenarios and cannot be applied directly to idealized, noise-free sky models.

        We present a novel framework that uses stochastic neural networks for the direct reconstruction of sky models from dirty images. This approach not only enhances the accuracy of source localization and flux estimation but also integrates built-in uncertainty measures. Importantly, we introduce invertible normalization techniques specifically tailored for sky models and explore their impact.

        We validated our method on a dataset of ALMA images simulated with CASA. Source extraction from predicted sky models was performed using Photutils, and performance variations were assessed under different Precipitable Water Vapor (PWV) conditions.

        Our framework achieves 90% completeness in source representation at low SNR levels and accurately estimates fluxes in reconstructed sky models. While performance declines when testing and training PWV conditions differ, our method fills gaps unaddressed by existing pipelines such as CLEAN-based approaches.

        Speaker: Mariia Drozdova (Universite de Geneve (CH))
      • 6:35 PM
        Sampling high-dimensional inverse problem posteriors with neural score estimation 20m

        I will present a novel methodology to address many ill-posed inverse problems, by providing a description of the posterior distribution, which enables us to get point estimate solutions and to quantify their associated uncertainties. Our approach combines Neural Score Matching and a novel posterior sampling method based on an annealed HMC algorithm to sample the full high-dimensional posterior of our problem.

        In the astrophysical problem we address, measuring the lensing effect on a large number of galaxies makes it possible to reconstruct maps of the Dark Matter distribution. However, because of missing data and noise-dominated measurements, this constitutes a challenging ill-posed inverse problem.

        We propose to reformulate the problem in a Bayesian framework, where the target becomes the posterior distribution of mass given the galaxy shape observations. The likelihood factor, describing how light rays are bent by gravity, how measurements are affected by noise, and accounting for missing observational data, is fully described by a physical model. The prior factor, in turn, is learned over cosmological simulations using Neural Score Matching and takes into account theoretical knowledge. We are thus able to obtain samples from the full Bayesian posterior and can perform Dark Matter mass-map reconstruction alongside uncertainty quantification.
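
        The way a prior score and a likelihood score combine into a posterior sampler can be illustrated with a minimal sketch. It is a toy under stated assumptions, not the method of the talk: the problem is one-dimensional, both scores are analytic Gaussians rather than a learned network plus a lensing model, and plain unadjusted Langevin dynamics stands in for the annealed HMC sampler. All names (prior_score, likelihood_score, langevin_sample, SIGMA, Y_OBS) are hypothetical.

```python
import random
import statistics

# Toy 1D stand-in (assumption): prior x ~ N(0, 1), observation y = x + noise,
# noise ~ N(0, SIGMA^2). Both scores are analytic here; in the abstract the
# prior score would instead come from a trained neural network.
SIGMA = 1.0
Y_OBS = 2.0


def prior_score(x):
    """Gradient of log N(x; 0, 1) with respect to x."""
    return -x


def likelihood_score(x):
    """Gradient of log N(Y_OBS; x, SIGMA^2) with respect to x."""
    return (Y_OBS - x) / SIGMA**2


def langevin_sample(n_steps=200, step=0.05, rng=random):
    """Run one unadjusted Langevin chain and return its final state."""
    x = rng.gauss(0.0, 1.0)  # start from a prior draw
    for _ in range(n_steps):
        # By Bayes' rule the posterior score is the sum of the two scores.
        score = prior_score(x) + likelihood_score(x)
        x += step * score + (2.0 * step) ** 0.5 * rng.gauss(0.0, 1.0)
    return x


rng = random.Random(0)
samples = [langevin_sample(rng=rng) for _ in range(2000)]
posterior_mean = statistics.mean(samples)
posterior_var = statistics.pvariance(samples)
```

        For this toy the analytic posterior is a Gaussian with mean 1 and variance 0.5, so the sample mean and variance should land close to those values, up to Monte Carlo noise and the small bias introduced by the finite step size.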

        Speaker: Mr Benjamin Remy (CEA Saclay)
      • 6:35 PM
        The Calorimeter Pyramid: Rethinking the design of generative calorimeter shower models 20m

        The simulation of calorimeter showers is computationally intensive, leading to the development of generative models as substitutes. We propose a framework for designing generative models for calorimeter showers that combines the strengths of voxel and point cloud approaches to improve both accuracy and computational efficiency. Our approach employs a pyramid-shaped design, where the base of the pyramid encompasses all calorimeter cells. Each subsequent level corresponds to a pre-defined clustering of cells from the previous level, which aggregates their energy. The pyramid culminates in a single cell that contains the total energy of the shower. Within this hierarchical framework, each model learns to calculate the energy of the hit cells at the current level and determines which cells are hit on the lower level. Importantly, each model only focuses on the 'hit' cells at its level. The final models solely determine the energy of individual hit cells. To accommodate differences in the hit cell cardinality across levels, we introduce two new Set Normalizing Flows which utilize Set Transformers and Deep Sets. Moreover, we propose a newly designed dequantization technique tailored for learning boolean values. We validate the framework on multiple datasets, including CaloChallenge.
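
        The pyramid construction described above can be sketched in a few lines. This is only an illustration under assumptions: the calorimeter is flattened to a 1D list of cells, each level clusters a fixed number of consecutive cells, and the function names (build_pyramid, hit_masks) are hypothetical. The generative models that would be trained on each level are not shown.

```python
def build_pyramid(cells, group=2):
    """Return the levels from the base (all cells) up to the single top cell."""
    levels = [list(cells)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        # Each coarse cell aggregates the energy of `group` consecutive finer cells.
        coarse = [sum(prev[i:i + group]) for i in range(0, len(prev), group)]
        levels.append(coarse)
    return levels


def hit_masks(levels):
    """Mark a cell as hit when it holds non-zero energy."""
    return [[energy > 0 for energy in level] for level in levels]


# Base level: energies deposited in eight calorimeter cells (made-up numbers).
cells = [0.0, 1.5, 0.0, 0.0, 2.0, 0.5, 0.0, 0.0]
levels = build_pyramid(cells)
masks = hit_masks(levels)
```

        Here levels[-1] holds the total shower energy, and masks[k] tells a model at level k + 1 which of the finer cells below it are hit, so that only those cells need to be modelled further.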

        Speaker: Simon Schnake (DESY / RWTH Aachen University)
      • 6:35 PM
        Track finding and fitting with differentiable programming 20m

        The injection of physics principles for training machine learning algorithms is an active area of research and development within the particle physics community. In this contribution we present a novel methodology, based on differentiable programming tools, for pattern recognition and track fitting in muon chambers with high noise rates.

        The developed architecture centers on transformers that assess the probability that a hit belongs to a muon track. Concurrently, at each minimization step of the training, a differentiable track fit is executed, precisely constraining the selected hits to a helical trajectory.

        We showcase the dual impact of this approach, not only enhancing the model's performance but also contributing to the network's broader generalization.

        Speaker: Lucrezia Rambelli (University of Genova (IT))
      • 6:35 PM
        TURBO: The Swiss Knife of Auto-Encoders 20m

        In this study, we present a novel information-theoretic framework, termed TURBO, designed to systematically analyse and generalise auto-encoding methods. We examine the principles of information bottleneck and bottleneck-based networks in the auto-encoding setting and identify their inherent limitations, which become more prominent for data with multiple relevant, physics-related representations. The TURBO framework is introduced, its core concept being the maximisation of mutual information between various data representations expressed in two directions reflecting the information flows. We illustrate that numerous prevalent neural network models are encompassed within this framework. The study underscores the insufficiency of the information bottleneck concept in elucidating all such models, thereby establishing TURBO as a preferable theoretical reference. The introduction of TURBO contributes to a richer understanding of data representation and the structure of neural network models, enabling more efficient and versatile applications.

        Speaker: Guillaume Quétant (Université de Genève (CH))
      • 6:35 PM
        Unlocking Autonomous Telescopes through Reinforcement Learning: An Offline Framework and Insights from a Case Study 20m

        Optimizing observational astronomy campaigns is becoming a complex and expensive task for next-generation telescopes, where manual planning of observations tends to produce suboptimal schedules.
        Reinforcement Learning (RL) has been well-demonstrated as a valuable approach for training autonomous systems, and it may provide the basis for self-driving telescopes capable of scanning the sky and optimizing the scheduling for astronomy campaigns.
        We have developed a framework for optimizing telescope scheduling with RL techniques, based on a dataset describing a discrete set of sky locations that the telescope should visit, together with a reward metric. We compared several RL algorithms on an offline simulation dataset based at the Stone Edge Observatory, using “t-effective”, a measure of the quality of the data, as the reward metric.
        Deep Q-Networks (DQNs), belonging to the class of value-based methods, have shown remarkable success in the optimization of astronomical observations in our dataset. In the full environment, the average reward value in each state was found to be 92%±5% of the maximum possible reward, while on the test set it resulted in 87%±9% of the maximum possible reward.
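
        The value-based update that a DQN approximates with a neural network can be shown in tabular form. The sketch below is an assumption-laden toy, not the framework of the talk: states stand for sky locations, actions for the next location to visit, and the table, reward and hyperparameters (q_table, alpha, gamma) are made-up illustrative values.

```python
def td_update(q_table, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Apply one Q-learning update in place and return the TD target."""
    # Target: immediate reward plus discounted best value of the next state.
    target = reward + gamma * max(q_table[next_state])
    q_table[state][action] += alpha * (target - q_table[state][action])
    return target


def greedy_action(q_table, state):
    """Pick the action with the highest current value estimate."""
    values = q_table[state]
    return max(range(len(values)), key=values.__getitem__)


# Two sky locations (states) with two possible pointings (actions) each.
q_table = {0: [0.0, 0.0], 1: [1.0, 2.0]}
target = td_update(q_table, state=0, action=1, reward=0.5, next_state=1)
best = greedy_action(q_table, 0)
```

        In an offline setting such as the one described, updates of this kind would be driven by transitions read from the recorded dataset rather than by live interaction with the telescope.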

        Speaker: Franco Terranova (University of Pisa, Fermi National Accelerator Laboratory)
      • 6:35 PM
        Unsupervised and Weakly Supervised Machine Learning Enhanced Anomaly Detection in High Energy Physics 20m

        Unsupervised and weakly supervised techniques in machine learning can boost conventional methods for anomaly detection in HEP and open up a path for model-agnostic searches. Challenges posed by HEP data, including its voluminous nature and intricate structure, as well as insights drawn from the studies of manifold models are addressed.

        Speaker: Kinga Anna Wozniak (Universite de Geneve (CH))
      • 6:35 PM
        Using machine learning to detect antihydrogen in free fall 20m

        The properties of the hydrogen atom have played a central role in fundamental physics for the past 200 years. The CPT theorem, a cornerstone of the standard model, requires that hydrogen and antihydrogen ($\bar{H}$) have the same properties. The ALPHA antihydrogen experiment attempts to test this theory by measuring the fundamental properties of antihydrogen. We have previously measured the 1S-2S transition frequency, hyperfine structure, and other properties of the antihydrogen atom, and now set our sights on the effect of gravity on antimatter.
        To perform this measurement, a completely new 3 m tall apparatus (ALPHA-G) was built to observe antihydrogen in free fall. First measurements in this new machine were taken in 2022.
        To detect these rare particles, ALPHA makes use of several particle detector technologies, including a Silicon Vertex Detector (ALPHA-2) and a Time Projection Chamber (ALPHA-G). One of the key challenges for both detector systems is being able to distinguish between $\bar{H}$ annihilations and cosmic rays, a classification problem well suited for machine learning.
        Here we present the preliminary results of this “free fall” experiment in ALPHA-G and describe how machine learning is used to distinguish signal from background.

        Speaker: Lukas Golino (Swansea University (GB))
    • 8:00 PM 9:15 PM
      Invited speakers
      • 8:00 PM
        Physics-inspired learning on graphs 1h 15m
        Speaker: Michael Bronstein
    • 8:00 AM 9:00 AM
      Breakfast 1h
    • 9:00 AM 12:00 PM
      Invited speakers
    • 12:00 PM 1:30 PM
      Lunch 1h 30m
    • 4:30 PM 7:00 PM
      Brainstorming - CANCELLED
    • 7:00 PM 8:30 PM
      Dinner 1h 30m
    • 8:00 AM 9:00 AM
      Breakfast 1h
    • 9:00 AM 10:00 AM
      Young Scientist Forum
      • 9:00 AM
        How will AI enable autonomous particle accelerators? 15m

        The need for greater flexibility, faster turnaround times, reduced energy consumption and lower operational cost at maximum physics output, together with the sheer size of potential future accelerators such as the FCC, calls for new particle accelerator operational models with automation at the center. AI/ML is already playing a significant role in the accelerator domain with numerous applications in design, diagnostics and control. This contribution will define the building blocks for autonomous accelerators as discussed in the CERN accelerator sector initiative called the Efficiency Think Tank and outline where AI would need to be applied to reach near-full automation. Equipment design considerations, control system requirements and necessary software frameworks will be summarized. Finally, the remaining questions and challenges will be mentioned.

        Speaker: Verena Kain (CERN)
      • 9:15 AM
        Turbo-Sim Framework 15m
        Speaker: Slava Voloshinovskiy
      • 9:30 AM
        Accelerating graph-based tracking with symbolic regression 10m

        In high-energy physics experiments, tracking, the reconstruction of particle trajectories from hits in the inner detector, is a computationally intensive task due to the large combinatorics of detector signals. Recent efforts have proven that ML techniques can be successfully applied to the tracking problem, extending and improving the conventional methods based on feature engineering. However, inference with complex networks can be too slow for use in the trigger system. Quantising the network and deploying it on an FPGA is feasible but highly non-trivial. An efficient alternative is symbolic regression (SR), which has already proven effective in replacing a dense neural network for jet classification. We propose a novel approach that uses SR to replace a graph-based neural network. Using a simplified toy example, we substitute each network block with a symbolic function, preserving the graph structure of the data and enabling message passing. This approach significantly speeds up inference on a CPU without sacrificing much accuracy.

        Speaker: Nathalie Soybelman (Weizmann Institute of Science (IL))
      • 9:40 AM
        Masked particle modelling 10m

        The BERT pretraining paradigm has proven to be highly effective in many domains including natural language processing, image processing and biology. To apply the BERT paradigm, the data needs to be described as a set of tokens, and each token needs to be labelled. To date, the BERT paradigm has not been explored in the context of HEP. The samples that form the data used in HEP can be described as a set of particles (tokens) where each particle is represented as a continuous vector. We explore different approaches for discretising/labelling particles such that BERT pretraining can be performed, and demonstrate the utility of the resulting pretrained models on common downstream HEP tasks.
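
        One simple way to turn continuous particles into labelled tokens and then mask them is sketched below. It is only an illustrative assumption: the abstract does not say which discretisation is used, and per-feature binning, the bin edges, the MASK value and the function names (tokenize, mask_tokens) are all hypothetical.

```python
import random

MASK = -1  # placeholder id for a masked particle (hypothetical choice)


def tokenize(particle, edges):
    """Bin each continuous feature, then combine the bin indices into one token id."""
    token = 0
    for value, feature_edges in zip(particle, edges):
        bin_index = sum(value >= edge for edge in feature_edges)
        token = token * (len(feature_edges) + 1) + bin_index
    return token


def mask_tokens(tokens, fraction, rng):
    """Hide a fraction of the tokens and return the masked set with its targets."""
    n_mask = max(1, round(fraction * len(tokens)))
    positions = rng.sample(range(len(tokens)), n_mask)
    masked = [MASK if i in positions else t for i, t in enumerate(tokens)]
    targets = {i: tokens[i] for i in positions}
    return masked, targets


# Each particle is (pT, eta); the bin edges below are made-up illustrative values.
edges = [[10.0, 50.0], [-1.0, 0.0, 1.0]]
jet = [(30.0, 0.5), (5.0, -2.0), (100.0, 1.5)]
tokens = [tokenize(p, edges) for p in jet]
masked, targets = mask_tokens(tokens, fraction=0.15, rng=random.Random(0))
```

        With three pT bins and four eta bins the vocabulary has twelve tokens. A pretraining model would then be asked to predict the entries of targets from the unmasked particles in masked.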

        Speaker: Samuel Byrne Klein (Universite de Geneve (CH))
      • 9:50 AM
        Using transformers to calculate scattering amplitudes 10m

        We pursue the use of Transformers to compute scattering amplitudes in planar N = 4 super-Yang-Mills theory, a quantum field theory closely related to Quantum Chromodynamics (QCD). By expanding multiple polylogarithm functions in the Feynman integrals using the symbol map, we formulate scattering amplitudes in a language-based representation that is amenable to Transformer architectures and standard training objectives. We then show that an encoder-decoder Transformer can achieve high accuracy (> 98%) on two tasks in this representation: prediction of the integer coefficients of individual terms at a given loop order from the terms themselves, and prediction of coefficients at one loop order from a related subset of coefficients at a lower loop order. Finally, we explore interesting properties of the learning dynamics and representations learned by our model.

        Speaker: Garret Merz
    • 10:10 AM 10:30 AM
      Coffee break 20m
    • 10:30 AM 12:10 PM
      Invited speakers
      • 10:30 AM
        The Vision of End-to-End ML models in HEP 50m
        Speaker: Lukas Alexander Heinrich (Technische Universitat Munchen (DE))
      • 11:20 AM
        Toward Building Large HEP Models with Self-Supervised Learning 50m
        Speaker: Michael Kagan (SLAC National Accelerator Laboratory (US))
    • 12:10 PM 1:10 PM
      Lunch break 1h
    • 1:10 PM 1:11 PM
      Industry-academia forum - CANCELLED
      • 1:10 PM
        Industry-academia panel - CANCELLED 1m
        Speakers: Jennifer Ngadiuba, Jesse Thaler, Mariel Pettee (Lawrence Berkeley National Lab. (US)), Michael Kagan (SLAC National Accelerator Laboratory (US)), Taco Cohen
    • 1:15 PM 2:30 PM
      Invited speakers
    • 2:30 PM 7:00 PM
      Invited speakers
    • 7:00 PM 8:30 PM
      Dinner 1h 30m
    • 8:00 AM 9:00 AM
      Breakfast 1h
    • 9:00 AM 12:30 PM
      Invited speakers
      • 9:00 AM
        Machine learning for the LHC Simulation Chain 1h 15m
        Speaker: Ramon Winterhalder (UC Louvain)
      • 10:15 AM
        Coffee break 30m
      • 10:45 AM
        Bigger data, shorter time: Real-time inference on specialised hardware for scientific discovery 1h 15m
        Speaker: Thea Aarrestad (ETH Zurich (CH))
      • 12:00 PM
        Aspects of Deep Learning in Particle Flow 30m
        Speakers: Eilam Gross, Etienne Dreyer (Weizmann Institute of Science (IL))
    • 12:30 PM 12:31 PM
      Excursion & social dinner - CANCELLED 1m
    • 12:30 PM 2:00 PM
      Lunch break 1h 30m
    • 2:00 PM 6:15 PM
      Invited speakers
    • 8:00 AM 9:00 AM
      Breakfast 1h
    • 9:00 AM 12:00 PM
      Invited speakers
    • 12:00 PM 1:00 PM
      Conference synthesis 1h
      Speaker: Kyle Cranmer
    • 1:00 PM 1:05 PM
      Outro 5m
      Speaker: Tobias Golling (Universite de Geneve (CH))
    • 1:05 PM 2:35 PM
      Lunch break 1h 30m
    • 2:35 PM 6:05 PM
      Organized follow-up discussions 3h 30m