3rd IML Machine Learning Workshop

Name: 3rd IML Machine Learning Workshop
Start: 2019-04-15T09:00:00+02:00
End: 2019-04-18T20:00:00+02:00
Location: CERN

15 Apr 2019, 09:00 → 18 Apr 2019, 20:00 Europe/Zurich

500/1-001 - Main Auditorium (CERN)

400

David Rousseau (LAL-Orsay, FR), Lorenzo Moneta (CERN), Markus Stoye (CERN), Paul Seyfert (CERN), Rudiger Haake (Yale University (US)), Steven Randolph Schramm (Universite de Geneve (CH))

Description

This is the third annual workshop of the LPCC inter-experimental machine learning working group. As usual, it will take place at CERN, and everyone interested in ML for HEP is invited! While we encourage you to join us at CERN if possible to maximally benefit from this event, remote participation will be supported via the Vidyo and CERN webcast services.

The event will take place from April 15-18, 2019. There will be keynote talks from invited (non-HEP) ML experts, an industry session, hands-on tutorials, and of course, talks by people in HEP working on ML.

We encourage you to submit abstracts on your work! The deadline for abstract submission was March 8, 2019.

We look forward to welcoming you or hearing from you during the event!

Participants

430

Webcast

There is a live webcast for this event

Monday 15 April
- 09:00 → 10:50
  Invited keynote talks 500/1-001 - Main Auditorium
  
  Conveners: Paul Seyfert (CERN), Steven Randolph Schramm (Universite de Geneve (CH))
  - 09:00
    
    Welcome 20m
    
    Speaker: Steven Randolph Schramm (Universite de Geneve (CH))
    
    Welcome.pdf
  - 09:20
    
    Conceptual overview of ML in HEP 45m
    
    This talk is intended to provide a conceptual overview of how ML is used in particle physics, in order to provide a basic level of understanding for all attendees such that people can follow the subsequent talks. Detailed tutorials on the application of ML techniques will instead take place on Friday.
    
    Speaker: Dr Sergei Gleyzer (University of Florida (US))
    
    IML_2019.pptx
  - 10:05
    
    Future areas of focus for ML in particle physics 45m
    
    Speaker: Kyle Stuart Cranmer (New York University (US))
    
    IML-3rd-future-ML-physics.pdf
- 10:50 → 11:15
  
  Coffee break 25m 500/1-001 - Main Auditorium
  
- 11:15 → 12:15
  Joint lecture/seminar: IML + PHYSTAT + CERN-DS 500/1-001 - Main Auditorium
  
  Conveners: Louis Lyons (Imperial College (GB)), Olaf Behnke (Deutsches Elektronen-Synchrotron (DE))
  - 11:15
    
    On the Statistical Mechanics and Information Theory of Deep Learning for Particle Physicists 1h
    
    This is an introduction to Deep Learning, for Particle Physicists. These techniques can provide excellent performance for separating different categories of events, pattern recognition, etc. In order to achieve this it is highly desirable for users to understand the basic concepts underlying their operation. The Statistical Mechanics and Information Theory aspects of this will be addressed in this talk.
    
    Speaker: Prof. Naftali Tishby (Hebrew University of Jerusalem)
    
    CERN 1 - 2019 -Tishby.pdf
    
    CERN 1 - 2019 -Tishby.pptx
    
    DS seminar agenda link
    
    PHYSTAT website
    
    Prof. Dr. Tishby's website
- 12:30 → 14:00
  
  Lunch 1h 30m 500/1-001 - Main Auditorium
  
- 14:00 → 18:00
  Industry talks and panel 500/1-001 - Main Auditorium
  
  Conveners: David Rousseau (LAL-Orsay, FR), Markus Stoye (CERN)
  - 14:00
    
    Introduction 20m
    
    Speaker: Markus Stoye (CERN)
    
    Industry.pdf
  - 14:20
    
    Amazon AI: Tensor and Higher-Order Generalizations of the GSVD with Applications to Personalized Cancer Medicine 45m
    
    The number of high-dimensional datasets recording multiple aspects of interrelated phenomena is increasing in many areas, from medicine to finance. This drives the need for mathematical frameworks that can simultaneously identify the similar and dissimilar among multiple matrices and tensors, and thus create a single coherent model from the multiple datasets. The generalized singular value decomposition (GSVD) was formulated as such a comparative spectral decomposition of two column-matched but row-independent matrices. I will, first, define a higher-order GSVD (HO GSVD) and a tensor GSVD and prove that they extend almost all of the mathematical properties of the GSVD to multiple matrices and two tensors. Second, I will describe the development of a tensor HO GSVD for multiple tensors. Third, I will describe the use of these decompositions in the comparisons of cancer and normal genomes, where they uncover patterns that predict survival and response to treatment. The data had been publicly available for years, but the patterns remained unknown until the data were modeled by using the decompositions, illustrating their ability to find what other methods miss.
    
    Speaker: Dr Priya Ponnapalli (Amazon AI)
  - 15:05
    
    Google DeepMind: Compressing neural networks 45m
    
    Empirically it has been observed numerous times that trained neural networks often have high degrees of parameter redundancy. It remains an open theoretical question why this parameter redundancy cannot be reduced before training by using smaller neural networks. On the other hand, the recent scientific literature reports a plethora of practical methods to "compress" neural networks during or after training with (almost) arbitrarily small sacrifices to task performance while significantly reducing the computational demands of a neural network model. This talk gives an overview of the field of neural network compression methods and then introduces one family of approaches based on Bayesian neural networks. Some of the appealing theoretical properties of Bayesian approaches to neural network compression are discussed and practical implementations for modern neural network training are sketched. The talk concludes by discussing difficulties with neural network compression in practice and an outlook towards exploiting noise and redundancy in the data for more compute-efficient neural networks.
    
    Speaker: Dr Tim Genewein (DeepMind)
    
    CERN_IML_Workshop_talk_Genewein.pdf
  - 16:00
    
    Coffee break 30m
  - 16:30
    
    B12 Consulting: Big value out of small data 30m
    
    Machine Learning (and especially Deep Learning) algorithms often require large amounts of data to accomplish their tasks. However, a common problem when such approaches are applied in business contexts is that only relatively small datasets are initially accessible, leading to a fundamental question: how to apply ML tools when there is apparently not enough data available? In this talk, I will discuss why those "small data" projects are actually of high relevance and discuss three concrete strategies data scientists can utilise to alleviate data deficiency, along with their strengths and weaknesses. For each strategy, I will present a business case scenario from the portfolio of B12 Consulting, illustrating how to partially overcome analysis issues with insufficient data.
    
    Speaker: Dr Michel Herquet (B12)
  - 17:05
    
    Industry panel 55m
    
    Speakers: Dr Michel Herquet (B12), Dr Tim Genewein (DeepMind)
- 18:00 → 20:00
  
  Welcome reception 2h 500/1-001 - Main Auditorium
  
Tuesday 16 April
- 09:00 → 10:45
  Submitted contributions: Session 1 500/1-001 - Main Auditorium
  
  Conveners: Lorenzo Moneta (CERN), Paul Seyfert (CERN)
  - 09:00
    
    Daily announcements 5m
    
    pseyfert.pdf
    
    Recording
  - 09:05
    
    Machine Learning Uncertainties with Adversarial Neural Networks 20m
    
    Machine learning is a powerful tool to reveal and exploit correlations in a multi-dimensional parameter space. Making predictions from such correlations is a highly non-trivial task, in particular when the details of the underlying dynamics of a theoretical model are not fully understood. Uncertainties in the training data add to the complexity of performing machine learning tasks such as event classification. However, it has been shown that adversarial neural networks can be used to decorrelate the trained model from systematic uncertainties that affect the kinematics on an event-by-event basis. Here we show that this approach can be extended to theoretical uncertainties (e.g. renormalization and factorization scale uncertainties) that affect the event sample as a whole. The result of the adversarial training is a classifier that is insensitive to these uncertainties by having learned to avoid regions of phase space (or feature space) that are affected by the uncertainties. This paves the way to a more reliable event classification, as well as novel approaches to perform parameter fits of particle physics data. We demonstrate the benefits of the method explicitly in an example considering effective field theory extensions of Higgs boson production in association with jets.
    
    Speaker: Dr Peter Galler (University of Glasgow)
    
    galler_ML_Uncertainties.pdf
    
    Recording
  - 09:25
    
    Decoding Physics Information in DNNs 20m
    
    A more dedicated study of the information flow in DNNs will help us understand their behaviour and the deep connection between DNN models and the corresponding tasks. Taking into account our well-established (observable-based) physics analysis framework, we present a novel way to interpret DNN results for HEP, which not only gives a clear physics picture but also inspires interfaces with the theoretical foundation. Information captured by DNNs can thus be used as a fine-tailored general-purpose encoder. As a concrete example, we showcase using encoded information to help with physics searches at the LHC.
    
    Speaker: Taoli Cheng (University of Montreal)
    
    Decoding Physics Information in DNNs
    
    Recording
  - 09:45
    
    Learning Invariant Representations using Mutual Information Regularization 20m
    
    Invariance of learned representations of neural networks against certain sensitive attributes of the input data is a desirable trait in many modern-day applications of machine learning, such as precision measurements in experimental high-energy physics and enforcing algorithmic fairness in the social and financial domain. We present a method for enforcing this invariance through regularization of the mutual information between the target variable and the classifier output. Applications of the proposed technique to rare decay searches in experimental high-energy physics are presented, and demonstrate improvement in statistical significance over conventionally trained neural networks and classical machine learning techniques.
    
    Speaker: Mr Justin Tan (University of Melbourne)
    
    meetings.pdf
    
    Recording
  - 10:05
    
    Neural networks for the abstraction of the physical symmetries in the nature 20m
    
    Neural networks are powerful universal approximators of complicated patterns in large-scale data, and they drive the explosive development of AI through deep learning. However, in many cases, conventional neural networks are trained to a poor level of abstraction, so that the model's predictive power and generalizability can be quite unstable, depending on the quality and amount of the data used for training. In this presentation, we introduce a new neural network architecture with an improved capability to capture the key features and the physical laws hidden in data, in a mathematically more robust and simpler way. We demonstrate the performance of the new architecture with an application to high energy particle scattering processes at the LHC.
    
    Speaker: Wonsang Cho (Seoul National University)
    
    CALU-theNQ_3rd_IML_wscho.pdf
    
    Recording
  - 10:25
    
    Containers for Machine Learning in HEP 20m
    
    Physicists want to use modern open source machine learning tools developed by industry for machine learning projects and analyses in high energy physics. The software environment that a physicist prototypes, tests, and runs these projects in is ideally the same regardless of compute site (be it their laptop or on the GRID). However, historically it has been difficult to find compute sites that have both the desired hardware resources for machine learning (i.e. GPUs) and a compatible software environment for the project, resulting in suboptimal use of resources and wasted researcher time tuning their software requirements to the imposed constraints. Container technologies, such as Docker and Singularity, provide a scalable and robust solution to this problem.
    
    We present work by Heinrich demonstrating the use of containers to run analysis jobs in reproducible compute environments at GRID endpoints with GPU resources that support Singularity. We additionally present complementary work by Feickert that provides publicly available "base" Docker images of a HEP-oriented machine learning environment: the CentOS 7 file system with the ATLAS "standalone" analysis release AnalysisBase, HDF5 support and utilities, and modern Python 3 with pip and libraries such as NumPy, TensorFlow and uproot installed. We further present ongoing synergetic work to expand both these efforts.
    
    Speaker: Matthew Feickert (Southern Methodist University (US))
    
    Containers for ML in HEP talk
    
    Feickert_HEPML_containers_2019-04-16.pdf
    
    Recording
- 10:45 → 11:15
  
  Coffee break 30m 500/1-001 - Main Auditorium
  
- 11:15 → 12:30
  Invited plenary talk 500/1-001 - Main Auditorium
  
  - 11:15
    
    The information theory of Deep Learning 1h
    
    Speaker: Prof. Naftali Tishby (Hebrew University of Jerusalem)
    
    Recording
- 12:30 → 14:00
  
  Lunch 1h 30m 500/1-001 - Main Auditorium
  
- 13:59 → 15:00
  Q&A with Prof. Naftali Tishby 31/3-004 - IT Amphitheatre
  
  Convener: Louis Lyons (Imperial College (GB))
  - 14:15
    
    Q&A with Prof. Naftali Tishby 45m
    
    Speaker: Prof. Naftali Tishby (Hebrew University of Jerusalem)
- 14:00 → 16:00
  Submitted contributions: Session 2 500/1-001 - Main Auditorium
  
  Conveners: David Rousseau (LAL-Orsay, FR), Steven Randolph Schramm (Universite de Geneve (CH))
  - 14:00
    
    Novelty Detection Meets Collider Physics 30m
    
    Novelty detection is the machine learning task of recognizing data that belong to an unknown pattern. Complementary to supervised learning, it allows data to be analyzed model-independently. We demonstrate the potential role of novelty detection in collider physics, using an autoencoder-based deep neural network. Explicitly, we develop a set of density-based novelty evaluators, which are sensitive to the clustering of unknown-pattern testing data or new-physics signal events, for the design of detection algorithms. We also explore the influence of the known-pattern data fluctuations, arising from non-signal regions, on detection sensitivity. Strategies to address it are proposed. The algorithms are applied to detecting fermionic di-top partner and resonant di-top production at the LHC, and exotic Higgs decays of two specific modes at a future e+e− collider. With parton-level analysis, we conclude that the new-physics benchmarks can potentially be recognized with high efficiency.
    
    Speaker: Ms Ying-Ying Li (HKUST)
    
    NoveltyCERN_04162019.pdf
    
    Recording
  - 14:30
    
    Uncertain Networks 30m
    
    Machine learning methods are being increasingly and successfully applied to many different physics problems. However, currently uncertainties in machine learning methods are not modelled well, if at all. In this talk I will discuss how using Bayesian neural networks can give us a handle on uncertainties in machine learning. I will use tagging tops vs. QCD as an example of how these networks are competitive with other neural network taggers with the advantage of providing an event-by-event uncertainty on the classification. I will then further discuss how this uncertainty changes with experimental systematic effects, using pile-up and jet energy scale as examples.
    
    Speaker: Jennifer Thompson (ITP Heidelberg)
    
    bayesian_v2.pdf
    
    Recording
  - 15:00
    
    Exploring SMEFT in VH channel with Machine Learning 30m
    
    We use Machine Learning (ML) techniques to exploit kinematic information in VH, the production of a Higgs in association with a massive vector boson. We parametrize the effect of new physics in terms of the SMEFT framework. We find that the use of a shallow neural network allows us to dramatically increase the sensitivity to deviations in VH with respect to previous estimates. We also discuss the relation between the usual measures of performance in Machine Learning, such as AUC or accuracy, and the more apt measure of Asimov significance. This relation is particularly relevant when parametrizing systematic uncertainties. Our results show the potential of incorporating Machine Learning techniques into SMEFT studies using the current datasets.
    
    Speaker: Charanjit Kaur Khosa
    
    CKIMLtalk.pdf
    
    Recording
  - 15:30
    
    The Tracking Machine Learning challenge 30m
    
    At the HL-LHC, ATLAS and CMS will see proton bunch collisions with track multiplicities of up to 10,000 charged tracks per event. Algorithms need to be developed to cope with the increased combinatorial complexity. To engage the Computer Science community to contribute new ideas, we have organized a Tracking Machine Learning challenge (TrackML). Participants were provided with events containing 100k 3D points and asked to group the points into tracks; they were also given a 100 GB training dataset including the ground truth. The challenge was run in two phases. The first "Accuracy" phase ran on the Kaggle platform from May to August 2018; algorithms were judged only on a score related to the fraction of correctly assigned hits. The second "Throughput" phase ran from September 2018 to March 2019 on Codalab and required code submission; algorithms were then ranked by combining accuracy and speed. The first phase saw 653 participants, with the top performers using innovative approaches. The second phase has just finished and featured some astonishingly fast solutions. The talk will report on the first lessons from the challenge.
    
    Speaker: David Rousseau (LAL-Orsay, FR)
    
    Recording
    
    tr190416_davidRousseau_IML_TrackML_final.pdf
    
    tr190416_davidRousseau_IML_TrackML_final.pptx
- 16:00 → 16:30
  
  Coffee break 30m 500/1-001 - Main Auditorium
  
- 16:30 → 18:00
  CERN Colloquium: ML colloquium by Max Welling 500/1-001 - Main Auditorium
  
  - 16:30
    
    Gauge Fields in Deep Learning 1h 30m
    
    Invited colloquium by Prof. Dr. Max Welling
    
    Part of the IML workshop, but listed under the CERN colloquium category.
    
    Joint work with Taco Cohen, Maurice Weiler and Berkay Kicanaoglu
    
    ABSTRACT:
    Gauge field theory is the foundation of modern physics, including general relativity and the standard model of physics. It describes how a theory of physics should transform under symmetry transformations. For instance, in electrodynamics, electric forces may transform into magnetic forces if we transform a static observer to one that moves at constant speed. Similarly, in general relativity acceleration and gravity are equated to each other under symmetry transformations. Gauge fields also play a crucial role in modern quantum field theory and the standard model of physics, where they describe the forces between particles that transform into each other under (abstract) symmetry transformations.
    
    In this work we describe how the mathematics of gauge groups becomes inevitable when you are interested in deep learning on manifolds. Defining a convolution on a manifold involves transporting geometric objects such as feature vectors and kernels across the manifold, which due to curvature become path dependent. As such it becomes impossible to represent these objects in a global reference frame and one is forced to consider local frames. These reference frames are arbitrary and changing between them is called a (local) gauge transformation. Since we do not want our computations to depend on the specific choice of frames we are forced to consider equivariance of our convolutions under gauge transformations. These considerations result in the first fully general theory of deep learning on manifolds, with gauge equivariant convolutions as the necessary key ingredient.
    
    Link to the colloquium agenda
    
    Prof. Dr. Max Welling's homepage
Wednesday 17 April
- 09:00 → 12:30
  Submitted contributions: Session 3 500/1-001 - Main Auditorium
  
  Conveners: Lorenzo Moneta (CERN), Paul Seyfert (CERN)
  - 09:00
    
    Daily announcements 5m
    
    pseyfert.pdf
  - 09:05
    
    Applying Generative Models to Scientific Research 30m
    
    Surrogate generative models have demonstrated extraordinary progress in recent years. Although most applications are dedicated to image generation and similar commercial goals, this approach is also very promising for the natural sciences, especially for tasks like fast event simulation in HEP experiments. However, application of such generative models to scientific research implies specific requirements and expectations for these models. In the presentation, I'll discuss specific points which need attention when using generative models for scientific research. This includes ensuring that models satisfy different boundary conditions and match scientifically important but marginal statistics. We also need to establish procedures to evaluate the quality of a particular model, propagate model imperfections into systematic uncertainties of the final scientific result, and so on.
    
    Speaker: Fedor Ratnikov (Yandex School of Data Analysis (RU))
    
    GenerativeModels_IML3_190417.pdf
    
    Recording
  - 09:35
    
    DijetGAN: A Generative-Adversarial Network Approach for the Simulation of QCD Dijet Events at the LHC 20m
    
    In this talk, I will present a Generative-Adversarial Network (GAN) based on convolutional neural networks that is used to simulate the production of pairs of jets at the LHC. The GAN is trained on events generated using MadGraph5 + Pythia8, and Delphes3 fast detector simulation. A number of kinematic distributions both at Monte Carlo truth level and after the detector simulation can be reproduced by the generator network with a very good level of agreement.
    
    Speaker: Serena Palazzo (The University of Edinburgh (GB))
    
    3rd_IML_Serena_Palazzo.pdf
    
    Recording
  - 09:55
    
    High Granularity Calorimeter Simulation using Generative Adversarial Networks 20m
    
    High Energy Physics simulation typically involves the Monte Carlo method. Today more than 50% of WLCG resources are used for simulation, and this will increase further as detector granularity and luminosity increase. Machine learning has been very successful in the field of image recognition and generation. We have explored image generation techniques for speeding up HEP detector simulation. Calorimeter responses can be treated as images with energy deposition interpreted as "pixel intensities". One important difference is that pixel intensities usually cover a range of 0-255 while energy depositions can vary over many orders of magnitude. We have implemented a three-dimensional detector simulation tool using Generative Adversarial Networks. Our initial implementation could generate the detector response for different energies of the incoming particles at fixed angles in a 25x25x25 cell grid. We present an upgraded version able to simulate electron showers for variable angles of impact in addition to variable primary energies. The inclusion of angles required increasing the sample size in the transverse direction to 51x51x25 cells, multiplying the number of outputs by four. Due to the complexity of the task, the range of primary energies has initially been limited to 100-200 GeV. Training has been improved by raising cell energies to a power less than one. A check for the correct angle is added to the cost function, together with comparisons of the cell energy distribution. Currently, the accuracy of the result is a bit lower than for the fixed-angle version but still within 10% for relevant shower parameters.
    
    Speaker: Gul Rukh Khattak (University of Peshawar (PK))
    
    iml_2019_3dgan.pdf
    
    Recording
  - 10:15
    
    Deep generative models for fast shower simulation in ATLAS 20m
    
    The extensive physics program of the ATLAS experiment at the Large Hadron Collider (LHC) relies on large scale and high fidelity simulation of the detector response to particle interactions. Current full simulation techniques using Geant4 provide accurate modeling of the underlying physics processes, but are inherently resource intensive. In light of the high-luminosity upgrade of the LHC and the need for ever larger simulated datasets to support physics analysis, the development of new faster simulation techniques is crucial. Building on the recent success of deep learning algorithms, Variational Auto-Encoders and Generative Adversarial Networks are investigated for modeling the response of the ATLAS electromagnetic calorimeter for photons in a central calorimeter region over a range of energies. The properties of synthesized showers using deep neural networks are compared to showers from a full detector simulation using Geant4. With this feasibility study we demonstrate the potential of using such algorithms for fast calorimeter simulation for the ATLAS experiment in the future, complementing current simulation techniques.
    
    Speaker: Aishik Ghosh (Centre National de la Recherche Scientifique (FR))
    
    IMLApril2019.pdf
    
    Recording
  - 10:35
    
    Coffee break 30m
  - 11:05
    
    Fast Simulation Using Generative Adversarial Network in LHCB 20m
    
    LHCb is one of the major experiments operating at the Large Hadron Collider at CERN. The richness of the physics program and the increasing precision of the measurements in LHCb lead to the need for ever larger simulated samples. This need will increase further when the upgraded LHCb detector starts collecting data in LHC Run 3. Given the computing resources pledged for the production of Monte Carlo simulated events in the coming years, the use of fast simulation techniques will be mandatory to cope with the expected dataset size. In LHCb, generative models, which are nowadays widely used for computer vision and image processing, are being investigated in order to accelerate the generation of showers in the calorimeter and high-level responses of the Cherenkov detectors. We demonstrate that this approach provides high-fidelity results along with a significant speed increase and discuss possible implications of these results. We also present an implementation of this algorithm in the LHCb simulation software and validation tests.
    
    Speaker: Artem Maevskiy (National Research University Higher School of Economics (RU))
    
    FASTSIM-at-IML-v1.pdf
    
    Recording
  - 11:25
    
    Model-Assisted GANs for the optimisation of simulation parameters and as an algorithm for fast Monte Carlo production 20m
    
    We propose and demonstrate the use of a Model-Assisted Generative Adversarial Network to produce simulated images that accurately match true images through the variation of underlying model parameters that describe the image generation process. The generator learns the parameter values that give images that best match the true images. The best match parameter values that produce the most accurate simulated images can be extracted and used to re-tune the default simulation to minimise any bias when applying image recognition techniques to simulated and true images. In the case of a real-world experiment, the true data is replaced by experimental data with unknown true parameter values. The Model-Assisted Generative Adversarial Network uses a convolutional neural network to emulate the simulation for all parameter values that, when trained, can be used as a conditional generator for fast image production.
    
    Speaker: Mr Saul Alonso Monsalve (CERN)
    
    Recording
    
    sam_MAGAN_IML2019.pdf
    
    sam_MAGAN_IML2019.ppsx
  - 11:45
    
    Event Generation and Statistical Sampling with Deep Generative Models 20m
    
    We present a study of the generation of events from a physical process with generative deep learning. To simulate physical processes it is not only important to produce physical events, but also to produce the events with the right frequency of occurrence (density). We investigate the feasibility of learning the event generation and the frequency of occurrence with Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs) to produce events like Monte Carlo generators. We study three toy models from high energy physics, i.e. a simple two-body decay, and the processes $e^+e^-\to Z \to l^+l^-$ and $p p \to t\bar{t}$, including the decay of the top quarks and a simulation of the detector response. We show that GANs and the standard VAE do not produce the right distributions. By buffering density information of Monte Carlo events in latent space given the encoder of a VAE we are able to construct a prior for the sampling of new events from the decoder that yields distributions that are in very good agreement with real Monte Carlo events and are generated $\mathcal{O}(10^8)$ times faster. Applications of this work include generic density estimation and sampling, targeted event generation via a principal component analysis of encoded events in the latent space and the possibility to generate better random numbers for importance sampling, e.g. for the phase space integration of matrix elements in quantum perturbation theories. The method also allows event generators to be built directly from real data events.
    
    Speaker: Sydney Otten (Radboud Universiteit Nijmegen)
    
    IML_workshop_event_generation_Sydney_Otten.pdf
    
    IML_workshop_event_generation_Sydney_Otten.pptx
    
    Recording
  - 12:05
    
    LUMIN - a deep learning and data science ecosystem for high-energy physics 20m
    
    LUMIN aims to become a deep-learning and data-analysis ecosystem for High-Energy Physics, and perhaps other scientific domains in the future. Similar to Keras and fastai, it is a wrapper framework for a graph computation library (PyTorch), but includes many useful functions to handle domain-specific requirements and problems. It also intends to provide easy access to state-of-the-art methods, while remaining flexible enough for users to inherit from base classes and override methods to meet their own demands.
    
    In this talk I will be introducing the library, discussing some of its distinguishing characteristics, and going through an example workflow. There will also be a general invitation for people to test out the library and provide feedback, suggestions, or contributions.
    
    Speaker: Giles Chatham Strong (LIP Laboratorio de Instrumentacao e Fisica Experimental de Part)
    
    GS_IML_LUMIN.pdf
    
    Recording
- 12:30 → 14:00
  
  Lunch 1h 30m 500/1-001 - Main Auditorium
  
- 14:00 → 18:40
  Submitted contributions: Session 4 500/1-001 - Main Auditorium
  
  Conveners: Rudiger Haake (Yale University (US)), Steven Randolph Schramm (Universite de Geneve (CH))
  - 14:00
    
    A hybrid deep learning approach to vertexing 20m
    
    In the transition to Run 3 in 2021, LHCb will undergo a major luminosity upgrade, going from 1.1 to 5.6 expected visible Primary Vertices (PVs) per event, and will adopt a purely software trigger. This has fueled increased interest in alternative highly parallel and GPU-friendly algorithms for tracking and reconstruction. We will present a novel prototype algorithm for vertexing in the LHCb upgrade conditions.
    
    We use a custom kernel to transform the sparse 3D space of hits and tracks into a dense 1D dataset, and then apply Deep Learning techniques to find PV locations. By training networks on our kernels using several Convolutional Neural Network layers, we have achieved better than 90% efficiency with no more than 0.2 False Positives (FPs) per event. Beyond its physics performance, this algorithm also provides a rich collection of possibilities for visualization and study of 1D convolutional networks. We will discuss the design, performance, and future potential areas of improvement and study, such as possible ways to recover the full 3D vertex information.
    
    Speaker: Henry Fredrick Schreiner (University of Cincinnati (US))
    
    2019_IML_PvFinder.pdf
    
    Recording
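
The first step described in the vertexing abstract above, turning sparse track information into a dense 1D dataset, can be illustrated with a much simpler stand-in for the custom kernel: one Gaussian per track summed on a regular grid along the beam axis. The function names, the Gaussian shape and the threshold-based peak search (used here in place of the neural network) are assumptions made for this sketch.

```python
import math

def kde_histogram(track_z, track_sigma, z_min, z_max, n_bins):
    """Sum one Gaussian per track on a regular z grid (dense 1D representation)."""
    width = (z_max - z_min) / n_bins
    centers = [z_min + (i + 0.5) * width for i in range(n_bins)]
    density = [0.0] * n_bins
    for z0, sigma in zip(track_z, track_sigma):
        norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
        for i, z in enumerate(centers):
            density[i] += norm * math.exp(-0.5 * ((z - z0) / sigma) ** 2)
    return centers, density

def find_peaks(centers, density, threshold):
    """Return grid positions of local maxima above threshold (stand-in for the CNN stage)."""
    peaks = []
    for i in range(1, len(density) - 1):
        if (density[i] > threshold
                and density[i] >= density[i - 1]
                and density[i] > density[i + 1]):
            peaks.append(centers[i])
    return peaks

# Made-up tracks clustered around two vertex candidates.
z_values = [-10.2, -10.0, -9.7, 25.1, 24.8, 25.0]
centers, density = kde_histogram(z_values, [0.5] * 6, -50.0, 50.0, 200)
candidates = find_peaks(centers, density, threshold=0.5)
```

In the approach described in the abstract, a convolutional network is trained on the dense 1D input to predict PV locations; the simple peak search only shows why such an input is convenient.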
  - 14:20
    
    Feature ranking based on subtraction methods 20m
    
    The input variables of ML methods in physics analysis are often highly correlated, and figuring out which ones are the most important for the classification turns out to be a non-trivial task. We compare the standard TMVA method for ranking variables with several newly developed methods based on iterative removal, for the use case of a search for top-pair-associated Higgs production (ttH) in the Higgs to b-pair decay channel.
    
    Speaker: Paul Glaysher (DESY)
    
    featureRanking.pdf
    
    Recording
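
A generic form of ranking by iterative removal, as mentioned in the abstract above, is backward elimination: at each step the variable whose removal costs the least performance is dropped, and the removal order gives the ranking. The sketch below is not necessarily one of the methods presented in the talk; `toy_score` and the variable names are hypothetical and stand in for retraining a classifier and evaluating it (for example by its ROC AUC).

```python
def rank_by_iterative_removal(features, score):
    """Backward elimination: repeatedly drop the feature whose removal costs least.

    Returns the features ordered from most to least important.
    """
    remaining = list(features)
    removed = []
    while len(remaining) > 1:
        # Score the model with each candidate feature left out in turn.
        trial = {f: score([g for g in remaining if g != f]) for f in remaining}
        least_important = max(trial, key=trial.get)
        remaining.remove(least_important)
        removed.append(least_important)
    removed.append(remaining[0])
    return removed[::-1]

def toy_score(subset):
    """Hypothetical performance metric with two strongly correlated inputs."""
    s = set(subset)
    value = 0.5
    if "m_bb" in s:
        value += 0.30
    if "pt_b1" in s or "pt_b2" in s:
        value += 0.10  # either b-jet pT carries most of the shared information
    if "pt_b1" in s and "pt_b2" in s:
        value += 0.02  # the second one adds little on top
    if "n_jets" in s:
        value += 0.05
    return value

ranking = rank_by_iterative_removal(["m_bb", "pt_b1", "pt_b2", "n_jets"], toy_score)
```

With correlated inputs, one member of a correlated pair is removed early because the other still carries most of its information, which is the effect that makes ranking non-trivial.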
  - 14:40
    
    ML Techniques for heavy flavour identification in CMS 30m
    
    Jet flavour identification is a fundamental component for the physics program of the LHC-based experiments. The presence of multiple flavours to be identified leads to a multiclass classification problem. Moreover, the classification of boosted jets has acquired an increasing importance in the physics program of CMS. In this presentation we will present the performance on both simulated and real data of our latest resolved and boosted heavy flavour taggers as well as the future prospects for the evolution of these techniques and the technical strategies adopted to deploy them in the harsh computing environment of a large-scale HEP computing software stack.
    
    Speaker: Emil Sorensen Bols (Vrije Universiteit Brussel (BE))
    
    IML.pdf
    
    Recording
  - 15:10
    
    ParticleNet: Jet Tagging via Particle Clouds 20m
    
    How to represent a jet is at the core of machine learning on jet physics. Inspired by the notion of a point cloud, we propose a new approach that considers a jet as an unordered set of its constituent particles, effectively a "particle cloud". Such a particle cloud representation of jets is efficient in incorporating raw information of jets and also explicitly respects the permutation symmetry. Based on the particle cloud representation, we propose ParticleNet, a customized neural network architecture using Dynamic Graph CNN for jet tagging problems. The ParticleNet architecture achieves state-of-the-art performance on two representative jet tagging benchmarks and improves significantly over existing methods.
    
    Speaker: Huilin Qu (Univ. of California Santa Barbara (US))
    
    ParticleNet_IML_20190417_H_Qu.pdf
    
    Recording
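
The permutation symmetry mentioned in the ParticleNet abstract above can be made concrete with a toy edge convolution: each particle collects messages from its nearest neighbours, the messages are combined with a symmetric operation, and the result is pooled over the jet. The fixed weights, the single scalar feature per particle and the use of a mean for both aggregations are simplifications assumed for this sketch (the wrap-around in phi is ignored as well); it is not the ParticleNet architecture itself.

```python
import math
import random

def knn_indices(points, i, k):
    """Indices of the k nearest neighbours of particle i in the (eta, phi) plane."""
    eta_i, phi_i = points[i][0], points[i][1]
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: (points[j][0] - eta_i) ** 2 + (points[j][1] - phi_i) ** 2)
    return others[:k]

def edge_conv(points, k=3, w_self=0.7, w_diff=-0.4):
    """One toy edge convolution on (eta, phi, feature) tuples with fixed weights."""
    updated = []
    for i, (eta, phi, feat) in enumerate(points):
        messages = [
            math.tanh(w_self * feat + w_diff * (points[j][2] - feat))
            for j in knn_indices(points, i, k)
        ]
        # A symmetric aggregation over neighbours (mean here).
        updated.append((eta, phi, sum(messages) / len(messages)))
    return updated

def jet_score(points, k=3):
    """Global mean pooling gives one number per jet, independent of particle order."""
    updated = edge_conv(points, k)
    return sum(p[2] for p in updated) / len(updated)

rng = random.Random(0)
jet = [(rng.uniform(-0.4, 0.4), rng.uniform(-0.4, 0.4), rng.uniform(0.0, 3.0))
       for _ in range(8)]
shuffled = jet[:]
rng.shuffle(shuffled)
same = math.isclose(jet_score(jet), jet_score(shuffled), rel_tol=1e-9, abs_tol=1e-12)
```

In a dynamic graph network the neighbours in later layers are recomputed in the learned feature space rather than in the original coordinates; the sketch shows a single layer only.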
  - 15:30
    
    Learning representations of irregular particle-detector geometry with distance-weighted graph networks 20m
    
    We explore the possibility of using graph networks to deal with irregular-geometry detectors when reconstructing particles. Thanks to their representation-learning capabilities, graph networks can exploit the detector granularity, while dealing with the event sparsity and the irregular detector geometry. In this context, we introduce two distance-weighted graph network architectures, the GarNet and the GravNet layers and we apply them to a typical particle reconstruction task. As an example, we consider a high granularity calorimeter, loosely inspired by the endcap calorimeter to be installed in the CMS detector for the High-Luminosity LHC phase. We focus the study on the basis for calorimeter reconstruction, clustering, and provide a quantitative comparison to alternative approaches. The proposed methods outperform previous methods or reach competitive performance while keeping favourable computing-resource consumption. Being geometry agnostic, they can be easily generalized to other use cases and to other detectors, e.g., tracking in silicon detectors.
    
    Speaker: Jan Kieseler (CERN)
    
    caloGraph_IML.pdf
    
    Recording
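
The distance-weighted aggregation at the heart of the abstract above can be sketched in a few lines. For each detector hit, the features of its nearest neighbours are scaled by a weight that falls with distance and then combined with symmetric operations. The Gaussian weight `exp(-d^2)`, the choice of mean and maximum, and the function name are assumptions made for this illustration; in the actual layers the coordinates and features would be produced by trainable transformations of the hit data.

```python
import math

def distance_weighted_aggregate(coords, feats, k):
    """For each hit, return (mean, max) of neighbour features weighted by exp(-d^2)."""
    result = []
    for i, ci in enumerate(coords):
        # Squared distances to all other hits, keeping the k closest.
        nearest = sorted(
            (sum((a - b) ** 2 for a, b in zip(ci, cj)), j)
            for j, cj in enumerate(coords) if j != i
        )[:k]
        weighted = [math.exp(-d2) * feats[j] for d2, j in nearest]
        result.append((sum(weighted) / len(weighted), max(weighted)))
    return result

# Made-up hits: two close together and one far away.
hit_coords = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
hit_energy = [1.0, 2.0, 3.0]
aggregated = distance_weighted_aggregate(hit_coords, hit_energy, k=1)
```

Nothing in the aggregation refers to a regular grid, which is why such layers are described as geometry agnostic.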
  - 16:00
    
    Coffee break 30m
  - 16:30
    
    GroomRL: jet grooming through reinforcement learning 20m
    
    We introduce a novel implementation of a reinforcement learning algorithm which is adapted to the problem of jet grooming, a crucial component of jet physics at hadron colliders. We show that the grooming policies trained using a Deep Q-Network model outperform state-of-the-art tools used at the LHC such as Recursive Soft Drop, allowing for improved resolution of the mass of boosted objects. The algorithm learns how to optimally remove soft wide-angle radiation, allowing for a modular jet grooming tool that can be applied in a wide range of contexts.
    
    Speaker: Frederic Alexandre Dreyer (Oxford)
    
    Recording
    
    talk_imlws19.pdf
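
For context on the baseline named in the abstract above, the standard Soft Drop condition keeps a splitting only if the momentum fraction of the softer branch satisfies z > z_cut (ΔR / R0)^β; otherwise the softer branch is removed and the procedure continues along the harder one. A reinforcement-learning groomer replaces this fixed rule with a learned decision at each step. The sketch below covers only the fixed rule, and the list of splittings is a made-up input rather than the output of a real jet clustering.

```python
def soft_drop_keeps_both(pt1, pt2, delta_r, z_cut, beta, r0):
    """Soft Drop condition for a single splitting into branches pt1 and pt2."""
    z = min(pt1, pt2) / (pt1 + pt2)
    return z > z_cut * (delta_r / r0) ** beta

def soft_drop(splittings, z_cut, beta, r0):
    """Walk the splittings along the harder branch, widest angle first.

    Returns the index of the first splitting that passes, or None if all fail.
    """
    for i, (pt1, pt2, delta_r) in enumerate(splittings):
        if soft_drop_keeps_both(pt1, pt2, delta_r, z_cut, beta, r0):
            return i
    return None

# Hypothetical declustering sequence: (harder pT, softer pT, opening angle).
example = [(200.0, 5.0, 0.7), (120.0, 80.0, 0.3)]
first_kept = soft_drop(example, z_cut=0.1, beta=0.0, r0=0.8)
```

With β = 0 the threshold does not depend on the angle; a positive β relaxes the cut for collinear splittings.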
  - 16:50
    
    NeuralRinger: An Ensemble of Neural Networks Fed from Calorimeter Ring Sums for Triggering on Electrons 20m
    
    In 2017, the ATLAS experiment implemented an ensemble of neural networks (the NeuralRinger algorithm) dedicated to reducing the latency of the first, fast, online software (HLT) selection stage for electrons with transverse energy above 15 GeV. In order to minimize the effect of detector response and shower development fluctuations, and inspired by the ensemble of likelihood models currently operating in the offline and final HLT selections, the ensemble comprises Multi-Layer Perceptron (MLP) models tuned for pseudorapidity and transverse energy bins. The MLPs are fed with calorimetry information formatted into concentric ring energy sums, which are built around the particle axis and normalized by its total transverse energy. Although triggering algorithms are typically developed from offline models adapted to operate in stringent online conditions, the NeuralRinger development started from the online perspective. We describe the analysis performed during the NeuralRinger development in Run 2, the trigger commissioning and its online performance during the 2017 and 2018 data-taking. It is estimated that the NeuralRinger allowed a reduction of 25% in the e/$\gamma$ slice processing demands when considering its operation on top of all other improvements made in the electron chains. Statistical tests performed on crucial offline calorimeter response discriminant features show negligible impact (<1$\sigma$) on offline reconstruction.
    
    Speaker: Werner Spolidoro Freund (Federal University of Rio de Janeiro (BR))
    
    20190417_iml_trigegamma_neuralringer.pdf
    
    Recording
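
The ring-sum input described in the abstract above can be sketched as follows. Calorimeter cells are assigned to concentric rings around the particle axis, the transverse energy in each ring is summed, and the sums are normalized. The circular rings of fixed width, the normalization by the sum over the rings, and all names and numbers are assumptions made for this illustration; the ring geometry used in ATLAS may differ.

```python
import math

def ring_sums(cells, axis, ring_width, n_rings):
    """Sum cell transverse energies in concentric rings and normalize to unit total.

    cells: iterable of (eta, phi, et); axis: (eta, phi) of the particle axis.
    """
    eta0, phi0 = axis
    rings = [0.0] * n_rings
    for eta, phi, et in cells:
        # Wrap the azimuthal difference into [-pi, pi).
        dphi = (phi - phi0 + math.pi) % (2.0 * math.pi) - math.pi
        radius = math.hypot(eta - eta0, dphi)
        index = int(radius / ring_width)
        if index < n_rings:
            rings[index] += et
    total = sum(rings)
    return [r / total for r in rings] if total > 0 else rings

# Made-up cells around an axis at the origin.
example_cells = [(0.0, 0.0, 10.0), (0.03, 0.0, 4.0), (0.0, -0.03, 4.0), (0.06, 0.0, 2.0)]
normalized_rings = ring_sums(example_cells, (0.0, 0.0), ring_width=0.025, n_rings=3)
```

The resulting fixed-length vector describes the lateral shower shape largely independently of the energy scale, which makes it a compact input for a small MLP.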
  - 17:10
    
    Fast Deep Learning on FPGAs for the Phase-II L0 Muon Barrel Trigger of the ATLAS Experiment 20m
    
    The Level-0 Muon Trigger system of the ATLAS experiment will undergo a full upgrade for the HL-LHC to meet the challenging performance required by the increasing instantaneous luminosity. The upgraded trigger system foresees sending RPC raw hit data to the off-detector trigger processors, where the trigger algorithms run on a new generation of Field-Programmable Gate Arrays (FPGAs). The FPGA represents an optimal solution in this context because of its flexibility, wide availability of logical resources and high processing speed. Studies and simulations of different trigger algorithms have been performed, and novel low-precision deep neural network architectures (based on ternary dense and convolutional networks) optimized to run on FPGAs and to cope with sparse data are presented. Both the physics performance, in terms of efficiency and fake rates, and the FPGA logic resource occupancy and timing obtained with the developed algorithms are presented.
    
    Speaker: Luigi Sabetta (Sapienza Universita e INFN, Roma I (IT))
    
    IML_2019.pdf
    
    Recording
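
The appeal of the ternary networks mentioned in the abstract above is that weights restricted to -1, 0 and +1 turn every multiplication in a dense layer into an addition, a subtraction or nothing at all, which maps well onto FPGA logic. The sketch below shows the idea with a user-chosen threshold; the threshold value and function names are assumptions, and the training procedure used by the authors is not reproduced.

```python
def ternarize(weights, threshold):
    """Map each real weight to -1, 0 or +1 using a symmetric threshold."""
    return [[(w > threshold) - (w < -threshold) for w in row] for row in weights]

def ternary_dense(inputs, ternary_weights):
    """Dense layer without multiplications: add, subtract or skip each input."""
    outputs = []
    for row in ternary_weights:
        acc = 0.0
        for x, t in zip(inputs, row):
            if t == 1:
                acc += x
            elif t == -1:
                acc -= x
        outputs.append(acc)
    return outputs

# Made-up weights for a layer with three inputs and two outputs.
real_weights = [[0.8, -0.05, -0.6], [0.1, 0.9, -0.2]]
ternary_weights = ternarize(real_weights, threshold=0.3)
activations = ternary_dense([2.0, 3.0, 5.0], ternary_weights)
```

Zero weights also skip the corresponding inputs entirely, which is helpful when the input data are sparse.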
  - 17:30
    
    Close-out 10m
    
    Speaker: Steven Randolph Schramm (Universite de Geneve (CH))
    
    ClosingRemarks.pdf
    
    Recording
Thursday 18 April
- 09:00 → 13:10
  Tutorials: Part 1 500/1-001 - Main Auditorium
  
  Conveners: Lorenzo Moneta (CERN), Rudiger Haake (Yale University (US))
  - 09:00
    
    Introduction to the basics of deep learning 1h 30m
    
    Speaker: Yannik Alexander Rath (RWTH Aachen University (DE))
    
    GitHub repository with notebook
    
    Recording
  - 10:30
    
    Coffee break 30m
  - 11:00
    
    Physics inspired Autonomous Feature Engineering 1h
    
    Speaker: Marcel Rieger (RWTH Aachen University (DE))
    
    GitHub repository with notebooks
    
    Recording
  - 12:00
    
    Traditional approach to network architecture optimisation 30m
    
    Speaker: Andrey Ustyuzhanin (Yandex School of Data Analysis (RU))
    
    GitHub repository
    
    Recording
    
    Ustyuzhanin-NNOpt.pdf
  - 12:30
    
    Differentiable architecture search 40m
    
    Speaker: Andrey Ustyuzhanin (Yandex School of Data Analysis (RU))
    
    Recording
- 13:10 → 15:00
  
  Lunch 1h 50m 500/1-001 - Main Auditorium
  
- 15:00 → 18:30
  Tutorials: Part 2 500/1-001 - Main Auditorium
  
  Conveners: David Rousseau (LAL-Orsay, FR), Rudiger Haake (Yale University (US))
  - 15:00
    
    Bayesian approach to network design 30m
    
    Speaker: Andrey Ustyuzhanin (Yandex School of Data Analysis (RU))
    
    Recording
  - 15:30
    
    Bayesian dropout explanation and examples 40m
    
    Speaker: Andrey Ustyuzhanin (Yandex School of Data Analysis (RU))
    
    Recording
  - 16:15
    
    Coffee break 30m
  - 16:45
    
    Advanced Generative Adversarial Network Techniques 1h
    
    Speaker: Jonas Glombitza (Rheinisch-Westfaelische Tech. Hoch. (DE))
    
    GAN_advanced_techniques.pdf
    
    Recording
