2nd IML Machine Learning Workshop
The Inter-experimental Machine Learning (IML) Working Group Workshop on Machine Learning will be held from April 9 to 11, 2018. There will also be a full-day hackathon on April 12.
-
-
9:00 AM
→
12:30 PM
Conveners: Markus Stoye (CERN), Steven Randolph Schramm (Universite de Geneve (CH))
-
9:00 AM
Welcome 20m
Speakers: Lorenzo Moneta (CERN), Markus Stoye (CERN), Paul Seyfert (CERN), Rudiger Haake (CERN), Steven Randolph Schramm (Universite de Geneve (CH))
-
9:20 AM
DarkMachines 10m
http://www.darkmachines.org/
Speaker: Tommaso Dorigo (Universita e INFN, Padova (IT)) -
9:35 AM
-
10:30 AM
Coffee break 30m
-
11:00 AM
DeepJet: a deep-learned multiclass jet-tagger for slim and fat jets 20m
We present a customized neural network architecture for both slim and fat jet tagging. It is based on the idea of keeping the concept of physics objects, such as particle-flow particles, as a core element of the network architecture. The deep learning algorithm works for most of the common jet classes, i.e. b, c, uds and gluon jets for slim jets and W, Z, H, QCD and top classes for fat jets. The developed architecture promises gains in performance, as shown in simulation by the CMS collaboration. The tagger is currently being tested on real data in the CMS experiment.
Speakers: Mauro Verzetti (CERN), Jan Kieseler (CERN), Markus Stoye (CERN), Huilin Qu (Univ. of California Santa Barbara (US)), Loukas Gouskos (Univ. of California Santa Barbara (US)) -
11:25 AM
HL-LHC tracking challenge 20m
At the HL-LHC, the seven-fold increase in multiplicity with respect to 2018 conditions poses a severe challenge to tracking in the ATLAS and CMS experiments. Both experiments are revamping their tracking detectors and optimizing their software. But are there new algorithms developed outside HEP that could be brought in, for example MCTS, LSTM, clustering, CNN, geometric deep learning and more?
We are organizing a data science competition on the Kaggle platform to stimulate both the ML and HEP communities to renew core tracking algorithms in preparation for the next generation of particle detectors at the LHC. In a nutshell: one event has 100,000 3D points; how can the points be associated with 10,000 unknown, approximately helicoidal trajectories while avoiding combinatorial explosion? You have a few seconds. But we do give you 100,000 events to train on.
We ran ttbar + 200 minimum-bias events through ACTS, a simplified (yet accurate) simulation of a generic LHC silicon detector, and wrote out the reconstructed hits with matching truth. We devised an accuracy metric that captures the quality of an algorithm (high efficiency/low fake rate) in a single number.
The challenge will run in two phases: the first on accuracy (with no stringent limit on CPU time), starting in April 2018, and the second (starting in summer 2018) on throughput at similar accuracy.
Speaker: Jean-Roch Vlimant (California Institute of Technology (US)) -
11:50 AM
Convolutional Neural Network for Track Seed Filtering at the CMS HLT 20m
The Large Hadron Collider (LHC) will constantly bring nominal luminosity increases, with the ultimate goal of reaching a peak luminosity of $5 \cdot 10^{34}\,\mathrm{cm}^{-2}\,\mathrm{s}^{-1}$ for the ATLAS and CMS experiments, planned for the High Luminosity LHC (HL-LHC) upgrade. This rise in luminosity will directly result in an increased number of simultaneous proton collisions (pileup), up to 200, which will pose new challenges for the CMS detector and, specifically, for track reconstruction in the Silicon Pixel Tracker.
One of the first steps of the track-finding workflow is the creation of track seeds, i.e. compatible pairs of hits from different detector layers, which are subsequently fed to higher-level pattern recognition steps. However, the set of compatible hit pairs is heavily affected by combinatorial background, so the next steps of the tracking algorithm must process a significant fraction of fake doublets.
A possible way of reducing this effect is to take into account the shape of the hit pixel cluster when checking the compatibility between two hits. Each doublet is assigned a pair of images built from the ADC levels of the pixels forming the hit clusters. The task of fake rejection can thus be seen as an image classification problem, for which Convolutional Neural Networks (CNNs) have been widely proven to provide reliable results.
In this work we present our studies on the application of CNNs to the filtering of pixel track seeds. We will show the results obtained for simulated events reconstructed in the CMS detector, focussing on the estimation of the efficiency and fake rejection performance of our CNN classifier.
Speaker: Mr Adriano Di Florio (Universita e INFN, Bari (IT))
-
9:00 AM
-
12:30 PM
→
2:00 PM
Lunch break 1h 30m
-
2:00 PM
→
6:30 PM
Conveners: Markus Stoye (CERN), Steven Randolph Schramm (Universite de Geneve (CH))
-
2:00 PM
Tutorial: TMVA 1h
Speaker: Lorenzo Moneta (CERN)
-
3:00 PM
Introduction to the industry session 10m
Speakers: Lorenzo Moneta (CERN), Markus Stoye (CERN), Paul Seyfert (CERN), Rudiger Haake (CERN), Steven Randolph Schramm (Universite de Geneve (CH))
-
3:10 PM
Multilevel Optimization for Generative Models, Games and Robotics 30m
Speaker: David Pfau (Google DeepMind)
-
3:50 PM
Machine Learning For Enterprises: Beyond Open Source 30m
Invited talk from IBM Analytics
Speaker: Jean-Francois Puget (IBM Analytics) -
4:30 PM
Coffee break 30m
-
5:00 PM
Surrogate Models for Fun and Profit 30m
Speaker: Andrey Ustyuzhanin (Yandex School of Data Analysis (RU))
-
5:40 PM
Industry panel 30m
Speakers: Andrey Ustyuzhanin (Yandex School of Data Analysis (RU)), David Pfau (Google DeepMind), Jean-Francois Puget (IBM Analytics)
-
2:00 PM
-
6:30 PM
→
8:30 PM
Welcome reception 2h 500/1-001 - Main Auditorium
Snacks and drinks are provided - come and socialize with fellow machine learning enthusiasts!
-
9:00 AM
→
12:30 PM
-
-
9:00 AM
→
12:30 PM
Conveners: Lorenzo Moneta (CERN), Rudiger Haake (CERN)
-
9:00 AM
Daily announcements 5m
Speakers: Lorenzo Moneta (CERN), Markus Stoye (CERN), Paul Seyfert (CERN), Rudiger Haake (CERN), Steven Randolph Schramm (Universite de Geneve (CH))
-
9:05 AM
Joint Wasserstein GAN contribution 45m
This is a merger of three individual contributions:
- https://indico.cern.ch/event/668017/contributions/2947026/
- https://indico.cern.ch/event/668017/contributions/2947027/
- https://indico.cern.ch/event/668017/contributions/2947028/
Speakers: David Josef Schmidt (Rheinisch Westfaelische Tech. Hoch. (DE)), Thorben Quast (Rheinisch Westfaelische Tech. Hoch. (DE)), Jonas Glombitza (Rheinisch-Westfaelische Tech. Hoch. (DE)) -
10:00 AM
Generative Models for Fast Cluster Simulations in the TPC for the ALICE Experiment 20m
Simulating detector response for the Monte Carlo-generated collisions is a key component of every high-energy physics experiment. The methods currently used for this purpose provide high-fidelity results, but their precision comes at the price of high computational cost. In this work, we present a proof-of-concept solution for simulating the responses of detector clusters to particle collisions, using the real-life example of the TPC detector in the ALICE experiment at CERN. An essential component of the proposed solution is a generative model that can simulate synthetic data points bearing high similarity to the real data. Leveraging recent advancements in machine learning, we propose to use state-of-the-art generative models, namely Variational Autoencoders (VAE) and Generative Adversarial Networks (GAN), that have proven their usefulness and efficiency in the context of computer vision and image processing.
The main advantage offered by these methods is a significant speed-up in execution time, reaching up to a factor of $10^3$ with respect to GEANT3. Nevertheless, this computational speed-up comes at the price of lower simulation quality, and in this work we show quantitative and qualitative evidence of these limitations of generative models. We also propose further steps that will improve the quality of the models and lead to their deployment in the production environment of the TPC detector.
Speaker: Kamil Rafal Deja (Warsaw University of Technology (PL)) -
10:30 AM
Coffee break 30m
-
11:00 AM
Adversarial Tuning of Perturbative Parameters in Non-Differentiable Physics Simulators 20m
In this contribution, we present a method for tuning perturbative parameters in Monte Carlo simulation using a classifier loss in high dimensions. We use an LSTM trained on the radiation pattern inside jets to learn the parameters of the final state shower in the Pythia Monte Carlo generator. This represents a step forward compared to unidimensional distributional template-matching methods.
Speaker: Michela Paganini (Yale University (US)) -
11:25 AM
Fast calorimeter simulation in LHCb 20m
In HEP experiments, the CPU resources required by MC simulations are constantly growing and have become a very large fraction of the total computing power (greater than 75%). At the same time, the pace of performance improvements from technology is slowing down, so the only solution is a more efficient use of resources. Efforts are ongoing in the LHC experiments to provide multiple options for simulating events faster when higher statistics are needed. A key to the success of this strategy is the possibility of enabling fast simulation options in a common framework with minimal action by the end user. In this talk we will describe the solution adopted in Gauss, the LHCb simulation software framework, to selectively exclude particles from being simulated by the Geant4 toolkit and to insert the corresponding hits generated in a faster way. The approach, integrated within the Geant4 toolkit, has been applied to the LHCb calorimeter, but it could also be used for other subdetectors. The hit generation can be carried out by any external tool, e.g. by a static library of showers or by more complex machine-learning techniques. In LHCb, generative models, which are nowadays widely used for computer vision and image processing, are being investigated in order to accelerate the generation of showers in the calorimeter. These models are based on maximizing the likelihood between reference samples and those produced by a generator. The two main approaches are Generative Adversarial Networks (GAN), which take into account an explicit description of the reference, and Variational Autoencoders (VAE), which use latent variables to describe it. We will present how the GAN approach can be applied to the LHCb calorimeter simulation, along with its advantages and drawbacks.
Speaker: Egor Zakharov -
11:50 AM
A Deep Learning tool for fast simulation 20m
Machine Learning techniques have been used in different applications by the HEP community: in this talk, we discuss the case of detector simulation. The need for simulated events, expected in the future for LHC experiments and their High Luminosity upgrades, is increasing dramatically and requires new fast simulation solutions. We will describe an R&D activity, aimed at providing a configurable tool capable of training a neural network to reproduce the detector response and replace standard Monte Carlo simulation. This represents a generic approach in the sense that such a network could be designed and trained to simulate any kind of detector and, eventually, the whole data processing chain in order to get, directly in one step, the final reconstructed quantities, in just a small fraction of the time. We will present the first application of three-dimensional convolutional Generative Adversarial Networks to the simulation of high granularity electromagnetic calorimeters. We will describe detailed validation studies comparing our results to standard Monte Carlo simulation, showing, in particular, the very good agreement we obtain for high level physics quantities and detailed calorimeter response. We will show the computing resources needed to train such networks and the implementation of a distributed adversarial training strategy (based on data parallelism). Finally we will discuss how we plan to generalize our model in order to simulate a whole class of calorimeters, opening the way to a generic machine learning based fast simulation approach.
Speaker: Gul Rukh Khattak (University of Peshawar (PK))
-
9:00 AM
-
12:30 PM
→
2:00 PM
Lunch break 1h 30m
-
2:00 PM
→
6:30 PM
Conveners: Lorenzo Moneta (CERN), Markus Stoye (CERN)
-
2:00 PM
Tutorial: Keras/TensorFlow 1h
Speaker: Stefan Wunsch (KIT - Karlsruhe Institute of Technology (DE))
-
3:00 PM
Interpreting Deep Neural Networks and their Predictions 1h
Invited talk, http://iphome.hhi.de/samek/
Speaker: Wojciech Samek (Fraunhofer HHI) -
4:00 PM
Coffee break 30m
-
4:30 PM
What is the machine learning. 20m
Applications of machine learning tools to problems of physical interest are often criticized for producing sensitivity at the expense of transparency. In this talk, I explore a procedure for identifying combinations of variables, aided by physical intuition, that can discriminate signal from background. Weights are introduced to smooth away the features in one or more given variables. New networks are then trained on this modified data. Observed decreases in sensitivity diagnose the variable's discriminating power. This procedure, called planing, also allows the investigation of the linear versus non-linear nature of the boundaries between signal and background. I will demonstrate these features in both an easy-to-understand toy model and an idealized LHC resonance scenario.
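The weighting step described in this abstract can be illustrated with a minimal sketch. It assumes a single variable and a histogram-based density estimate; the function name `planing_weights`, the bin count and the toy data are hypothetical choices for illustration, not the speaker's implementation. Each event receives a weight inversely proportional to the population of its bin, so the weighted distribution of that variable becomes flat.

```python
import numpy as np

def planing_weights(x, bins=20):
    """Per-event weights that flatten the histogram of x (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    edges = np.histogram_bin_edges(x, bins=bins)
    # Bin index of every event; the last bin includes its right edge.
    idx = np.clip(np.digitize(x, edges[1:-1]), 0, bins - 1)
    counts = np.bincount(idx, minlength=bins)
    weights = 1.0 / counts[idx]
    # Normalise so that the weights sum to the number of events.
    return weights * len(x) / weights.sum()

rng = np.random.default_rng(0)
signal_mass = rng.normal(loc=1.0, scale=0.1, size=5000)  # toy resonance peak
w = planing_weights(signal_mass)

# Weighted histogram: every occupied bin now carries the same total weight.
flat, _ = np.histogram(signal_mass, bins=20, weights=w)
```

In a full study the same weights would be computed separately for signal and background and passed as sample weights when training the new network.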
Speaker: Bryan Ostdiek (University of Oregon) -
4:55 PM
Identifying the relevant dependencies of the neural network response on characteristics of the input space 20m
The use of neural networks in physics analyses poses new challenges for the estimation of systematic uncertainties. Since the key to a proper estimation of uncertainties is the precise understanding of the algorithm, novel methods for the detailed study of the trained neural network are valuable.
This talk presents an approach to identify those characteristics of the neural network inputs that are most relevant for the response and therefore provides essential information to determine the systematic uncertainties.
Speaker: Stefan Wunsch (KIT - Karlsruhe Institute of Technology (DE)) -
5:20 PM
Studies to mitigate difference between real data and simulation for jet tagging 20m
The aim of the studies presented is to improve the performance of jet flavour tagging on real data while still exploiting a simulated dataset for the learning of the main classification task. In the presentation we explore “off the shelf” domain adaptation techniques as well as customised additions to them. The latter improves the calibration of the tagger, potentially leading to smaller systematic uncertainties. The studies are performed with simplified simulations for the case of b-jet tagging. The presentation will include first results as well as discuss pitfalls that we discovered during our research.
Speakers: Markus Stoye (CERN), Mauro Verzetti (CERN), Jan Kieseler (CERN), Arabella Martelli (Imperial College (GB)), Oliver Buchmuller (Imperial College (GB)) -
5:45 PM
Particle identification at LHCb: new calibration techniques and machine learning classification algorithms 20m
Particle identification (PID) plays a crucial role in LHCb analyses. Combining information from LHCb subdetectors allows one to distinguish between various species of long-lived charged and neutral particles. PID performance directly affects the sensitivity of most LHCb measurements. Advanced multivariate approaches are used at LHCb to obtain the best PID performance and control systematic uncertainties. This talk highlights recent developments in PID that use innovative machine learning techniques, as well as novel data-driven approaches which ensure that PID performance is well reproduced in simulation.
Speaker: Miriam Lucio Martinez (Universidade de Santiago de Compostela (ES))
-
2:00 PM
-
9:00 AM
→
12:30 PM
-
-
9:00 AM
→
12:30 PM
Conveners: Paul Seyfert (CERN), Rudiger Haake (CERN)
-
9:00 AM
Daily announcements 5m
Speakers: Lorenzo Moneta (CERN), Markus Stoye (CERN), Paul Seyfert (CERN), Rudiger Haake (CERN), Steven Randolph Schramm (Universite de Geneve (CH))
-
9:05 AM
Fisher information metrics for binary classifier evaluation and training 20m
Different evaluation metrics for binary classifiers are appropriate to different scientific domains and even to different problems within the same domain. This presentation focuses on the optimisation of event selection to minimise statistical errors in HEP parameter estimation, a problem that is best analysed in terms of the maximisation of Fisher information about the measured parameters. After describing a general formalism to derive evaluation metrics based on Fisher information, three more specific metrics are introduced for the measurements of signal cross sections in counting experiments (FIP1) or distribution fits (FIP2) and for the measurements of other parameters from distribution fits (FIP3). The FIP2 metric is particularly interesting because it can be derived from any ROC curve, provided that prevalence is also known. In addition to its relation to measurement errors when used as an evaluation criterion (which makes it more interesting than the ROC AUC), a further advantage of the FIP2 metric is that it can also be directly used for training decision trees (instead of the Shannon entropy or Gini coefficient). Preliminary results based on the Python sklearn framework are presented. The problem of overtraining for these classifiers is also briefly discussed, in terms of the difference of the FIP2 metric on the validation and training set, and of their difference from the theoretical limit. Finally, the expected Fisher information gain from completely random branch splits in the decision tree and its possible relevance in reducing overtraining is analysed.
Speaker: Andrea Valassi (CERN) -
9:30 AM
DeepJet: a portable ML environment for HEP 20m
In this presentation we will detail the evolution of the DeepJet Python environment. Initially envisaged to support the development of the namesake jet flavour tagger in CMS, DeepJet has grown to encompass multiple purposes within the collaboration. The presentation will describe the major features of the environment: simple out-of-memory training with a multi-threaded approach to maximally exploit hardware acceleration, simple and streamlined I/O to help bookkeeping of the developments, and finally Docker image distribution, to simplify the deployment of the whole ecosystem on multiple datacenters. The talk will also cover future development, mainly aimed at improving user experience.
Speakers: Swapneel Sundeep Mehta (Dwarkadas J Sanghvi College of Engineering (IN)), Mr swapneel mehta (IT/DB Group) -
9:55 AM
Event Categorization using Deep Neural Networks for ttH (H→bb) with the CMS Experiment 20m
The analysis of top-quark pair associated Higgs boson production enables a direct measurement of the top-Higgs Yukawa coupling. In ttH (H→bb) analyses, multiple event categories are commonly used in order to simultaneously constrain signal and background contributions during a fit to data. A typical approach is to categorize events according to both their jet and b-tag multiplicities. The performance of this procedure is limited by the b-tagging efficiency and diminishes for events with high b-tag multiplicity such as in ttH (H→bb).
Machine learning algorithms provide an alternative method of event categorization. Deep neural networks (DNNs) are a promising choice for this kind of multi-class classification application. In this talk, we present a categorization scheme using DNNs that is based on the underlying physics processes of events in the semi-leptonic ttH (H→bb) decay channel. Furthermore, we discuss different methods employed for improving the network’s categorization performance.
Speaker: Marcel Rieger (RWTH Aachen University (DE)) -
10:30 AM
Coffee break 30m
-
11:00 AM
Machine learning in jet physics 20m
High energy collider experiments produce several petabytes of data every year. Given the magnitude and complexity of the raw data, machine learning algorithms provide the best available platform to transform and analyse these data and to obtain valuable insights into Standard Model and Beyond Standard Model theories. These collider experiments produce both quark- and gluon-initiated hadronic jets as core components. Deep learning techniques enable us to classify quark/gluon jets through image recognition and help us to differentiate signals and backgrounds in Beyond Standard Model searches at the LHC. We are currently working on quark/gluon jet classification and progressing in our studies to find the bias between event generators using domain adversarial neural networks (DANN). We also plan to investigate top tagging and weak supervision on mixed samples in high energy physics, utilizing transfer learning from simulated data to real experimental data.
Speaker: Sreedevi Narayana Varma (King's College London) -
11:25 AM
Recursive Neural Networks in Quark/Gluon Tagging 20m
Vidyo contribution
Based on the natural tree-like structure of sequential jet clustering, recursive neural networks (RecNNs) embed the jet clustering history recursively, as in natural language processing. We explore the performance of RecNNs in quark/gluon discrimination. The results show that RecNNs outperform the baseline BDT by a few percent in gluon rejection at the working point of 50% quark acceptance. We also examined several aspects that might influence the performance of the networks. We find that even using only particle-flow identification as the input feature, without any extra information on momentum or angular position, already gives a fairly good result, which indicates that most of the information for q/g discrimination is already contained in the tree structure itself.
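A figure of merit such as "gluon rejection at 50% quark acceptance" can be computed from classifier scores as in the following minimal sketch. It assumes that rejection is defined as the inverse of the gluon efficiency (conventions differ between analyses); the function name `rejection_at_acceptance` and the toy scores are hypothetical.

```python
import numpy as np

def rejection_at_acceptance(sig_scores, bkg_scores, acceptance=0.5):
    """Background rejection (1 / background efficiency) at a fixed signal acceptance."""
    sig_scores = np.asarray(sig_scores, dtype=float)
    bkg_scores = np.asarray(bkg_scores, dtype=float)
    # Score threshold above which the requested fraction of signal is kept.
    threshold = np.quantile(sig_scores, 1.0 - acceptance)
    bkg_eff = np.mean(bkg_scores > threshold)
    return np.inf if bkg_eff == 0 else 1.0 / bkg_eff

# Toy classifier outputs (higher means more quark-like).
quark = np.array([0.9, 0.8, 0.7, 0.6, 0.4, 0.3])
gluon = np.array([0.75, 0.5, 0.35, 0.2, 0.1, 0.05, 0.02, 0.01])
rej = rejection_at_acceptance(quark, gluon)
```

With these toy numbers the threshold falls at the median quark score, one of the eight gluon jets passes, and the rejection is 8.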
Speaker: Taoli Cheng (University of Chinese Academy of Sciences)
-
9:00 AM
-
12:30 PM
→
2:00 PM
Lunch break 1h 30m
-
2:00 PM
→
6:30 PM
Conveners: Rudiger Haake (CERN), Steven Randolph Schramm (Universite de Geneve (CH))
-
3:05 PM
Direct Learning of Systematics-Aware Summary Statistics 20m
Complex machine learning tools, such as deep neural networks and gradient boosting algorithms, are increasingly being used to construct powerful discriminative features for High Energy Physics analyses. These methods are typically trained with simulated or auxiliary data samples by optimising some classification or regression surrogate objective. The learned feature representations are then used to build a sample-based statistical model to perform inference (e.g. interval estimation or hypothesis testing) over a set of parameters of interest. However, the effectiveness of the mentioned approach can be reduced by the presence of known uncertainties that cause differences between training and experimental data, included in the statistical model via nuisance parameters. This work presents an end-to-end algorithm, which leverages existing deep learning technologies but directly aims to produce inference-optimal sample-summary statistics. By including the statistical model and a differentiable approximation of the effect of nuisance parameters in the computational graph, loss functions derived from the observed Fisher information are directly optimised by stochastic gradient descent. This new technique leads to summary statistics that are aware of the known uncertainties and maximise the information that can be inferred about the parameters of interest that are the object of an experimental measurement.
Speaker: Pablo De Castro Manzano (Universita e INFN, Padova (IT)) -
3:30 PM
Multivariate Analysis Techniques for charm reconstruction with ALICE 20m
ALICE is the experiment at the LHC dedicated to heavy-ion collisions. One of the key tools to investigate the strongly-interacting medium (Quark-Gluon Plasma, QGP) formed in heavy-ion collisions is the measurement of open-charm particle production. In particular, charmed baryons, such as ΛC, provide essential information for the understanding of charm thermalisation and hadronisation in the QGP. Data from proton-proton and proton-Pb collisions are needed as a reference for interpreting the results in Pb-Pb collisions, as well as to study charm hadronisation into baryons "in-vacuum". The relatively short lifetime of the ΛC baryon, cτ~60μm, makes the reconstruction of its decay a challenging task that profits from the excellent performance of ALICE in terms of secondary vertex reconstruction and particle identification. The application of multivariate analysis (MVA) techniques through Boosted Decision Trees can facilitate the separation of the ΛC signal from the background, and as such be a complementary approach to the more standard technique based on topological and kinematical cuts. In this contribution, the analysis and results of the ΛC-baryon production with MVA in pp collisions at √s = 7 TeV and in p-Pb collisions at √sNN = 5.02 TeV will be shown.
Speaker: Chiara Zampolli (CERN) -
4:00 PM
Coffee break 30m
-
4:30 PM
Classification of decays involving variable decay chains with convolutional architectures 20m
Vidyo contribution
We present a technique to perform classification of decays that exhibit decay chains involving a variable number of particles, which include a broad class of $B$ meson decays sensitive to new physics. The utility of such decays as a probe of the Standard Model is dependent upon accurate determination of the decay rate, which is challenged by the combinatorial background arising in high-multiplicity decay modes. In our model, each particle in the decay event is represented as a fixed-dimensional vector of feature attributes, forming an $n \times k$ representation of the event, where $n$ is the number of particles in the event and $k$ is the dimensionality of the feature vector. A convolutional architecture is used to capture dependencies between the embedded particle representations and perform the final classification. The proposed model outperforms standard machine learning approaches, based on Monte Carlo studies across a range of variable final-state decays with the Belle II detector.
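The $n \times k$ event representation can be illustrated with a minimal NumPy sketch. It is not the authors' model: zero-padding to a fixed maximum multiplicity, the kernel width, the number of filters, the random weights and the helper names `pad_event` and `conv1d_over_particles` are all assumptions made for illustration. A 1D convolution slides along the particle axis, and a global max-pool turns the variable-length event into a fixed-size feature vector.

```python
import numpy as np

def pad_event(particles, max_particles):
    """Zero-pad (or truncate) an (n, k) particle array to (max_particles, k)."""
    particles = np.asarray(particles, dtype=float)
    n, k = particles.shape
    padded = np.zeros((max_particles, k))
    padded[:n] = particles[:max_particles]
    return padded

def conv1d_over_particles(event, kernels):
    """Valid 1D convolution along the particle axis.

    event:   (n, k) array of particle feature vectors
    kernels: (width, k, filters) array of filter weights
    returns: (n - width + 1, filters) array
    """
    width = kernels.shape[0]
    windows = np.stack([event[i:i + width]
                        for i in range(event.shape[0] - width + 1)])
    # Contract the window and feature axes against every filter.
    return np.einsum("nwk,wkf->nf", windows, kernels)

rng = np.random.default_rng(1)
k = 4                                                   # features per particle (hypothetical)
events = [rng.normal(size=(n, k)) for n in (3, 5, 7)]   # variable multiplicity
batch = np.stack([pad_event(e, max_particles=8) for e in events])
kernels = rng.normal(size=(2, k, 6))                    # width 2, 6 filters
# Global max-pool over the particle axis gives one fixed-size vector per event.
features = np.stack([conv1d_over_particles(e, kernels).max(axis=0) for e in batch])
```

In a trained network the pooled features would feed a classification layer, and the padded rows would typically be masked rather than simply zeroed.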
Speaker: Justin Tan (The University of Melbourne, Belle II) -
4:55 PM
Drones: Making faster and smarter decisions with software triggers 20m
Data collection rates in high energy physics (HEP), particularly those at the Large Hadron Collider (LHC), are a continuing challenge and require large amounts of computing power to handle. For example, at LHCb an event rate of 1 MHz is processed in a software-based trigger. The purpose of this trigger is to reduce the output data rate to manageable levels, which amounts to a reduction from 60 GB per second to an output data rate of 0.6 GB per second. Machine learning (ML) is becoming an ever more important tool in data reduction, be it in the identification of interesting event topologies or in the distinction between individual particle species. In LHCb data-taking, over 600 unique signatures are searched for in parallel in real time, each with its own set of requirements. However, only a handful at present make use of machine learning, despite the large ecosystem. Often the reason for this is the relative difficulty of applying a preferred ML classifier within the C++/Python combination of event selection frameworks. One way to overcome this is to use an approximate network, known as a drone, that can learn the features of your preferred form and can be executed in an easily parallelisable way. We present the uses and advantages of such an approach.
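The rates quoted in this abstract imply a data reduction by a factor of 100 and a mean input event size of about 60 kB. The short worked example below makes that arithmetic explicit; it assumes decimal units (1 GB = 10^9 bytes), and the variable names are illustrative.

```python
GB = 1e9  # bytes per gigabyte, decimal convention (assumption)

input_rate_bytes = 60 * GB     # data rate entering the software trigger, per second
output_rate_bytes = 0.6 * GB   # data rate written out, per second
event_rate_hz = 1e6            # 1 MHz of events processed by the trigger

# Overall reduction achieved by the trigger.
reduction_factor = input_rate_bytes / output_rate_bytes
# Average size of one event at the trigger input, in kilobytes.
mean_event_size_kb = input_rate_bytes / event_rate_hz / 1e3
```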
Speaker: Sean Benson (Nikhef National institute for subatomic physics (NL)) -
5:20 PM
Close-out and challenge results 20m
Speakers: Lorenzo Moneta (CERN), Markus Stoye (CERN), Paul Seyfert (CERN), Rudiger Haake (CERN), Steven Randolph Schramm (Universite de Geneve (CH))
-
3:05 PM
-
9:00 AM
→
12:30 PM
-
-
9:00 AM
→
6:00 PM
Hackathon 3179
CERN
Idea Square
Convener: Paul Seyfert (CERN)
-
9:00 AM
→
6:00 PM