(CERN), Michele Floris
(CERN), Paul Seyfert
(Universita & INFN, Milano-Bicocca (IT)), Sergei Gleyzer
(University of Florida (US)), Steven Randolph Schramm
(Universite de Geneve (CH))
The Inter-experimental Machine Learning (IML) Working Group Workshop on Machine Learning will be held on March 20-22, 2017. The IML workshop will have dedicated sessions on machine-learning software and tools, tutorials, a hands-on challenge, an industry-HEP session, and a dedicated mini-workshop on physics object identification (tagging).
(University of Florida (US)), Steven Randolph Schramm
(Universite de Geneve (CH))
Introduction to the workshop
(University of Florida (US)), Lorenzo Moneta
(CERN), Michele Floris
(CERN), Paul Seyfert
(Universita & INFN, Milano-Bicocca (IT)), Steven Randolph Schramm
(Universite de Geneve (CH))
Jet reconstruction and heavy-flavour jet tagging at LHCb will be discussed, with a focus on the latest published measurements, such as the measurement of forward $t \bar t$, $W + b \bar b$ and $W + c \bar c$ production in pp collisions at √s = 8 TeV and the search for the SM Higgs boson decaying to $b \bar b$ or $c \bar c$ in association with a W or Z boson.
(Universita e INFN, Padova (IT))
Identification of Jets Containing b-Hadrons with Recurrent Neural Networks at the ATLAS Experiment
A novel b-jet identification algorithm is constructed with a Recurrent Neural Network (RNN) at the ATLAS Experiment. This talk presents the expected performance of the RNN-based b-tagging in simulated $t \bar t$ events. The RNN-based b-tagging processes the properties of tracks associated with jets, represented as sequences. In contrast to traditional impact-parameter-based b-tagging algorithms, which assume that the tracks in a jet are independent of each other, RNN-based b-tagging can exploit the spatial and kinematic correlations between tracks that originate from the same b-hadron. The neural-network nature of the tagging algorithm also allows the flexibility of extending the input features to include more track properties than can be used effectively in traditional algorithms.
Daniel Hay Guest
(University of California Irvine (US))
Exploring neural networks to improve b-jet tagging with the ALICE detector
Highly energetic jets are sensitive probes for the kinematics and the topology of nuclear collisions. Jets are collimated sprays of charged and neutral particles, which are produced in the fragmentation of hard-scattered partons at an early stage of the collision. Heavy-quark jets, originating from beauty or charm quarks (b- and c-jets), are particularly good probes to shed light on the characteristics of the hot medium that is formed in heavy-ion collisions and to understand the parton energy loss in the medium. Several algorithms exist to tag b-jets. One approach is to identify b-jets by reconstructing displaced secondary vertices and applying rectangular cuts on their topology. Machine learning is a promising tool to perform better in such a classification task on similar input features. In particular, deep learning methods might be able to capture features from low-level parameters that are not exploited by the classical cut-based method. In this talk, first simulation results of a neural-network-based method to tag b-jets in p-Pb collisions at 5.02 TeV with the ALICE detector will be presented.
Flavour tagging of jets is an important task in collider-based high energy physics and a field where machine learning tools are applied by all major experiments. A new tagger (DeepFlavour), based on an advanced machine learning procedure, was developed and commissioned in CMS. A deep neural network is used for multi-class classification of jets that originate from a b-quark, two b-quarks, a c-quark, two c-quarks or light coloured particles (u, d or s quark, or gluon). The performance was measured in both data and simulation. The talk will also include the measured performance of all taggers in CMS. The different taggers and results will be discussed and compared, with some focus on details of the newest tagger.
Flavor Tagging with Deep Neural Networks at Belle II
The Belle II experiment is mainly designed to investigate the decay of B meson pairs from $\Upsilon(4S)$ decays, produced by the asymmetric electron-positron collider SuperKEKB. The determination of the B meson flavor, so-called flavor tagging, plays an important role in analyses and can be inferred in many cases directly from the final state particles. In this talk a successful approach of B meson flavor tagging utilizing a Deep Neural Network is presented. Monte Carlo studies show a significant improvement with respect to the established category-based flavor tagging algorithm.
One of the most important procedures needed for the study of CP violation in the beauty sector is the tagging of the flavour of neutral B mesons at production. The harsh environment of the Large Hadron Collider makes it particularly hard to succeed in this task. We present a proposal to upgrade the current flavour tagging strategy in the LHCb experiment. This strategy consists of inclusive tagging ensemble methods (i.e., the use of inclusive information about the event without a firm selection rule), which are combined using a probabilistic model for each event. The probabilistic model uses all reconstructed tracks and secondary vertices to obtain a well-determined probability of the B flavour at production. Such an approach reduces the dependence on the performance of lower-level identification capacities and thus has the potential to increase the overall performance.
(Yandex School of Data Analysis (RU))
Quark/gluon jet discrimination: a reproducible analysis using R
The power to discriminate between light-quark jets and gluon jets would have a huge impact on many searches for new physics at CERN and beyond. This talk will present a walk-through of the development of a prototype machine learning classifier for differentiating between quark and gluon jets at experiments like those at the Large Hadron Collider at CERN. A new fast feature selection method that combines information theory and graph analytics will be outlined. This method has found new variables that promise significant improvements in discrimination power. The prototype jet tagger is simple, interpretable, parsimonious, and computationally extremely cheap, and therefore might be suitable for use in trigger systems for real-time data processing. Nested stratified k-fold cross validation was used to generate robust estimates of model performance. The data analysis was performed entirely in the R statistical programming language, and is fully reproducible. The entire analysis workflow is data-driven, automated and runs on very modest hardware with no human intervention. New data visualisation techniques will also be introduced.
(Hungarian Academy of Sciences (HU))
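The quark/gluon abstract above relies on nested stratified k-fold cross-validation. The talk's analysis was done in R and its code is not reproduced here; the following is only a minimal Python sketch of the general technique. The function names (`stratified_folds`, `nested_cv`), the toy threshold classifier and the toy data are all hypothetical and chosen for illustration.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=0):
    """Split sample indices into k folds that preserve class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold gets a similar share.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def nested_cv(features, labels, candidates, fit, score, outer_k=5, inner_k=3):
    """Return one held-out score per outer fold.

    The inner loop selects a hyperparameter using training data only;
    the outer loop scores that choice on data the selection never saw.
    """
    outer_scores = []
    for test_idx in stratified_folds(labels, outer_k, seed=1):
        test_set = set(test_idx)
        train_idx = [i for i in range(len(labels)) if i not in test_set]
        train_labels = [labels[i] for i in train_idx]
        # Inner folds hold positions into train_idx, not global indices.
        inner = stratified_folds(train_labels, inner_k, seed=2)

        def inner_score(param):
            total = 0.0
            for val_pos in inner:
                val_set = set(val_pos)
                fit_idx = [train_idx[p] for p in range(len(train_idx))
                           if p not in val_set]
                val_idx = [train_idx[p] for p in val_pos]
                model = fit([features[i] for i in fit_idx],
                            [labels[i] for i in fit_idx], param)
                total += score(model, [features[i] for i in val_idx],
                               [labels[i] for i in val_idx])
            return total / inner_k

        best = max(candidates, key=inner_score)
        model = fit([features[i] for i in train_idx], train_labels, best)
        outer_scores.append(score(model, [features[i] for i in test_idx],
                                  [labels[i] for i in test_idx]))
    return outer_scores


# Toy stand-in for a tagger: the "model" is just a cut value on one feature.
def fit_threshold(xs, ys, threshold):
    return threshold


def accuracy(threshold, xs, ys):
    hits = sum((x >= threshold) == bool(y) for x, y in zip(xs, ys))
    return hits / len(xs)


toy_x = [i / 40 for i in range(40)]
toy_y = [1 if x >= 0.5 else 0 for x in toy_x]
outer_scores = nested_cv(toy_x, toy_y, [0.2, 0.5, 0.8], fit_threshold, accuracy)
```

Keeping hyperparameter selection inside the inner loop is what makes the outer scores an estimate of performance on unseen data rather than a number tuned on the test folds.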
Machine learning based on convolutional neural networks can be used to study jet images from the LHC. Top tagging in fat jets offers a well-defined framework to establish our DeepTop approach and compare its performance to QCD-based top taggers. We first optimize a network architecture to identify top quarks in Monte Carlo simulations of the Standard Model production channel. Using standard fat jets we then compare its performance to a multivariate QCD-based top tagger. We find that both approaches lead to comparable performance, establishing convolutional networks as a promising new approach for multivariate hypothesis-based top tagging.
(Eidgenoessische Technische Hochschule Zuerich (CH))
Recent literature on deep neural networks for top tagging has focussed on image-based techniques or multivariate approaches using high-level jet substructure variables. Here, we take a sequential approach to this task by using an ordered sequence of energy deposits as training inputs. Unlike previous approaches, this strategy does not result in a loss of information during pixelization or the calculation of high-level features. We also propose new preprocessing methods that do not alter key physical quantities such as jet mass. We compare the performance of this approach to standard tagging techniques and present results evaluating the robustness of the neural network to pileup.
(University of British Columbia (CA))
Decorrelated Jet Substructure Tagging using Adversarial Neural Networks
We describe a strategy for constructing a neural network jet substructure tagger which powerfully discriminates boosted decay signals while remaining largely uncorrelated with the jet mass. This reduces the impact of systematic uncertainties in background modeling while enhancing signal purity, resulting in improved discovery significance relative to existing taggers. The network is trained using an adversarial strategy, resulting in a tagger that learns to balance classification accuracy with decorrelation. As a benchmark scenario, we consider the case where large-radius jets originating from a boosted Z' decay are discriminated from a background of nonresonant quark and gluon jets. We show that in the presence of systematic uncertainties on the background rate, our adversarially-trained, decorrelated tagger considerably outperforms a conventionally trained neural network, despite having a slightly worse signal-background separation power. We generalize the adversarial training technique to include a parametric dependence on the signal hypothesis, training a single network that provides optimized, interpolatable decorrelated jet tagging across a continuous range of hypothetical resonance masses, after training on discrete choices of the signal mass.
Chase Owen Shimmin
(Yale University (US))
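The balance between classification accuracy and decorrelation described in the adversarial-tagging abstract above can be written as a pair of competing objectives. The form below is a commonly used formulation of adversarial training, given here as an assumption and not necessarily the exact loss used in the talk. The symbols $\theta_{\text{tag}}$, $\theta_{\text{adv}}$ and $\lambda$ are introduced for illustration only.

```latex
\begin{align}
  \theta_{\text{adv}}^{*} &= \arg\min_{\theta_{\text{adv}}}
      L_{\text{adv}}(\theta_{\text{tag}}, \theta_{\text{adv}}), \\
  \theta_{\text{tag}}^{*} &= \arg\min_{\theta_{\text{tag}}}
      \left[ L_{\text{class}}(\theta_{\text{tag}})
             - \lambda \, L_{\text{adv}}(\theta_{\text{tag}}, \theta_{\text{adv}}) \right]
\end{align}
```

Here $L_{\text{class}}$ is the tagger's signal-versus-background classification loss, $L_{\text{adv}}$ is the loss of an adversary that tries to infer the jet mass from the tagger output, and $\lambda \ge 0$ sets the trade-off. Setting $\lambda = 0$ recovers conventional training, while larger values penalize the tagger more strongly whenever the adversary can recover the mass.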
Deep Convolutional Networks for Event Reconstruction and Particle Tagging on NOvA and DUNE
Deep Convolutional Neural Networks (CNNs) have been widely applied in computer vision to solve complex problems in image recognition and analysis. In recent years many efforts have emerged to extend the use of this technology to HEP applications, including the Convolutional Visual Network (CVN), our implementation for the identification of neutrino events. In this presentation I will describe the core concepts of CNNs, the details of our particular implementation in the Caffe framework and our application to identify NOvA events. NOvA is a long-baseline neutrino experiment whose main goal is the measurement of neutrino oscillations. This relies on the accurate identification and reconstruction of the neutrino flavor in the interactions we observe. In 2016 the NOvA experiment released results for the observation of oscillations in the $\nu_\mu \to \nu_e$ channel, the first HEP result employing CNNs. I will also discuss our approach to event identification on NOvA as well as recent developments in the application of CNNs for particle tagging at NOvA, event identification at DUNE and other ongoing work.
Application of Generative Adversarial Networks (GANs) to jet images
We provide a bridge between generative modeling in the Machine Learning community and simulated physical processes in High Energy Particle Physics by applying a novel Generative Adversarial Network (GAN) architecture to the production of jet images: 2D representations of energy depositions from particles interacting with a calorimeter. We propose a simple architecture, the Location-Aware Generative Adversarial Network, that learns to produce realistic radiation patterns from simulated high energy particle collisions. The pixel intensities of GAN-generated images faithfully span many orders of magnitude and exhibit the desired low-dimensional physical properties (e.g., jet mass and n-subjettiness). We shed light on limitations, and provide a novel empirical validation of image quality and validity of GAN-produced simulations of the natural world. This work provides a base for further explorations of GANs for use in faster simulation in High Energy Particle Physics.
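To make the notion of a jet image concrete, the sketch below bins a list of energy deposits into a small 2D grid. It is a minimal illustration only: the function name `jet_image`, the grid size, the image half-width and the toy deposits are assumptions, and real analyses typically add preprocessing steps (centering, rotation, normalization) that are not shown.

```python
def jet_image(deposits, n_pixels=5, half_width=1.0):
    """Bin (d_eta, d_phi, energy) deposits into an n_pixels x n_pixels grid.

    d_eta and d_phi are offsets from the jet axis. Deposits outside the
    square [-half_width, half_width) in either coordinate are dropped.
    """
    image = [[0.0] * n_pixels for _ in range(n_pixels)]
    pixel_size = 2.0 * half_width / n_pixels
    for d_eta, d_phi, energy in deposits:
        inside = (-half_width <= d_eta < half_width
                  and -half_width <= d_phi < half_width)
        if not inside:
            continue
        # min() guards against floating-point rounding at the upper edge.
        row = min(int((d_phi + half_width) / pixel_size), n_pixels - 1)
        col = min(int((d_eta + half_width) / pixel_size), n_pixels - 1)
        image[row][col] += energy
    return image


# Hypothetical deposits: two close to the axis, one in a corner, one outside.
toy_deposits = [
    (0.0, 0.0, 10.0),
    (0.05, -0.05, 5.0),
    (-0.9, 0.9, 1.0),
    (2.0, 0.0, 3.0),
]
image = jet_image(toy_deposits)
total_energy = sum(sum(row) for row in image)
```

In this toy input the two deposits near the axis fall into the same central pixel and are summed, which shows how pixelization merges nearby deposits into a single intensity.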
Using Boosted Decision Trees to look for displaced Jets in the ATLAS Calorimeter
A boosted decision tree is used to identify unique jets in a recently released conference note describing a search for long-lived particles decaying to hadrons in the ATLAS calorimeter. Neutral long-lived particles decaying to hadrons are “typical” signatures in many models, including Hidden Valley models, Higgs Portal models, Baryogenesis, Stealth SUSY, etc. Long-lived neutral particles that decay in the calorimeter leave behind an object that looks like a regular Standard Model jet, with subtle differences. For example, the later in the calorimeter the particle decays, the less energy is deposited in the early layers of the calorimeter. Because the jet does not originate at the interaction point, it will likely be narrower as reconstructed by the standard anti-kT jet reconstruction algorithm used by ATLAS. To separate the jets due to neutral long-lived decays from Standard Model jets, we used a boosted decision tree with thirteen variables as inputs. We used the information from the boosted decision tree as input to a more traditional straight-cuts analysis to separate background and signal event topologies. In this talk we will describe how we chose the variables for the boosted decision tree, how we “cleaned the data”, how we tuned the boosted decision tree, and the results. As far as we are aware, this is the first time a multivariate technique has been used for object ID in a search for long-lived particles.
(University of Washington (US))
Reconstruction of charged particle tracks is a central task in the processing of physics data at the LHC and other colliders. Current state-of-the-art tracking algorithms are based on the Kalman filter and have seen great success both offline and at trigger level. However, these algorithms scale poorly with increasing detector occupancy, and it is foreseen that significant changes will be needed to achieve efficient track reconstruction in very high luminosity conditions. The HEP.TrkX pilot project aims to develop and explore machine-learning-based algorithms for particle tracking, with the goal of identifying candidate techniques for a more scalable tracking algorithm. In this talk we will discuss the techniques explored in the project so far, with emphasis on algorithms based on recurrent and convolutional neural networks. We will demonstrate the performance of these algorithms on toy detector data, and discuss plans to adapt them into complete algorithms for seed-finding and/or full track reconstruction in a realistic detector environment.
Dustin James Anderson
(California Institute of Technology (US))
Object identification with deep learning using Intel DAAL on Knights Landing processor [Vidyo]
The problem of object recognition is computationally expensive, especially when large amounts of data are involved. Recently, deep neural network (DNN) techniques, including convolutional neural networks and residual neural networks, have shown great recognition accuracy compared to traditional methods (artificial neural networks, decision trees, etc.). However, experience reveals that there are still a number of factors that limit scientists from deriving the full performance benefits of large DNNs. We summarize these challenges as follows: (1) the large number of hyperparameters that have to be tuned during the training phase, leading to several data re-computations over a large design space, (2) the sheer volume of data used for training, resulting in prolonged training time, and (3) how to effectively utilize the underlying hardware (compute, network and storage) to achieve maximum performance during this training phase. In this presentation, we discuss a cross-layer perspective on realizing efficient DNNs for the classification of physics objects (in particular, Higgs). We describe how we compose hardware, software and algorithmic components to derive optimized DNN models that are not only efficient but can also be rapidly re-purposed for other tasks, such as muon identification or the assignment of transverse momentum to these muons. This work is an extension of previous work to design a generalized hardware-software framework that simplifies the usage of deep learning techniques in big data problems.
David Nonso Ojika
(University of Florida (US))