IML Machine Learning Working Group - summary of summer conferences

Europe/Zurich
4/3-006 - TH Conference Room (CERN)


S. Gleyzer - ACAT summary
(incomplete summary here, please also see slides and conference homepage)

  • ACAT started off as a machine learning conference series, later became a more general data analysis conference series, and has now returned to a focus on ML
  • many applications of ML:
    • end-to-end learning (go directly from detector response to event class, w/o reconstruction)
    • GANs for simulation (faster than running full GEANT simulations; a minimal training sketch follows this list)
    • track reconstruction (CNN in trigger, RNN for tracking)
    • pile-up removal: a CNN to correct for pile-up in jet images
    • flavour tagging
    • trigger applications (lookup table for pT regression BDT, cf. last IML meeting, CatBoost)
    • particle identification with XGBoost
  • Analysis tools
    • Histogrammar (DIANA): parallelize the event loop with Spark
    • ROOT's TDataFrame: use a declarative language rather than a low-level event loop and let the framework optimize evaluations, caching, … (see the sketch after this list)
    • ROOT improvements in I/O (compression, parallelization), new TMVA features, web technology for graphics
  • Questions
    • Was there anything on reinforcement learning?
      • A: possibly in one of the posters
  • Comment (Sergei): In the summary I left out the GAN for simulation talk by S. Vallecorsa.
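
Related to the GAN-for-simulation items above: a minimal, generic GAN training step, as a sketch only. It is written in Keras/TensorFlow with placeholder layer sizes and a flat 64-pixel "shower image"; none of these choices are taken from the ACAT talks.

```python
import tensorflow as tf
from tensorflow.keras import layers, Sequential

latent_dim, image_dim = 16, 64  # placeholder sizes, not from any talk

# Generator maps random noise to a (flattened) fake shower image
generator = Sequential([
    layers.Dense(128, activation="relu", input_shape=(latent_dim,)),
    layers.Dense(image_dim, activation="relu"),
])
# Discriminator scores how "real" an image looks
discriminator = Sequential([
    layers.Dense(128, activation="relu", input_shape=(image_dim,)),
    layers.Dense(1, activation="sigmoid"),
])

bce = tf.keras.losses.BinaryCrossentropy()
g_opt = tf.keras.optimizers.Adam(1e-4)
d_opt = tf.keras.optimizers.Adam(1e-4)

def train_step(real_images):
    """One adversarial update: the discriminator learns to separate real
    from generated images, the generator learns to fool the discriminator."""
    noise = tf.random.normal([tf.shape(real_images)[0], latent_dim])
    with tf.GradientTape() as g_tape, tf.GradientTape() as d_tape:
        fake = generator(noise)
        real_score = discriminator(real_images)
        fake_score = discriminator(fake)
        d_loss = (bce(tf.ones_like(real_score), real_score) +
                  bce(tf.zeros_like(fake_score), fake_score))
        g_loss = bce(tf.ones_like(fake_score), fake_score)
    d_opt.apply_gradients(zip(d_tape.gradient(d_loss, discriminator.trainable_variables),
                              discriminator.trainable_variables))
    g_opt.apply_gradients(zip(g_tape.gradient(g_loss, generator.trainable_variables),
                              generator.trainable_variables))
```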
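
The declarative style mentioned for ROOT's TDataFrame can be sketched as below, using the PyROOT interface (in current ROOT releases the class is exposed as RDataFrame). The file, tree and branch names are invented for illustration.

```python
import ROOT

# Ask ROOT to parallelize the implicit event loop
ROOT.ROOT.EnableImplicitMT()

# File, tree and branch names are placeholders for illustration
df = ROOT.RDataFrame("Events", "events.root")

hist = (df.Filter("nMuon >= 2", "at least two muons")
          .Define("pt_lead", "Muon_pt[0]")
          .Histo1D(("pt_lead", "Leading muon p_{T};p_{T} [GeV];events",
                    100, 0.0, 200.0),
                   "pt_lead"))

# Nothing has been read so far: the event loop runs lazily, once for all
# booked results, when a result is actually requested
hist.GetValue().Draw()
```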


T. Golling - Hammers and nails
(incomplete summary here, please also see slides and conference homepage)

  • Title "Hammers and nails" refers to Machine Learning as a big hammer technology to tackle various problems, nails being the particle physics problems
  • minutes kept on google docs https://docs.google.com/document/d/1y7cE8qVp6xyKdlWzaBqD81_Otvf5YPhHcjlybtURKbU/edit#
  • with break-out sessions on white/blackboards
  • ideas for new concepts in HEP:
    • use GANs and VAEs to perform inference without building a likelihood in between
  • Adversarial examples
    • image recognition algorithms can "easily" be tricked into misclassifying an image (without changing it visibly to a human viewer)
    • this can be used to trick voice recognition into transferring money
    • the examples are "crafted" (one common crafting recipe is sketched after this list); it is arguable whether a detector would produce them in HEP
  • discussion on autonomous driving
    • should AI be taught the rules of the road, or learn them purely by observation?
    • (what about interaction with human drivers and when it's okay to break the rules)
  • probabilistic programming
    • the goal is to learn a generator that has itself learned the underlying rules
    • examples of generators which can generate "normal looking trees" or "stable structures"
  • Questions:
    • It seems there was very little on classical signal/background classification (Enrico Guiraud)
      • That was a feature of the topic selection for the summary.
    • The classical examples of how to generate adversarial inputs require internal knowledge of the NN that is "attacked"; shouldn't that hinder fraud?
      • That was discussed; it appears adversarial examples are highly portable, in the sense that an adversarial example for one network will also be adversarial for different networks trained on the same task
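
To illustrate how adversarial examples are "crafted": below is a sketch of the fast gradient sign method, one common recipe from the image-recognition literature (not discussed in this level of detail at the workshop). It assumes `model` is a Keras classifier returning softmax probabilities for an image normalised to [0, 1].

```python
import tensorflow as tf

def fgsm_perturb(model, image, label, epsilon=0.01):
    """Fast gradient sign method: add a small, crafted perturbation that
    pushes the classifier's loss up while looking unchanged to a human."""
    image = tf.convert_to_tensor(image[None, ...], dtype=tf.float32)
    label = tf.convert_to_tensor([label])
    with tf.GradientTape() as tape:
        tape.watch(image)
        prediction = model(image)
        loss = tf.keras.losses.sparse_categorical_crossentropy(label, prediction)
    gradient = tape.gradient(loss, image)
    # Step in the direction that increases the loss, keep pixels in range
    adversarial = image + epsilon * tf.sign(gradient)
    return tf.clip_by_value(adversarial, 0.0, 1.0)[0]
```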


G. Kasieczka - BOOST summary
(incomplete summary here, please also see slides and conference homepage) 

  • BOOST is a workshop about advances in jet physics, e.g. heavy object tagging
  • See also the machine learning slides in the boost exp. summary talk: https://indico.cern.ch/event/579660/contributions/2496143/attachments/1496921/2330050/Boost17_ExpSum.pdf
  • Many talks about classification, but also one each on GANs and pileup
  • Classification contributions (see slides for details and references)
    • "color" jet image for quark/gluon jet classifications 
      • "Colors":
        • charged pt
        • neutral pt
        • charged particle multiplicity
      • Deep NN and colors improve significantly at high pt (~ 1 TeV)
      • The method is stable if trained on Herwig or Pythia.
    • q/g tagging in ATLAS
      • Similar technique as in the previous contribution, applied to ATLAS.
      • But some dependence on the MC used for training is seen here
    • b-tagging
      • new idea: look at changes in hit multiplicity between subsequent layers of the tracking detector
    • Machine Learning in CMS
      • Comprehensive talk, but one highlight selected for this summary: 
      • the DeepJet architecture
        • multi-class classification using recurrent networks chained to a fully connected network (a minimal sketch follows this list)
    • Top-taggers
      • Use the constituents' 4-vectors, but keep some of the spirit of the image approach
      • Similar performance to the image approach on calorimeter inputs, but better than images with particle flow: no need to pixelate
    • Boosted top and WW with ATLAS
  • GAN for jet simulation
  • Pile-up mitigation with Machine learning
    • A "color" image is used to estimate jet variables with the pile-up contribution removed; input channels:
      • pT of all neutral particles
      • pT of charged particles from the primary vertex
      • pT of charged particles from secondary vertices
    • The output is a single-channel image: the pT of neutral particles from the primary vertex
  • Questions
    • Was there news on boosted Higgs? There were discussions, but no major update with ML
    • Universal multi-class classification? In progress, mentioned in some contributions
    • What should an ML expert with no experience in physics do? Re-using a publicly available data set is a good way to start
    • Why the trainable linear combinations in the deep-learning top tagger? Different weighted sums can give access to e.g. the mass of the top or reconstruct the W
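
A sketch of how a three-channel "color" jet image, as in the quark/gluon contribution above, could be built. The pixelisation (33x33 pixels over ±0.8 around the jet axis) and the per-constituent tuple format are assumptions for illustration, not the exact choices of the talk.

```python
import numpy as np

def jet_image(constituents, npix=33, extent=0.8):
    """Build a 3-channel "color" jet image: charged pT, neutral pT and
    charged-particle multiplicity, binned in (eta, phi) around the jet axis.

    constituents: iterable of (pt, delta_eta, delta_phi, charge), with the
    angular coordinates already centred on the jet axis.
    """
    image = np.zeros((3, npix, npix))
    edges = np.linspace(-extent, extent, npix + 1)
    for pt, deta, dphi, charge in constituents:
        i = np.searchsorted(edges, deta) - 1
        j = np.searchsorted(edges, dphi) - 1
        if not (0 <= i < npix and 0 <= j < npix):
            continue  # constituent falls outside the image window
        if charge != 0:
            image[0, i, j] += pt   # channel 0: charged pT
            image[2, i, j] += 1.0  # channel 2: charged multiplicity
        else:
            image[1, i, j] += pt   # channel 1: neutral pT
    return image
```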
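
A minimal Keras sketch of the "recurrent network chained to a fully connected one" idea mentioned for DeepJet. The input shapes, jet-level variables and the four flavour classes are invented for illustration; this is not the actual DeepJet architecture.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

# Hypothetical inputs: up to 25 constituents with 8 features each, plus a
# few jet-level variables; 4 output classes (e.g. b, c, light, gluon)
tracks_in = layers.Input(shape=(25, 8), name="constituents")
jet_in = layers.Input(shape=(5,), name="jet_vars")

x = layers.GRU(64)(tracks_in)        # recurrent summary of the constituent sequence
x = layers.concatenate([x, jet_in])  # chain into a fully connected classifier
x = layers.Dense(64, activation="relu")(x)
x = layers.Dense(64, activation="relu")(x)
out = layers.Dense(4, activation="softmax", name="flavour")(x)

model = Model(inputs=[tracks_in, jet_in], outputs=out)
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
```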


M. Paganini - CVPR Summary
(incomplete summary here, please also see slides and conference homepage)

  • CVPR is mostly a computer vision conference, organized by the IEEE, with over 4000 participants
  • Selected contributions potentially relevant for HEP presented in the summary
  • Some interesting trends are discussed on slide 5
  • DenseNets
    • ResNets: skip connections allow any one layer in the network to be skipped
    • Densenets: 
      • allow any layer to be connected to any other (later) layer, not just skipping one
      • the number of feature maps grows at each step
      • bottleneck layers are introduced to reduce the feature size (a minimal block sketch follows this list)
      • Advantages (See slide 14 for complete list)
        • strong gradients
        • All learned features are used for prediction; simple features from the first layer survive to the last layer
  • Slide 18: Using knowledge graphs for classification
    • the idea is that the ML method identifies elements in the image, but the ultimate object we want to identify is not any one of these particular elements
    • if we have a graph that maps the relation between the elements and the final class, we may be able to better identify the object we want to tag
    • Example: we want to identify some heavy resonance, but our tagger will identify the daughters of this object
    • Use the knowledge graph to infer the class from the tagged elements
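
A minimal sketch of a DenseNet-style block with bottlenecks, following the description above; the growth rate, number of layers and the use of Keras are illustrative choices, not taken from the talk.

```python
import tensorflow as tf
from tensorflow.keras import layers

def dense_block(x, num_layers=4, growth_rate=12):
    """DenseNet-style block: every layer receives the concatenation of all
    previous feature maps, so the channel count grows by `growth_rate`
    at each step."""
    for _ in range(num_layers):
        y = layers.BatchNormalization()(x)
        y = layers.Activation("relu")(y)
        # 1x1 "bottleneck" convolution limits the number of input features
        y = layers.Conv2D(4 * growth_rate, 1, padding="same")(y)
        y = layers.BatchNormalization()(y)
        y = layers.Activation("relu")(y)
        y = layers.Conv2D(growth_rate, 3, padding="same")(y)
        x = layers.Concatenate()([x, y])  # this layer feeds all later layers
    return x

# Usage: apply to an image-like tensor, e.g.
# inputs = layers.Input(shape=(64, 64, 16))
# features = dense_block(inputs)
```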

 
