IML Machine Learning Working Group - summary of summer conferences

4-3-006 - TH Conference Room (CERN)


S. Gleyzer - ACAT summary
(incomplete summary here, please also see slides and conference homepage)

  • started off as a machine-learning conference series, later broadened into a general data-analysis conference series; now returned to a focus on ML.
  • many applications of ML:
    • end-to-end learning (go directly from detector response to event class, w/o reconstruction)
    • GANs for simulation (faster than running full GEANT simulations)
    • track reconstruction (CNN in trigger, RNN for tracking)
    • pile up removal with CNN to correct pile up in jet image
    • flavour tagging
    • trigger applications (a pT-regression BDT deployed as a lookup table, cf. the last IML meeting; CatBoost)
    • particle identification with XGBoost
  • Analysis tools
    • Histogrammar (from the DIANA project): parallelize the event loop with Spark
    • ROOT's TDataFrame: use a declarative language rather than a low-level event loop, and let the framework optimize evaluation, caching, …
    • ROOT improvements in I/O (compression, parallelization), new TMVA features, web technology for graphics.
  • Questions
    • Was there anything on reinforcement learning?
      • A: possibly in one of the posters
  • Comment (Sergei): In the summary I left out the GAN for simulation talk by S. Vallecorsa.
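The declarative style mentioned for ROOT's TDataFrame can be sketched with a toy Python class (illustrative only, not the real ROOT API): cuts are declared up front and nothing runs until a result is requested, which is what lets the framework optimize or parallelize the single event loop.

```python
# Toy illustration of the declarative-analysis idea behind ROOT's
# TDataFrame: Filter() only *declares* a cut; the event loop runs once,
# when a result (here Count) is actually requested.
# Class and method names are illustrative, not the real ROOT interface.

class ToyDataFrame:
    def __init__(self, rows):
        self._rows = rows
        self._filters = []            # declared cuts, not yet executed

    def Filter(self, predicate):
        """Declare a cut; returns a new frame, runs nothing yet."""
        df = ToyDataFrame(self._rows)
        df._filters = self._filters + [predicate]
        return df

    def Count(self):
        """Trigger the (single) event loop and count passing rows."""
        return sum(1 for row in self._rows
                   if all(f(row) for f in self._filters))

events = [{"pt": 12.0}, {"pt": 35.5}, {"pt": 48.1}, {"pt": 7.3}]
df = ToyDataFrame(events)
n_high_pt = df.Filter(lambda e: e["pt"] > 30.0).Count()
print(n_high_pt)  # 2 events pass the pT cut
```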

T. Golling - Hammers and nails
(incomplete summary here, please also see slides and conference homepage)

  • The title "Hammers and nails" refers to machine learning as a big-hammer technology for tackling various problems, the nails being the particle-physics problems
  • minutes kept on google docs
  • with break out sessions on white/black boards
  • ideas for new concepts in HEP:
    • use GAN and VAE to infer w/o likelihood building in between
  • Adversarial examples
    • image-recognition algorithms can "easily" be tricked into misclassifying an image (without changing it to the human viewer)
    • e.g. usable to trick voice-recognition systems into authorizing money transfers
    • the examples are "crafted"; it is arguable whether a detector would ever produce them in HEP
  • discussion on autonomous driving
    • should an AI be taught the rules of the road, or learn them purely by observation?
    • (what about interaction with human drivers and when it's okay to break the rules)
  • probabilistic programming
    • the goal is to learn a generative program, i.e. a generator that has learned the rules by which it generates
    • examples of generators which can generate "normal looking trees" or "stable structures"
  • Questions:
    • It seems there was very little on classical signal/background classification (Enrico Guiraud)
      • That was a feature of the topic selection for the summary.
    • The classical examples of how to generate adversarial inputs require internal knowledge of the NN being "attacked"; shouldn't that hinder fraud?
      • That was discussed; adversarial examples appear to be highly portable, in the sense that an example that is adversarial for one network will also be adversarial for a different network trained for the same task
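The crafting of adversarial examples mentioned above can be sketched with the fast-gradient-sign idea on a tiny linear classifier: a small per-feature perturbation along the sign of the loss gradient flips the prediction. The weights and input below are made up for illustration, and real attacks target deep networks, not a linear model.

```python
import numpy as np

# Minimal fast-gradient-sign (FGSM) sketch on a linear classifier.
# Illustrative numbers only; real adversarial attacks target deep nets.

w = np.array([1.0, -2.0, 3.0])   # fixed "trained" weights
x = np.array([0.5, 1.0, 1.0])    # w @ x = 1.5, correctly classified as +1
y = 1.0                          # true label (+1 or -1)

def predict(x):
    return 1.0 if w @ x > 0 else -1.0

# Gradient of the logistic loss -log(sigmoid(y * w @ x)) w.r.t. x is
# proportional to -y * w, so its sign is simply -y * sign(w).
grad_sign = -y * np.sign(w)

eps = 0.3                        # small per-feature perturbation budget
x_adv = x + eps * grad_sign      # "crafted" input, close to x

print(predict(x))      # 1.0  (original input, correctly classified)
print(predict(x_adv))  # -1.0 (misclassified after the perturbation)
```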

G. Kasieczka – Boost summary
(incomplete summary here, please also see slides and conference homepage) 

  • Boost is a workshop about advances in jet physics, e.g. heavy object tagging
  • See also the machine learning slides in the BOOST experimental summary talk
  • Many talks about classification, but also one each on GANs and pileup
  • Classification contributions (see slides for details and references)
    • "color" jet images for quark/gluon jet classification
      • "Colors":
        • charged pT
        • neutral pT
        • charged-particle multiplicity
      • Deep NNs with the color channels improve significantly at high pT (~1 TeV)
      • The method is stable whether trained on Herwig or Pythia.
    • q/g tagging in ATLAS
      • Similar technique as in the previous contribution, applied to ATLAS.
      • But some dependence on the MC generator used for training is seen here
    • b-tagging
      • new idea: look at changes in hit multiplicity between subsequent layers of the tracking detector
    • Machine Learning in CMS
      • Comprehensive talk, but one highlight selected for this summary: 
      • the DeepJet architecture
        • multi-class classification using recurrent networks chained to a fully connected one
    • Top-taggers
      • Use the constituents' 4-vectors, while keeping some of the spirit of the image approach
      • Similar performance to the image approach on calorimeter inputs, but performs better than images with particle flow: no need to pixelate.
    • Boosted top and WW with ATLAS
  • GAN for jet simulation
  • Pile-up mitigation with Machine learning
    • Color image used to estimate jet variables with the pile-up contribution removed; input channels:
      • pT of all neutral particles
      • pT of charged particles from the primary vertex
      • pT of charged particles from secondary vertices
      • Output is a single-channel image: the pT of neutral particles from the primary vertex
  • Questions
      • Was there news on boosted Higgs? Discussions, but no major update involving ML
      • Universal multi-class classification? In progress; mentioned in some contributions
      • What should an ML expert with no physics experience do? Re-using a publicly available data set is a good way to start
      • Why trainable linear combinations in the deep-learning top tagger? Different weighted sums can give access to e.g. the top mass, or reconstruct the W
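The "color" jet images discussed above can be sketched in a few lines of NumPy: particles are pixelated in (eta, phi) into three channels for charged pT, neutral pT, and charged-particle multiplicity. The binning, image size, and particle list below are illustrative choices, not the exact setup of the talks.

```python
import numpy as np

def jet_image(particles, n_pix=16, half_width=0.4):
    """Build a 3-channel jet image; particles are rows of
    (eta, phi, pt, charge), with eta/phi relative to the jet axis.
    Binning and size are illustrative choices."""
    edges = np.linspace(-half_width, half_width, n_pix + 1)
    p = np.asarray(particles, dtype=float)
    charged = p[p[:, 3] != 0]
    neutral = p[p[:, 3] == 0]

    def hist(rows, weights=None):
        return np.histogram2d(rows[:, 0], rows[:, 1],
                              bins=(edges, edges), weights=weights)[0]

    return np.stack([hist(charged, weights=charged[:, 2]),  # charged pT
                     hist(neutral, weights=neutral[:, 2]),  # neutral pT
                     hist(charged)])                        # charged mult.

# Hypothetical toy event with three constituents:
particles = [( 0.05, 0.05, 30.0, +1),   # charged hadron
             (-0.10, 0.00, 20.0,  0),   # photon (neutral)
             ( 0.05, 0.05, 10.0, -1)]   # charged hadron
img = jet_image(particles)
print(img.shape)     # (3, 16, 16)
print(img[0].sum())  # 40.0 -> total charged pT in the image
```

The same construction, with the pile-up channel list from the mitigation talk (neutral pT, charged-from-PV pT, charged-from-SV pT), would give the input to that network as well.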

M. Paganini - CVPR Summary
(incomplete summary here, please also see slides and conference homepage)

  • It is mostly a computer-vision conference, organized by the IEEE, with over 4000 participants
  • Selected contributions potentially relevant for HEP presented in the summary
  • Some interesting trends discussed in slide 5
  • DenseNets
    • ResNet: allows skipping any one layer in the network
    • DenseNets:
      • allow connecting any layer to any later layer (not just skipping one)
      • the number of feature maps grows at each step
      • bottleneck layers are introduced to reduce the feature size
      • Advantages (See slide 14 for complete list)
        • strong gradients
        • All learned features are used for the prediction; simple features from the first layer survive to the last layer.
    • Slide 18: Using Knowledge Graphs for classifications
      • the idea is that the ML method identifies elements in the image, but the ultimate object we want to identify is not any one of these particular elements
      • if we have a graph mapping the relation between the elements and the final class, we may be able to better identify the object we want to tag
      • Example: want to identify some heavy resonance, but our tagger will identify daughters of this object
      • Use knowledge graph to infer class from tagged elements
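The DenseNet connectivity pattern described above can be sketched with plain NumPy: each layer receives the concatenation of all earlier feature maps, so simple features from the input remain directly visible to the final classifier. Layer sizes and the growth rate are illustrative; real DenseNets use convolutions plus bottleneck layers to keep the feature growth in check.

```python
import numpy as np

# Toy sketch of DenseNet-style connectivity: every layer takes the
# concatenation of ALL previous outputs as input, and contributes
# `growth_rate` new features. All shapes here are made up.

rng = np.random.default_rng(0)
growth_rate = 4                       # new features added per layer
x = rng.normal(size=8)                # input features

features = [x]
for _ in range(3):                    # three dense layers
    inp = np.concatenate(features)    # concat of input + all outputs so far
    W = rng.normal(size=(growth_rate, inp.size))
    features.append(np.maximum(0.0, W @ inp))   # ReLU layer output

out = np.concatenate(features)        # everything reaches the classifier
print(out.size)  # 8 + 3 * 4 = 20 features, including the raw input
```

Note how `out[:8]` is still the untouched input: this is the "simple features survive to the last layer" advantage, and the direct connections also give every layer a short gradient path.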

