IML Machine Learning Working Group - summary of summer conferences

Name: IML Machine Learning Working Group - summary of summer conferences
Start: 2017-09-06T15:30:00+02:00
End: 2017-09-06T18:05:00+02:00
Location: CERN

Wednesday 6 Sept 2017, 15:30 → 18:05 Europe/Zurich

4/3-006 - TH Conference Room (CERN)


S. Gleyzer - ACAT summary
(incomplete summary here, please also see slides and conference homepage)

ACAT started off as a machine learning conference series, later became a more general data analysis conference series, and has now returned to a focus on ML.
Many applications of ML:
- end-to-end learning (go directly from detector response to event class, w/o reconstruction)
- GANs for simulation (faster than running full GEANT simulations)
- track reconstruction (CNN in trigger, RNN for tracking)
- pile-up removal: CNN to correct for pile-up in jet images
- flavour tagging
- trigger applications (lookup table for pT regression BDT, cf. last IML meeting; CatBoost)
- particle identification with XGBoost
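The lookup-table idea for trigger-level regression mentioned in the list above can be sketched as follows. This is only a minimal illustration, not the implementation shown at ACAT: `toy_regressor` is a hypothetical stand-in for a trained BDT, and the input names, ranges and bin counts are assumptions picked for readability. The point is that the model is evaluated once per bin offline, so the online step reduces to two index computations and a table read.

```python
# Hypothetical stand-in for a trained regression BDT (assumption, not a real model).
def toy_regressor(delta_phi, eta):
    """Return a pT-like estimate from two inputs."""
    return (1.0 + 0.1 * abs(eta)) / (abs(delta_phi) + 0.01)

# Assumed input ranges and bin counts.
PHI_RANGE, PHI_BINS = (0.0, 0.5), 64
ETA_RANGE, ETA_BINS = (0.0, 2.4), 16

def bin_index(value, lo_hi, n_bins):
    """Map a value to a bin index, clamping to the table edges."""
    lo, hi = lo_hi
    idx = int((value - lo) / (hi - lo) * n_bins)
    return min(max(idx, 0), n_bins - 1)

def bin_center(idx, lo_hi, n_bins):
    """Center of bin idx in the given range."""
    lo, hi = lo_hi
    return lo + (idx + 0.5) * (hi - lo) / n_bins

# Offline step: evaluate the model once at every bin center.
LUT = [
    [
        toy_regressor(
            bin_center(i, PHI_RANGE, PHI_BINS),
            bin_center(j, ETA_RANGE, ETA_BINS),
        )
        for j in range(ETA_BINS)
    ]
    for i in range(PHI_BINS)
]

def lookup_pt(delta_phi, eta):
    """Online step: two index computations and one table read."""
    i = bin_index(abs(delta_phi), PHI_RANGE, PHI_BINS)
    j = bin_index(abs(eta), ETA_RANGE, ETA_BINS)
    return LUT[i][j]
```

The table size (here 64 x 16 entries) sets the trade-off between memory and the discretization error of the regression.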
Analysis tools
- Histogrammar by DIANA: parallelize the event loop with Spark
- ROOT's TDataFrame: use a declarative language rather than a low-level event loop, and let the framework optimize evaluation, caching, …
- ROOT improvements in I/O (compression, parallelization), new TMVA features, web technology for graphics.
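The difference between a hand-written event loop and a declarative description can be illustrated with a toy example. The `LazyFrame` class below is hypothetical and is not ROOT's TDataFrame API. It only mimics the idea that filters, derived columns and results are booked first and then all filled in a single pass over the events, which leaves the framework free to decide how to run the loop.

```python
class LazyFrame:
    """Toy declarative frame: operations are booked, then run in one event loop."""

    def __init__(self, events, steps=(), bookings=None):
        self._events = events                  # list of event dicts
        self._steps = tuple(steps)             # chain of filter/define steps
        self._bookings = bookings if bookings is not None else []  # shared by all nodes

    def filter(self, predicate):
        step = ("filter", predicate)
        return LazyFrame(self._events, self._steps + (step,), self._bookings)

    def define(self, name, func):
        step = ("define", name, func)
        return LazyFrame(self._events, self._steps + (step,), self._bookings)

    def count(self):
        return self._book(lambda acc, row: acc + 1)

    def sum(self, column):
        return self._book(lambda acc, row: acc + row[column])

    def _book(self, update):
        """Register a result; nothing is computed until run() is called."""
        booking = {"steps": self._steps, "update": update, "value": 0}
        self._bookings.append(booking)
        return booking

    def run(self):
        """Single pass over the events that fills every booked result."""
        for booking in self._bookings:
            booking["value"] = 0
        for event in self._events:
            for booking in self._bookings:
                row = dict(event)
                passed = True
                for step in booking["steps"]:
                    if step[0] == "filter":
                        if not step[1](row):
                            passed = False
                            break
                    else:  # "define": add a derived column to this row
                        row[step[1]] = step[2](row)
                if passed:
                    booking["value"] = booking["update"](booking["value"], row)


# Toy events (made-up values).
events = [{"pt": 12.0, "eta": 0.3}, {"pt": 35.0, "eta": -1.1}, {"pt": 50.0, "eta": 2.9}]
frame = LazyFrame(events)
central = frame.filter(lambda r: abs(r["eta"]) < 2.5)
n_central = central.count()
sum_pt2 = central.define("pt2", lambda r: r["pt"] ** 2).sum("pt2")
frame.run()  # both results are filled in the same loop
```

This toy re-evaluates shared steps for every booked result. A real framework could cache them, which is one of the optimizations the declarative style makes possible.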
Questions
- Was there anything on reinforcement learning?
  - A. possibly in one of the posters
Comment (Sergei): In the summary I left out the GAN for simulation talk by S. Vallecorsa.

T. Golling - Hammers and nails
(incomplete summary here, please also see slides and conference homepage)

The title "Hammers and nails" refers to machine learning as a big hammer that can be used on various problems, the nails being the particle physics problems.
Minutes were kept on Google Docs: https://docs.google.com/document/d/1y7cE8qVp6xyKdlWzaBqD81_Otvf5YPhHcjlybtURKbU/edit#
with breakout sessions on white/black boards
ideas for new concepts in HEP:
- use GANs and VAEs for inference without building a likelihood in between
Adversarial examples
- image recognition algorithms can "easily" be tricked into misclassifying an image (without changing it to the human viewer)
- could be used to trick voice recognition into transferring money
- examples are "crafted"; it is debatable whether a detector would ever produce them in HEP
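How such a "crafted" input can be built is sketched below with one common construction, the fast gradient sign method, applied to a toy logistic classifier. This is not necessarily the construction discussed at the workshop. The weights, the input and `epsilon` are made-up values, and the four-feature model merely stands in for a trained network.

```python
import math

# Toy linear classifier with fixed, made-up weights (assumption: stands in
# for a trained network).
WEIGHTS = [0.8, -0.5, 0.3, 0.6]
BIAS = -0.1

def predict(x):
    """Probability of class 1 from a logistic model."""
    z = sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS
    return 1.0 / (1.0 + math.exp(-z))

def loss_gradient(x, label):
    """Gradient of the cross-entropy loss with respect to the input x."""
    p = predict(x)
    return [(p - label) * w for w in WEIGHTS]

def fgsm(x, label, epsilon):
    """Move every input feature by +/- epsilon in the direction that increases the loss."""
    grad = loss_gradient(x, label)
    return [xi + epsilon * math.copysign(1.0, g) for xi, g in zip(x, grad)]

x = [0.5, 0.2, 0.1, 0.3]                 # classified as class 1
x_adv = fgsm(x, label=1, epsilon=0.2)    # perturbed copy, classified as class 0

print(predict(x) > 0.5, predict(x_adv) > 0.5)
```

With only four features the step per feature has to be fairly large to flip the decision. In an image with many pixels, many very small per-pixel steps can add up to the same effect, which is why the change can be invisible to a human viewer.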
discussion on autonomous driving
- should an AI be taught the rules of the road, or learn them just by observation?
- (what about interaction with human drivers, and when is it okay to break the rules?)
probabilistic programming
- the goal is to learn a generator that has learned its rules
- examples of generators which can generate "normal looking trees" or "stable structures"
Questions:
- It seems there was very little on classical signal/background classification (Enrico Guiraud)
  - That was a feature of the topic selection for the summary.
- The classical examples of how to generate adversarial inputs require internal knowledge of the NN that is "attacked". Shouldn't that hinder fraud?
  - That was discussed. It appears that adversarial examples are highly portable, in the sense that an adversarial example for one network will also be adversarial for different networks for the same task.

G. Kasieczka - Boost summary
(incomplete summary here, please also see slides and conference homepage)

Boost is a workshop about advances in jet physics, e.g. heavy object tagging
See also the machine learning slides in the boost exp. summary talk: https://indico.cern.ch/event/579660/contributions/2496143/attachments/1496921/2330050/Boost17_ExpSum.pdf
Many talks about classification, but also one each on GANs and pileup
Classification contributions (see slides for details and references)
- "color" jet image for quark/gluon jet classifications
  - "Colors":
    - charged pt
    - neutral pt
    - charged particle multiplicity
  - Deep NN and colors improve significantly at high pt (~ 1 TeV)
  - The method is stable if trained on Herwig or Pythia.
- q/g tagging in ATLAS
  - Similar technique as in the previous contribution, applied to ATLAS.
  - But some dependence on the MC used for training is seen here.
- b-tagging
  - new idea: look at changes in hit multiplicity between subsequent layers of the tracking detector
- Machine Learning in CMS
  - Comprehensive talk, but one highlight selected for this summary:
  - the DeepJet architecture
    - multi-class classification using recurrent networks chained to a fully connected one
- Top-taggers
  - Use constituent 4-vectors, but keep some of the spirit of the image approach
  - Performance is similar to the image approach with calorimeter input, but better than images with particle-flow input, since there is no need to pixelate.
- Boosted top and WW with ATLAS
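The "color" jet image used in the quark/gluon contributions above can be sketched as a small multi-channel histogram. The snippet is a minimal illustration under stated assumptions: the tuple layout of the constituents, the image size and the covered range are hypothetical, and the preprocessing used in the actual contributions (centering, normalization and so on) is not reproduced.

```python
import math

N_PIXELS = 8        # assumed image size: N_PIXELS x N_PIXELS
HALF_WIDTH = 0.4    # assumed coverage: [-0.4, 0.4) in eta and phi around the jet axis

# Toy constituents: (eta, phi, pt, is_charged), relative to the jet axis.
constituents = [
    (0.05, -0.12, 40.0, True),
    (0.05, -0.12, 10.0, True),
    (-0.22, 0.17, 25.0, False),
    (0.33, 0.31, 5.0, True),
    (0.90, 0.00, 3.0, False),   # outside the image, ignored
]

def pixel(coord):
    """Pixel index for a coordinate, or None if it falls outside the image."""
    idx = math.floor((coord + HALF_WIDTH) / (2 * HALF_WIDTH) * N_PIXELS)
    return idx if 0 <= idx < N_PIXELS else None

def make_image(parts):
    """Return image[channel][eta_pixel][phi_pixel] with the three "colors":
    0 = charged pt, 1 = neutral pt, 2 = charged particle multiplicity."""
    image = [[[0.0] * N_PIXELS for _ in range(N_PIXELS)] for _ in range(3)]
    for eta, phi, pt, is_charged in parts:
        i, j = pixel(eta), pixel(phi)
        if i is None or j is None:
            continue
        if is_charged:
            image[0][i][j] += pt
            image[2][i][j] += 1
        else:
            image[1][i][j] += pt
    return image

image = make_image(constituents)
```

The need to choose `N_PIXELS` is the pixelation step that the constituent-based top taggers avoid.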
GAN for jet simulation
- See also: https://indico.cern.ch/event/595059/contributions/2497383/
- CaloGAN to simulate calorimeter response
Pile-up mitigation with Machine learning
- Color image used to estimate jet variables with the pile-up contribution removed. Input channels:
  - pt of all neutral particles
  - pt of charged particles from the primary vertex
  - pt of charged particles from secondary vertex
  - Outputs a single-channel image: pT of neutral particles from the primary vertex
Questions
- Were there news on boosted Higgs?
  - Discussions, but no major update with ML.
- Universal multi-class classification?
  - In progress, mentioned in some contributions.
- What should an ML expert with no experience in physics do?
  - Re-using a publicly available data set is a good way to start.
- Why trainable linear combinations in the deep learning top tagger?
  - Different weighted sums can give access to e.g. the mass of the top, or reconstruct the W.
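The last answer can be made concrete with a small sketch: a weighted sum of constituent 4-vectors is again a 4-vector, and its invariant mass depends on which constituents the weights select. In the tagger the weights are trained; here they are set by hand, and the constituent values are toy numbers chosen only so that the arithmetic is easy to follow.

```python
import math

def weighted_sum(four_vectors, weights):
    """Linear combination of (E, px, py, pz) four-vectors."""
    return tuple(
        sum(w * v[k] for w, v in zip(weights, four_vectors)) for k in range(4)
    )

def invariant_mass(v):
    """m = sqrt(E^2 - |p|^2), clipped at zero against rounding errors."""
    e, px, py, pz = v
    m2 = e * e - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))

# Toy massless constituents (E, px, py, pz) in GeV; illustrative values only.
constituents = [
    (40.2, 40.2, 0.0, 0.0),
    (40.2, -40.2, 0.0, 0.0),
    (50.0, 0.0, 50.0, 0.0),
]

# Weights that select the first two constituents: a W-like candidate.
pair_mass = invariant_mass(weighted_sum(constituents, [1.0, 1.0, 0.0]))

# Weights that select all constituents: the mass of the full system.
total_mass = invariant_mass(weighted_sum(constituents, [1.0, 1.0, 1.0]))
```

Here `pair_mass` comes out at 80.4 GeV by construction, so a network that learns weights close to (1, 1, 0) would effectively have reconstructed the W candidate.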

M. Paganini - CVPR Summary
(incomplete summary here, please also see slides and conference homepage)

CVPR is mostly a computer vision conference, organized by IEEE, with over 4000 participants.
The summary presents selected contributions potentially relevant for HEP.
Some interesting trends are discussed on slide 5.
DenseNets
- ResNet: allows any one layer in the network to be skipped
- DenseNets:
  - allow any layer to be connected with any other layer (not just skipping one)
  - size of the layers grows at each step
  - bottleneck introduced to reduce feature sizes
  - Advantages (See slide 14 for complete list)
    - strong gradients
    - All features learned are used for prediction, simple features in the first layer survive to the last layer.
Using knowledge graphs for classification (slide 18)
- The ML method identifies elements in the image, but the object we ultimately want to identify is not any one of these elements. If we have a graph that maps the relations between the elements and the final class, we may be able to tag that object better.
- Example: we want to identify some heavy resonance, but our tagger identifies the daughters of this object.
- Use the knowledge graph to infer the class from the tagged elements.
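The idea of inferring a class from tagged elements can be illustrated with a deliberately simple lookup. This sketch does not reproduce the method presented at CVPR. The graph, the class names, the tagger scores and the product scoring rule are all hypothetical and only show how a relation between elements and final classes could be used.

```python
# Hypothetical knowledge graph: each final class lists the elements it is made of.
KNOWLEDGE_GRAPH = {
    "top quark": ["b-jet", "W boson"],
    "Higgs boson": ["b-jet", "b-jet"],
    "QCD jet": ["light jet"],
}

def class_score(required, detections):
    """Product of the best tagger scores matching each required element.

    Returns 0.0 if a required element has no (unused) detection.
    """
    remaining = sorted(detections, key=lambda d: d[1], reverse=True)
    score = 1.0
    for element in required:
        match = next((d for d in remaining if d[0] == element), None)
        if match is None:
            return 0.0
        remaining.remove(match)   # each detection can be used only once
        score *= match[1]
    return score

def infer_class(detections, graph=KNOWLEDGE_GRAPH):
    """Return the best-scoring class and the scores of all classes."""
    scores = {name: class_score(req, detections) for name, req in graph.items()}
    return max(scores, key=scores.get), scores

# Made-up output of an element tagger: (label, score) per tagged element.
detections = [("b-jet", 0.9), ("W boson", 0.8), ("light jet", 0.3)]
best, scores = infer_class(detections)
```

With these toy inputs the tagger never outputs "top quark" directly, yet the graph picks it because both of its daughters were tagged with high scores.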


- 15:30 → 15:40
  
  News and group updates 10m
  
  Speakers: Lorenzo Moneta (CERN), Michele Floris (CERN), Paul Seyfert (CERN), Dr Sergei Gleyzer (University of Florida (US)), Steven Randolph Schramm (Universite de Geneve (CH))
  
  IML_news_September2017.pdf
- 15:40 → 16:00
  
  ACAT summary (HEP-ML perspective) 20m
  
  Speaker: Sergei Gleyzer (University of Florida (US))
  
  ACAT_Summary_IML_Sergei.pdf
  
  Conference link
- 16:00 → 16:20
  
  Hammers and Nails summary (HEP-ML perspective) 20m
  
  Speaker: Tobias Golling (Universite de Geneve (CH))
  
  Conference link
  
  IML_HammersAndNails_Sep062017.pdf
- 16:20 → 16:40
  
  BOOST summary (HEP-ML perspective) 20m
  
  Speaker: Gregor Kasieczka (Eidgenoessische Technische Hochschule Zuerich (CH))
  
  BOOST_Summary_IML.pdf
  
  Conference link
- 16:40 → 17:00
  
  CVPR summary (HEP-ML perspective) 20m
  
  Speaker: Michela Paganini (Yale University (US))
  
  Conference link
  
  summerconf_summary.pdf
