**Sergei, News**

- Next meeting (January): multi-class and multi-objective classification/regression, catch-all, workshop reports
- Contrary to what we announced in November, the topic is not going to be tagging. We plan a dedicated workshop instead (see below), due to the large number of proposed contributions

- Community White Paper (CWP) of the HEP Software Foundation to be produced sometime next summer; it includes machine learning. IML should contribute to it.
- IML Workshop planned for March 20-22 (see slides for details)
- Tagging workshop (with hands-on) session
- CWP discussion
- Tutorials

**Enrico Guiraud, Generative models and the EM algorithm**

- Introduction to generative models
- Main ingredients
- Latent variables (not observed; e.g. if the model is a mixture of 3 Gaussians, the hidden assignment of each data point to one of the Gaussians is a latent variable)
- Observed variables
- Model parameters

- Mathematically: the conditional probability of the observed data given the hidden states (latent variables) and the parameters
- Can be treated as a likelihood maximization problem
- Directly maximizing the log-likelihood is very difficult or impossible due to the large sum over hidden states
- EM algorithm: works around the problem by maximizing the "free energy", obtained by introducing variational distributions q(s)
- Details on the choice of q(s) are discussed in the slides

- An example is shown with the "noisy-OR" model
- A more complex example is applied to the MNIST data set
- Recovers digit-like latent variables
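As a concrete toy illustration (not from the talk) of the E/M alternation described above, here is EM for a 1-D mixture of two Gaussians, where q(s) is the posterior responsibility of each component:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy data: mixture of two 1-D Gaussians; the (unobserved) component
# label of each point is the latent variable s.
x = np.concatenate([rng.normal(-2.0, 1.0, 200), rng.normal(3.0, 1.0, 200)])

# Model parameters: means, widths, mixture weights (deliberately off).
mu = np.array([-1.0, 1.0])
sigma = np.array([1.0, 1.0])
w = np.array([0.5, 0.5])

for _ in range(50):
    # E-step: q(s) = posterior responsibility of each component for each point
    pdf = w * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    q = pdf / pdf.sum(axis=1, keepdims=True)
    # M-step: re-estimate the parameters from the responsibilities
    n = q.sum(axis=0)
    mu = (q * x[:, None]).sum(axis=0) / n
    sigma = np.sqrt((q * (x[:, None] - mu) ** 2).sum(axis=0) / n)
    w = n / len(x)

print(np.sort(mu))  # close to the true means (-2, 3)
```

Each iteration increases the free energy mentioned above; the alternation stops changing the parameters once q(s) matches the exact posterior.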

- Difference with respect to other approaches: the model is explicit (one has to write the model in terms of hidden variables)
- Questions
- Kyle: latent variables in HEP are the Monte Carlo truth, but much more complex: millions of hidden variables in a Geant simulation
- Possibility to merge the two kinds of generative models: a non-explicit one for a simplified model of the detector, plus EM minimization to get access to the hidden variables
- This does not replace a full simulation, but it could provide insight on a few carefully chosen hidden variables

- Sergei: can think about using this for a fast simulation.

**Gilles Louppe, Learning to Pivot with adversarial networks**

- How to use a generative model to constrain a classifier
- One of the typical problems in physics: how to incorporate/treat systematic uncertainties coming from model uncertainties
- Goal: find a classifier which is not sensitive to systematic variations of nuisance parameters
- Slide 4: it means finding a classifier f which is a "pivotal quantity"
- 2 networks:
- one is the classifier
- one is the adversary, which produces a posterior for the nuisance parameters based on the output of the classifier
- If the adversary can produce a meaningful posterior of z, it means the classifier depends on the nuisance parameter
- The goal is therefore to make the adversary perform as poorly as possible
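In symbols (a sketch following the standard adversarial-pivot formulation; notation not taken verbatim from the slides): f is pivotal when its output distribution does not depend on the nuisance z, and the two networks are trained on a minimax objective:

```latex
% f is pivotal if its output distribution does not depend on z
p\bigl(f(X) = s \mid z\bigr) = p\bigl(f(X) = s\bigr) \qquad \forall z
% joint training: minimize over the classifier f, maximize over the adversary r
\hat\theta_f, \hat\theta_r
  = \arg\min_{\theta_f} \max_{\theta_r}
    \Bigl[ \mathcal{L}_f(\theta_f) - \lambda\, \mathcal{L}_r(\theta_f, \theta_r) \Bigr]
```

Maximizing over the adversary parameters θ_r drives its loss down, so a well-trained adversary penalizes any classifier whose output still carries information about z.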

- Details on the architecture shown in the slides
- Strategy: minimize a combined loss, built as the loss function of the classifier minus the loss function of the adversary
- (Proof on the mini-max optimization and minimization algorithm shown in the slides)
- The weight of the adversary term is controlled by a parameter λ, which sets the trade-off between accuracy and robustness
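The combined loss and the role of λ can be sketched in a few lines (a toy illustration with hypothetical binary cross-entropy losses, not the implementation from the talk):

```python
import numpy as np

def bce(y_true, y_pred, eps=1e-7):
    """Binary cross-entropy, averaged over samples."""
    p = np.clip(y_pred, eps, 1.0 - eps)
    return -np.mean(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p))

def pivot_objective(y, y_hat, z, z_hat, lam):
    """Classifier loss minus lambda times the adversary loss.

    The classifier minimizes this quantity; the adversary separately
    minimizes its own loss bce(z, z_hat), which enters here with a
    minus sign.  lam sets the accuracy/robustness trade-off.
    """
    return bce(y, y_hat) - lam * bce(z, z_hat)
```

With lam = 0 this reduces to plain classification; a large lam forces the classifier output to carry as little information about the nuisance z as possible.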

- Toy example discussed in the slides: two classes (Gaussians) whose exact relative position is not known
- Shows that the method works and that the robustness comes at the price of poorer classification performance

- HEP-inspired example also shown (W and QCD jets discrimination)
- Nuisance: pileup (extreme cases of 0 or 50 pileup events considered)

- Slide 17: optimize λ with respect to some other objective, e.g. the final median statistical significance of the signal
- Questions
- Sergei (slide 18): what are the bands? The experiment repeated multiple times.
- Interesting to observe that the threshold changes
- Yes, but this is effectively a different classifier.

- Sergei: what do the curves look like for 10 < λ < 500, same general shape?
- Answer: looked at it; not shown, but generally yes

- Sergei: did you try with multiple nuisance parameters? Not yet, but the extension should be trivial
- Tatiana: What's the proportion of events with 0 and 50 pile-up events? 50-50
- It seems the result does not depend on whether the training is done with the z=1 or z=0 class
- Question deals with robustness of the result to various levels of pile-up

- Tatiana: did you compare with the approach of Mike Williams (uBoost, https://arxiv.org/abs/1305.7248)? No
- one difference: this method was developed for non-observable parameters.
- Subsequent discussion focused on whether the choice of pile-up as a nuisance parameter was justified.

**Mike Williams, Event generator tuning using Bayesian Optimization**

- Introduction on what Bayesian optimization is
- Investigate if it can be used for MC tuning -> generate posterior distributions for Pythia parameters based on data
- This method can automatically assign uncertainties to the parameters
- Use the "Monash" tune as true data, and see if the Monash parameters can be recovered via Bayesian optimization
- Slide 3: if a parameter is not really constrained by a particular observation it correctly gets a huge error bar
- Convergence is relatively fast (on the order of 50 queries × nParameters)
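For illustration, a minimal sketch of 1-D Bayesian optimization: a Gaussian-process surrogate with an RBF kernel and an upper-confidence-bound acquisition, applied to a hypothetical black-box objective standing in for a simulation-vs-data figure of merit (the talk used the spearmint tool, not this code; kernel, length scale, and acquisition here are arbitrary assumptions):

```python
import numpy as np

def f(x):
    # Hypothetical expensive black box; true optimum at x = 0.7.
    return -(x - 0.7) ** 2

def rbf(a, b, ls=0.2):
    # Squared-exponential kernel with unit prior variance.
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, 3)          # initial random queries
y = f(X)
grid = np.linspace(0.0, 1.0, 200)     # candidate points for the acquisition

for _ in range(20):
    K = rbf(X, X) + 1e-6 * np.eye(len(X))   # jitter for numerical stability
    Ks = rbf(grid, X)
    mu = Ks @ np.linalg.solve(K, y)         # GP posterior mean on the grid
    sol = np.linalg.solve(K, Ks.T)
    var = 1.0 - np.einsum('ij,ji->i', Ks, sol)
    sigma = np.sqrt(np.maximum(var, 1e-12)) # GP posterior std. deviation
    ucb = mu + 2.0 * sigma                  # upper-confidence-bound acquisition
    x_next = grid[np.argmax(ucb)]           # query where the model is most promising
    X = np.append(X, x_next)
    y = np.append(y, f(x_next))

best = X[np.argmax(y)]
print(best)  # near the true optimum 0.7
```

The loop spends its first queries exploring (large posterior uncertainty dominates the acquisition) and then concentrates near the optimum, which is why the query count scales roughly with the number of parameters rather than with a dense grid.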

- Global fit of all 20 parameters is less precise than "block" training, but still very satisfactory
- Potential improvements discussed on the slides: expert knowledge, knowledge transfer, pre-simulation of small samples, treatment of discrete parameters, extension to larger parameter spaces
- Questions
- Sergei: Bayesian optimization of ML hyperparameters, has anybody tried that?
- Baldi et al.
- Some other people did it; for instance there is an entry on Tim Head's blog

- Sergei: does the tool (spearmint) work in parallel? Yes, but there are a few choices on how to parallelize (e.g. Pythia -> trivial parallelization). The tool allows for parallel optimization with different data sets, useful if you have distributed resources.
- Sergei: Next step? Will you try this with real data?
- Yes, it will be tested on real data. It was tested on ee data: it finds a better chi2 than the default tune, but some parameters seem unphysical

**Jonah Bernhard, Applying Bayesian parameter estimation to relativistic heavy-ion collisions**

- Postponed due to technical problems with Vidyo
