Reading materials for DS@LHC 2015
This document gathers a non-exhaustive list of reading materials aimed (i) at physicists with little prior exposure to machine learning and (ii) at machine learning experts interested in applying their methods to high energy physics problems.
Please contact us if you wish to add links to this list.
Machine learning for physicists
Supervised learning:
- James, Gareth, et al. An introduction to statistical learning. New York: Springer, 2013. http://www-bcf.usc.edu/~gareth/ISL/ISLR%20Fourth%20Printing.pdf
- Friedman, Jerome, Trevor Hastie, and Robert Tibshirani. The elements of statistical learning. Vol. 1. Springer series in statistics. Berlin: Springer, 2001. http://statweb.stanford.edu/~tibs/ElemStatLearn/printings/ESLII_print10.pdf
Deep learning:
- Schmidhuber, Jürgen. "Deep learning in neural networks: An overview." Neural Networks 61 (2015): 85-117. http://arxiv.org/abs/1404.7828
- LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. "Deep learning." Nature 521.7553 (2015): 436-444. http://www.nature.com/nature/journal/v521/n7553/abs/nature14539.html
Gaussian Processes:
- Rasmussen, Carl Edward, and Christopher K. I. Williams. Gaussian processes for machine learning. MIT Press, 2006. http://www.gaussianprocess.org/gpml/
- Lawrence, Neil D., and Raquel Urtasun. "Gaussian Processes." Tutorial at CVPR 2012. http://www.cs.toronto.edu/~urtasun/tutorials/gp_cvpr12_session1.pdf
Statistical inference:
- Gelman, Andrew, et al. Bayesian data analysis. Vol. 2. London: Chapman & Hall/CRC, 2014.
- Beaumont, Mark A., Wenyang Zhang, and David J. Balding. "Approximate Bayesian computation in population genetics." Genetics 162.4 (2002): 2025-2035. http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1462356/pdf/12524368.pdf
- Approximate Bayesian Computation, https://en.wikipedia.org/wiki/Approximate_Bayesian_computation
- Cranmer, Kyle. "Approximating Likelihood Ratios with Calibrated Discriminative Classifiers." arXiv preprint arXiv:1506.02169 (2015). http://arxiv.org/abs/1506.02169
High energy physics for data scientists
- Gligorov, Vladimir V. "Real-time data analysis at the LHC: present and future." Proceedings of the HEPML 2014 workshop, JMLR W&CP 42 (2015). http://jmlr.org/proceedings/papers/v42/glig14.pdf
- Volobouev, Igor. "Matrix Element Method in HEP: Transfer Functions, Efficiencies, and Likelihood Normalization." arXiv preprint arXiv:1101.2259 (2011). http://arxiv.org/abs/1101.2259
Examples of machine learning in high energy physics:
- Roe, Byron P., et al. "Boosted decision trees as an alternative to artificial neural networks for particle identification." Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment 543.2 (2005): 577-584. http://arxiv.org/abs/physics/0408124
- Yang, Hai-Jun, Byron P. Roe, and Ji Zhu. "Studies of boosted decision trees for MiniBooNE particle identification." Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment 555.1 (2005): 370-385. http://arxiv.org/abs/physics/0508045
- Proceedings of the HEPML 2014 workshop. http://jmlr.org/proceedings/papers/v42/
- Gligorov, Vladimir Vava, and Mike Williams. "Efficient, reliable and fast high-level triggering using a bonsai boosted decision tree." Journal of Instrumentation 8.02 (2013): P02013. http://arxiv.org/abs/1210.6861
- Benbouzid, Djalel, Róbert Busa-Fekete, and Balázs Kégl. "Fast classification using sparse decision DAGs." arXiv preprint arXiv:1206.6387 (2012). http://arxiv.org/abs/1206.6387
- Rogozhnikov, Alex, et al. "New approaches for boosting to uniformity." Journal of Instrumentation 10.03 (2015): T03002. http://arxiv.org/abs/1410.4140
- Likhomanenko, Tatiana, et al. "LHCb Topological Trigger Reoptimization." arXiv preprint arXiv:1510.00572 (2015). http://arxiv.org/abs/1510.00572
Software
General purpose libraries for ML:
- TMVA: Toolkit for Multivariate Analysis with ROOT, http://root.cern.ch
  - User guide, http://tmva.sourceforge.net/docu/TMVAUsersGuide.pdf
  - RMVA, PyMVA, ROOT R, http://oproject.org/tiki-index.php?page=Projects
- Scikit-Learn: A machine learning library for Python (see the short usage sketch after this list), http://scikit-learn.org/dev/
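To give a flavour of the scikit-learn fit/predict interface, here is a minimal sketch that trains a gradient-boosted decision tree classifier (in the spirit of the BDT references above) on a synthetic signal-versus-background dataset; the dataset, split and hyperparameter values are arbitrary placeholders, not a recommendation.

    # Minimal scikit-learn sketch: a boosted-decision-tree-style classifier
    # on a synthetic "signal vs background" dataset.  All numbers below are
    # placeholder choices for illustration only.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier

    # Synthetic stand-in for events described by 10 features.
    X, y = make_classification(n_samples=1000, n_features=10, random_state=0)

    # Simple hold-out split (done by hand to stay version-agnostic).
    X_train, X_test, y_train, y_test = X[:800], X[800:], y[:800], y[800:]

    clf = GradientBoostingClassifier(n_estimators=100, max_depth=3)
    clf.fit(X_train, y_train)

    # Per-event "signal" probability and accuracy on the held-out events.
    signal_proba = clf.predict_proba(X_test)[:, 1]
    print("held-out accuracy:", clf.score(X_test, y_test))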
Deep Learning:
- Theano: Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently (see the short sketch after this list), http://www.deeplearning.net/software/theano/
- Keras: Theano-based deep learning library for Python, http://keras.io/
- Chainer: A flexible framework for neural networks in Python, http://chainer.org/
- Theanets: a deep learning and neural network toolkit for Python, https://github.com/lmjohns3/theanets
- DIGITS: the NVIDIA Deep Learning GPU Training System, https://developer.nvidia.com/digits
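To make Theano's "define, optimize, and evaluate expressions" description concrete, here is a minimal sketch that builds a symbolic expression over a vector, asks Theano for its gradient, and compiles both into a callable function (variable names are arbitrary):

    # Minimal Theano sketch: define a symbolic expression, derive its
    # gradient symbolically, and compile both into an executable function.
    import numpy as np
    import theano
    import theano.tensor as T

    x = T.dvector('x')      # symbolic double-precision vector
    y = T.sum(x ** 2)       # symbolic scalar y = sum_i x_i^2
    gy = T.grad(y, x)       # symbolic gradient dy/dx = 2 * x

    f = theano.function(inputs=[x], outputs=[y, gy])

    value, gradient = f(np.array([1.0, 2.0, 3.0]))
    print(value)      # 14.0
    print(gradient)   # [ 2.  4.  6.]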
Gaussian processes:
- George: Fast Gaussian Processes for Regression in Python, http://dan.iel.fm/george
- GPy: Gaussian Processes framework for Python (see the short sketch after this list), https://github.com/SheffieldML/GPy
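As a small illustration of Gaussian process regression with GPy, the sketch below fits a GP with an RBF kernel to noisy one-dimensional toy data and predicts at new inputs; the data and kernel settings are placeholders, and George offers a broadly similar workflow.

    # Minimal GPy sketch: Gaussian process regression on noisy 1D toy data.
    # Data and kernel settings are placeholders for illustration only.
    import numpy as np
    import GPy

    # Toy training data: noisy samples of a sine curve (GPy expects 2D arrays).
    X = np.random.uniform(0.0, 10.0, (50, 1))
    Y = np.sin(X) + 0.1 * np.random.randn(50, 1)

    # Squared-exponential (RBF) kernel on a one-dimensional input space.
    kernel = GPy.kern.RBF(input_dim=1, variance=1.0, lengthscale=1.0)

    model = GPy.models.GPRegression(X, Y, kernel)
    model.optimize()  # maximize the marginal likelihood over hyperparameters

    # Predictive mean and variance at new input locations.
    X_new = np.linspace(0.0, 10.0, 100)[:, None]
    mean, variance = model.predict(X_new)
    print(mean.shape, variance.shape)   # (100, 1) (100, 1)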