Reading materials for DS@LHC 2015
This document gathers a non-exhaustive list of reading materials aimed (i) at physicists with little prior exposure to machine learning and (ii) at machine learning experts interested in applying their methods to high energy physics problems.
Please contact us if you wish to add links to this list.
Machine learning for physicists
Supervised learning:
- James, Gareth, et al. An introduction to statistical learning. New York: Springer, 2013. http://www-bcf.usc.edu/~gareth/ISL/ISLR%20Fourth%20Printing.pdf
- Friedman, Jerome, Trevor Hastie, and Robert Tibshirani. The elements of statistical learning. Vol. 1. Springer series in statistics. Berlin: Springer, 2001. http://statweb.stanford.edu/~tibs/ElemStatLearn/printings/ESLII_print10.pdf
Deep learning:
- Schmidhuber, Jürgen. "Deep learning in neural networks: An overview." Neural Networks 61 (2015): 85-117. http://arxiv.org/abs/1404.7828
- LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. "Deep learning." Nature 521.7553 (2015): 436-444. http://www.nature.com/nature/journal/v521/n7553/abs/nature14539.html
Gaussian Processes:
- Rasmussen, Carl Edward, and Christopher K. I. Williams. Gaussian processes for machine learning. MIT Press, 2006. http://www.gaussianprocess.org/gpml/
- Lawrence, Neil D., and Raquel Urtasun. "Gaussian Processes." Tutorial at CVPR 2012. http://www.cs.toronto.edu/~urtasun/tutorials/gp_cvpr12_session1.pdf
Statistical inference:
- Gelman, Andrew, et al. Bayesian data analysis. Vol. 2. London: Chapman & Hall/CRC, 2014.
- Beaumont, Mark A., Wenyang Zhang, and David J. Balding. "Approximate Bayesian computation in population genetics." Genetics 162.4 (2002): 2025-2035. http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1462356/pdf/12524368.pdf
- Approximate Bayesian Computation, https://en.wikipedia.org/wiki/Approximate_Bayesian_computation
- Cranmer, Kyle. "Approximating Likelihood Ratios with Calibrated Discriminative Classifiers." arXiv preprint arXiv:1506.02169 (2015). http://arxiv.org/abs/1506.02169
High energy physics for data scientists
- Gligorov, Vladimir V. "Real-time data analysis at the LHC: present and future." Proceedings of the HEPML 2014 workshop, JMLR W&CP 42 (2015). http://jmlr.org/proceedings/papers/v42/glig14.pdf
- Volobouev, Igor. "Matrix Element Method in HEP: Transfer Functions, Efficiencies, and Likelihood Normalization." arXiv preprint arXiv:1101.2259 (2011). http://arxiv.org/abs/1101.2259
Examples of machine learning in high energy physics:
- Roe, Byron P., et al. "Boosted decision trees as an alternative to artificial neural networks for particle identification." Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment 543.2 (2005): 577-584. http://arxiv.org/abs/physics/0408124
- Yang, Hai-Jun, Byron P. Roe, and Ji Zhu. "Studies of boosted decision trees for MiniBooNE particle identification." Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment 555.1 (2005): 370-385. http://arxiv.org/abs/physics/0508045
- Proceedings of the HEPML 2014 workshop. http://jmlr.org/proceedings/papers/v42/
- Gligorov, Vladimir Vava, and Mike Williams. "Efficient, reliable and fast high-level triggering using a bonsai boosted decision tree." Journal of Instrumentation 8.02 (2013): P02013. http://arxiv.org/abs/1210.6861
- Benbouzid, Djalel, Róbert Busa-Fekete, and Balázs Kégl. "Fast classification using sparse decision DAGs." arXiv preprint arXiv:1206.6387 (2012). http://arxiv.org/abs/1206.6387
- Rogozhnikov, Alex, et al. "New approaches for boosting to uniformity." Journal of Instrumentation 10.03 (2015): T03002. http://arxiv.org/abs/1410.4140
- Likhomanenko, Tatiana, et al. "LHCb Topological Trigger Reoptimization." arXiv preprint arXiv:1510.00572 (2015). http://arxiv.org/abs/1510.00572
Software
General purpose libraries for ML:
- TMVA: Toolkit for Multivariate Analysis with ROOT, http://root.cern.ch
  - User guide, http://tmva.sourceforge.net/docu/TMVAUsersGuide.pdf
  - RMVA, PyMVA, ROOT R, http://oproject.org/tiki-index.php?page=Projects
- Scikit-Learn: A machine learning library for Python (see the short usage sketch after this list), http://scikit-learn.org/dev/
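To give a flavour of the scikit-learn fit/predict interface, here is a minimal sketch that trains a gradient-boosted decision tree classifier (in the spirit of the BDT references above) on a synthetic signal-versus-background dataset; the dataset, split and hyperparameter values are arbitrary placeholders, not a recommendation.

    # Minimal scikit-learn sketch: a boosted-decision-tree-style classifier
    # on a synthetic "signal vs background" dataset.  All numbers below are
    # placeholder choices for illustration only.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier

    # Synthetic stand-in for events described by 10 features.
    X, y = make_classification(n_samples=1000, n_features=10, random_state=0)

    # Simple hold-out split (done by hand to stay version-agnostic).
    X_train, X_test, y_train, y_test = X[:800], X[800:], y[:800], y[800:]

    clf = GradientBoostingClassifier(n_estimators=100, max_depth=3)
    clf.fit(X_train, y_train)

    # Per-event "signal" probability and accuracy on the held-out events.
    signal_proba = clf.predict_proba(X_test)[:, 1]
    print("held-out accuracy:", clf.score(X_test, y_test))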
Deep Learning:
- Theano: Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently (see the short sketch after this list), http://www.deeplearning.net/software/theano/
- Keras: Theano-based deep learning library for Python, http://keras.io/
- Chainer: A flexible framework for neural networks in Python, http://chainer.org/
- Theanets: a deep learning and neural network toolkit for Python, https://github.com/lmjohns3/theanets
- DIGITS: the NVIDIA Deep Learning GPU Training System, https://developer.nvidia.com/digits
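To make Theano's "define, optimize, and evaluate expressions" description concrete, here is a minimal sketch that builds a symbolic expression over a vector, asks Theano for its gradient, and compiles both into a callable function (variable names are arbitrary):

    # Minimal Theano sketch: define a symbolic expression, derive its
    # gradient symbolically, and compile both into an executable function.
    import numpy as np
    import theano
    import theano.tensor as T

    x = T.dvector('x')      # symbolic double-precision vector
    y = T.sum(x ** 2)       # symbolic scalar y = sum_i x_i^2
    gy = T.grad(y, x)       # symbolic gradient dy/dx = 2 * x

    f = theano.function(inputs=[x], outputs=[y, gy])

    value, gradient = f(np.array([1.0, 2.0, 3.0]))
    print(value)      # 14.0
    print(gradient)   # [ 2.  4.  6.]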
Gaussian processes:
- George: Fast Gaussian Processes for Regression in Python, http://dan.iel.fm/george
- GPy: Gaussian Processes framework for Python (see the short sketch after this list), https://github.com/SheffieldML/GPy
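As a small illustration of Gaussian process regression with GPy, the sketch below fits a GP with an RBF kernel to noisy one-dimensional toy data and predicts at new inputs; the data and kernel settings are placeholders, and George offers a broadly similar workflow.

    # Minimal GPy sketch: Gaussian process regression on noisy 1D toy data.
    # Data and kernel settings are placeholders for illustration only.
    import numpy as np
    import GPy

    # Toy training data: noisy samples of a sine curve (GPy expects 2D arrays).
    X = np.random.uniform(0.0, 10.0, (50, 1))
    Y = np.sin(X) + 0.1 * np.random.randn(50, 1)

    # Squared-exponential (RBF) kernel on a one-dimensional input space.
    kernel = GPy.kern.RBF(input_dim=1, variance=1.0, lengthscale=1.0)

    model = GPy.models.GPRegression(X, Y, kernel)
    model.optimize()  # maximize the marginal likelihood over hyperparameters

    # Predictive mean and variance at new input locations.
    X_new = np.linspace(0.0, 10.0, 100)[:, None]
    mean, variance = model.predict(X_new)
    print(mean.shape, variance.shape)   # (100, 1) (100, 1)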