IML Meeting, Software
https://indico.cern.ch/event/565647/
Peak number of people in the room: 28
Peak number of people on Vidyo: 42

Lorenzo, News
* Lots of new features in TMVA will go into the next ROOT release (6.08); some details are presented in the subsequent talks
* New series of TMVA meetings announced, see the first one here: https://indico.cern.ch/event/569308/

Simon Pfreundschuh, Deep Learning on GPUs with TMVA
* Deep learning needs massively parallel processing (GPU)
* Goal of the project: a performant and easy-to-use implementation for the HEP community
* Quick introduction to feed-forward neural networks and backpropagation
* Core of the implementation: express the backpropagation algorithm as matrix operations (see the sketch after this section)
* The implementation has two layers:
  * Low level: takes care of the architecture (CUDA/CPU/OpenCL)
  * High level: abstracts backpropagation as matrix operations
* Tested against the TMatrix implementation
* Performance benchmarked in terms of the total number of floating-point operations
* Performance gain depends on the topology of the network
* Question (Gilles): Batches are very big (slide 23); normally smaller batches are used (converges faster). Do you have the performance trend plots for smaller batches? No, will be done.
* Most of the time is still spent on matrix algebra, also in the case of the CUDA implementation
* OpenCL implementation not yet ready for production
* Implementation tested against a shallow NN and a BDT with the Higgs benchmark dataset and compared to Theano
* Questions
  * Is the low-level implementation limited to rank-2 objects? Higher-rank tensors would be needed for future developments (e.g. convolutions). No plans yet; will be considered, there may be applications.
  * Lorenzo: there could be further use for tensors as well.
  * Room: Did you try deeper networks? No, the purpose was to test the implementation, not a systematic study.
  * Which Theano backend is used in slide 27? CUDA. It is faster than Theano, probably because of the Python overhead (batches are copied in Python).
  * Is that Theano on GPU or CPU? GPU.
  * Higgs dataset: what was the share of signal and background events? About 50% signal, 50% background.
  * Suggestion to also test on the newly installed Xeon Phi Openlab machine with a huge number of cores, to investigate the scaling behaviour of the CPU parallelization
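To illustrate the point above about expressing backpropagation as matrix operations (so that the same algorithm can be dispatched to a CPU, CUDA or OpenCL backend), here is a minimal NumPy sketch for a single hidden layer. The layer sizes, activation and loss are illustrative and not taken from the talk, and biases are omitted for brevity.

```python
import numpy as np

# Minimal sketch: one hidden layer, sigmoid activations, squared-error loss.
# A batch of n_batch events with n_in features is processed as one matrix,
# so both the forward and the backward pass reduce to dense matrix products,
# i.e. exactly the operations a CPU/GPU backend can parallelize.

rng = np.random.default_rng(0)
n_batch, n_in, n_hidden, n_out = 256, 20, 50, 1

X = rng.normal(size=(n_batch, n_in))            # input batch
y = rng.integers(0, 2, size=(n_batch, n_out))   # targets (illustrative)

W1 = rng.normal(scale=0.1, size=(n_in, n_hidden))
W2 = rng.normal(scale=0.1, size=(n_hidden, n_out))

sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))

# Forward pass: two matrix multiplications plus element-wise activations
Z1 = X @ W1            # (n_batch, n_hidden)
A1 = sigmoid(Z1)
Z2 = A1 @ W2           # (n_batch, n_out)
A2 = sigmoid(Z2)

# Backward pass: the gradients are again matrix products
dZ2 = (A2 - y) * A2 * (1 - A2)        # dL/dZ2 for squared-error loss + sigmoid output
dW2 = A1.T @ dZ2                      # gradient of the output-layer weights
dZ1 = (dZ2 @ W2.T) * A1 * (1 - A1)    # error propagated through the hidden layer
dW1 = X.T @ dZ1                       # gradient of the hidden-layer weights

# Plain gradient-descent update, averaged over the batch
lr = 0.1
W1 -= lr * dW1 / n_batch
W2 -= lr * dW2 / n_batch
```

Because every step is a dense matrix product over the whole batch, the high-level code stays identical while the low-level layer swaps in the architecture-specific kernels.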
Attila Bagoly, Making TMVA interactive in Jupyter Notebooks
* Short introduction on the advantages of Jupyter notebooks
* Goal: an interactive TMVA module for notebooks, called jsMVA. It implements:
  * TMVAGui visualization
  * Model visualization (interactive) for NN and decision trees
  * Interactive training mode: training runs on an independent thread, plots in the notebook are updated dynamically
  * New Python interface, cleaner handling of string options
  * New deep neural network designer (GUI to create a NN)
  * TMVA output neatly formatted as HTML
* Future plans: see slides
* A short demo is shown
* Questions
  * Is it in the master? Yes, and it will also go into the next release. It already supports the new DNN implementation.

Stefan Wunsch, Interface between TMVA and Keras
* Short introduction on Keras and its backends
* Why a Keras interface?
  * Huge community ⇒ stable code
  * Difficult to stay up to date when reimplementing methods
* Workflow (a minimal sketch follows this section):
  * Define the method in Python (very simple) and save it to a file
  * The model is loaded back into TMVA
  * Acceleration works automatically (after exporting the relevant environment variables)
* Supported features: see slide 6. See backup slides 8/9/10 for the models/optimizers/activation functions supported by Keras
* Not yet in the master, but almost ready for merging
* Questions
  * Christian: Is elastic net regularization (a combination of L1 and L2 penalties, i.e. lasso plus ridge) implemented? You can do it in Keras for any L1 or L2 parameter; yes, it's all there.
  * How is the training done? What does the C++ side do? It is just a wrapper; all the processing happens in Keras.
  * Michael: Can you save the training weights after optimization in HDF5 format or XML? Yes, this is a Keras feature; the weights are saved in a separate file. By default TMVA stores them in an HDF5 file after running.
  * Gilles: Can you have your own custom activation functions? No, you are bound to the ones available in Keras, but not bound to the weight initializations.
  * Where does the cost function come from, TMVA? No, it is defined in Keras (see slide 4) and passed on to Theano or TensorFlow.
  * How does it work for regression? The same, but you need to know what you are doing (linear activation on the output, other settings).
  * Michele: When will it go into the ROOT master? Not decided yet, it needs testing; probably not in the next release.
  * Is there a performance penalty compared to standard Keras? Probably not.
  * Can I load a model trained outside of TMVA? Yes.
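As referenced in the workflow bullets above, here is a hedged sketch of how the interface could be used from PyROOT: define and save the Keras model, then book it in TMVA so that the model file is loaded back internally. Since the code was not yet merged at the time of the meeting, the method type and option strings (kPyKeras, FilenameModel, NumEpochs, BatchSize) and the PyMethodBase initialization call are assumptions and may differ in the released version.

```python
# Hedged sketch of the described workflow: define the model in Keras,
# save it to a file, and book it in TMVA, which loads the file back.
from keras.models import Sequential
from keras.layers import Dense

from ROOT import TMVA, TFile

# 1) Define the model in Python (kept deliberately simple) and save it
model = Sequential()
model.add(Dense(64, activation='relu', input_dim=4))
model.add(Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])
model.save('model.h5')

# 2) Book the method in TMVA; the saved model is loaded back internally
TMVA.Tools.Instance()
TMVA.PyMethodBase.PyInitialize()

output = TFile.Open('TMVA_keras.root', 'RECREATE')
factory = TMVA.Factory('TMVAClassification', output,
                       '!V:AnalysisType=Classification')
dataloader = TMVA.DataLoader('dataset')
# ... AddVariable / AddSignalTree / AddBackgroundTree /
#     PrepareTrainingAndTestTree calls for the input data go here ...

factory.BookMethod(dataloader, TMVA.Types.kPyKeras, 'PyKeras',
                   'H:!V:FilenameModel=model.h5:NumEpochs=20:BatchSize=128')
# With the dataloader filled, training and evaluation then run via
# factory.TrainAllMethods(), factory.TestAllMethods(), factory.EvaluateAllMethods()
```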
Sergei Gleyzer, Stefan Wunsch, Machine Learning Leaderboard Proposal for LHC
* Why a leaderboard?
  * ML methods have a lot of parameters; it is expensive to optimize them
  * Standard ML parameters do not fit our needs
* Goal: facilitate communication and exchange of information to evaluate the performance of methods on physics datasets
* Some details on the implementation and the interface are discussed on slides 5 and 6
* Key point: shared benchmark datasets
* Only a proposal so far (no actual work done yet)
* Questions/Comments
  * Suggestion: evaluation on a hidden dataset to avoid overfitting (à la Kaggle)? To be considered.
  * Methods should be shared (so people can build on top of other people's work)
  * General consensus that benchmark datasets would be extremely useful
  * Where do they come from? Mostly simulation, if possible generic (not tied to a specific detector)
  * Need to agree on a non-ROOT format if we want people outside of HEP to participate

Thomas Keck, FastBDT
* Implementation of a fast, robust and easy-to-use stochastic gradient-boosted BDT
* Extensively used in Belle II
* It has C++, C, Python and TMVA interfaces
* D meson dataset from Belle, about 1 billion events, 35 features
* Orders of magnitude faster than other implementations (comparison shown for scikit-learn, XGBoost, TMVA)
* Performs as well as (or better than) other implementations (comparison shown for scikit-learn, XGBoost, TMVA)
* There are use cases in Belle in which one has to train a few hundred different classifiers (see the example on slides 6/7)
* Questions
  * Sergei: Any parallelization involved? No, it is a single-core implementation. In application, classifiers are parallelized at the event level.
  * Comment (Sergei): TMVA algorithms have been optimized externally by the experiments for production.
    * Slide 4 doesn't show all custom improvements
    * Improvements are being backported to TMVA
    * Thomas: There are lots of function calls in the TMVA implementation that are not inlined by the compiler; a specialized implementation using inline functions will stay faster.
  * Sergei: the scikit-learn performance seems odd.
  * Gilles: The performance comparison is to some extent unfair because different algorithms are implemented:
    * scikit-learn doesn't bin variables for cut optimisation; FastBDT gains speed by binning (an illustrative sketch of a binned cut search is appended at the end of these minutes).
    * Binning is not expected to introduce a trade-off of speed vs. separation (it can even reduce overtraining). Thomas: TMVA does binning as well, will check scikit-learn. A fully greedy cut search does not make sense for most physics analyses; the approximate cut search reduces overfitting by regularizing the input features, a good approach that also leads to good ROC performance.
  * Mauro: Does FastBDT support multiclass classification? No; many classifiers are trained when it is needed.
  * Sergei: Can the dataset be shared in the new IML benchmark database? The dataset is simulation based and was uploaded to UCI but didn't make it to the website; it could become one of the IML datasets.
  * Comment (Sergei): on variable importance, there is an algorithm in TMVA that does precisely this, based on my and Harrison's 2008 ACAT paper (done stochastically).

David Rousseau, Paul Seyfert, Common Tools Proposal and Discussion
* For the wish lists from ATLAS and LHCb, see slides
* Many of the packages mentioned in the wish lists are already provided on CVMFS, but the problem is versions
* Can one use Docker?
  * Surely a possibility to be considered seriously
  * Enric: In SWAN, Docker only encapsulates users; the software comes from CVMFS
* One can use conda for offline distributions, adding the binaries which we need (see the example mentioned in David's slides: https://nlesc.gitbooks.io/cern-root-conda-recipes/content/index.html)
* Will follow up on progress at the next meeting
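Referring back to the FastBDT discussion above (binning variables for the cut optimisation instead of an exact greedy search over every unique value), here is an illustrative NumPy sketch of an equal-frequency binned cut search. This is not FastBDT code; the function names and the separation measure are made up for the example.

```python
import numpy as np

def best_binned_cut(x, is_signal, n_bins=16):
    """Illustrative binned cut search: bin feature x into equal-frequency
    bins and evaluate a Gini-like separation gain only at the bin edges."""
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1)[1:-1])
    bins = np.concatenate(([-np.inf], edges, [np.inf]))
    sig_hist, _ = np.histogram(x[is_signal], bins=bins)
    bkg_hist, _ = np.histogram(x[~is_signal], bins=bins)

    # Cumulative counts give the signal/background below each candidate cut
    sig_below = np.cumsum(sig_hist)[:-1].astype(float)
    bkg_below = np.cumsum(bkg_hist)[:-1].astype(float)
    sig_tot, bkg_tot = sig_hist.sum(), bkg_hist.sum()

    def gini(s, b):
        n = s + b
        return 0.0 if n == 0 else 2.0 * s * b / n  # impurity times node size

    parent = gini(sig_tot, bkg_tot)
    gains = [parent - gini(s, b) - gini(sig_tot - s, bkg_tot - b)
             for s, b in zip(sig_below, bkg_below)]
    best = int(np.argmax(gains))
    return edges[best], gains[best]

# Toy usage: signal shifted with respect to background
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(1.0, 1.0, 5000), rng.normal(-1.0, 1.0, 5000)])
is_signal = np.concatenate([np.ones(5000, bool), np.zeros(5000, bool)])
print(best_binned_cut(x, is_signal))
```

Once the histograms are filled, only n_bins candidate cuts are evaluated per feature via cumulative sums, rather than one per unique feature value, which is where the speed-up discussed above comes from.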