28 July 2020 to 6 August 2020
virtual conference
Europe/Prague timezone

Automated selection of particle-jet features for data analysis in High Energy Physics experiments

28 Jul 2020, 17:10
virtual conference


Talk 14. Computing and Data Handling


Mr Andrea Di Luca (Universita degli Studi di Trento and INFN (IT))


In high-energy physics experiments, the sensitivity of selection-based analyses depends critically on which observable quantities are taken into consideration and which are discarded as less important. In this process, scientists are usually guided by their cultural background and by the literature.
Although simple and powerful, this approach may be sub-optimal when machine learning strategies are envisaged and potentially all features are usable. On the other hand, training multivariate algorithms with all available features is often impossible because of missing calibration or limited computing power. How can one robustly choose the set of observables to use in a modern high-energy physics analysis?
We show here that it is possible to rank the relative importance of all available features in an automated fashion by engineering a fast and powerful classification model.
Features are ranked with the Random Forest algorithm, then selected as input quantities for a Deep Learning Neural Network. We make explicit the relation between the Random Forest importance ranking and the increase in signal-to-background ratio as the number of features fed to the Neural Network is varied. We benchmark our procedure on the case of highly boosted di-jet resonances decaying to two b quarks, to be selected against an overwhelming QCD background. Promising results from Monte Carlo simulation with HEP pseudo-detectors are shown.
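
The two-step procedure described above (rank features with a Random Forest, then train a neural network on the top-ranked ones) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn, uses a synthetic dataset in place of jet observables, substitutes a small `MLPClassifier` for the Deep Learning Neural Network, and reports ROC AUC as a simple stand-in for the signal-to-background figure of merit. The helper name `auc_with_top_k` and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for signal (1) vs. background (0) events (assumption:
# no real jet data here). With shuffle=False the first 5 columns are the
# informative features and the remaining 15 are pure noise.
X, y = make_classification(n_samples=2000, n_features=20, n_informative=5,
                           n_redundant=0, shuffle=False, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Step 1: rank features by Random Forest impurity-based importance.
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X_train, y_train)
ranking = np.argsort(forest.feature_importances_)[::-1]  # most important first


def auc_with_top_k(k):
    """Train a small neural network on the k top-ranked features (hypothetical helper)."""
    cols = ranking[:k]
    net = make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(32, 32), max_iter=500, random_state=0),
    )
    net.fit(X_train[:, cols], y_train)
    scores = net.predict_proba(X_test[:, cols])[:, 1]
    return roc_auc_score(y_test, scores)


# Step 2: scan the number of input features and record the performance.
results = {k: auc_with_top_k(k) for k in (1, 2, 5, 10, 20)}
for k, auc in results.items():
    print(f"top {k:2d} features: ROC AUC = {auc:.3f}")
```

Scanning `k` in this way shows how classifier performance changes as lower-ranked features are added, which is the kind of relation between importance ranking and selection performance that the abstract refers to. In a real analysis the metric would be replaced by the signal-to-background ratio at the chosen working point.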

Secondary track (number) 14

Primary authors

Mr Andrea Di Luca (Universita degli Studi di Trento and INFN (IT))
Francesco Maria Follega (Universita degli Studi di Trento and INFN (IT))
Dr Marco Cristoforetti (Universita degli Studi di Trento e INFN (IT))
Roberto Iuppa (Universita degli Studi di Trento and INFN (IT))
