In current and future high-energy physics experiments, the sensitivity of selection-based analyses will increasingly depend on the choice of the set of high-level features computed for each collision event. The complexity of event reconstruction algorithms has escalated over the last decade, and thousands of parameters are now available to analysts. Deep Learning approaches are widely used to improve selection performance in physics analyses.
In many cases, algorithm development follows a brute-force approach in which all possible combinations of available neural network architectures are tested using all available parameters. A crucial drawback is that the predictions of a model built on a large number of input variables are harder to explain and understand. This is particularly relevant for neural network models, which typically provide no uncertainty estimate on their predictions and yet are often treated as perfect tools, which they are not.
In this work, we show how using a sub-optimal set of input features can lead to a higher systematic uncertainty on the classifier predictions. We also present an approach for selecting an optimal set of features using ensemble learning algorithms. For this study, we considered highly boosted di-jet resonances decaying to two $b$-quarks, produced in $pp$ collisions, to be selected against an overwhelming QCD background. Results from a Monte Carlo simulation with HEP pseudo-detectors are shown.
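The ensemble-based feature selection mentioned above can be illustrated with a minimal sketch. The idea, in its simplest form, is to fit many weak learners on bootstrap resamples of the data and rank each input feature by how often the ensemble relies on it; features the ensemble rarely uses are candidates for removal. The code below is a self-contained toy illustration with decision stumps and synthetic data, not the actual pipeline, detector simulation, or learner used in this work.

```python
import random

def best_stump(X, y):
    """Return the index of the feature whose best single-threshold
    split maximises classification accuracy (a one-level tree)."""
    best_acc, best_feat = 0.0, 0
    for j in range(len(X[0])):
        for t in sorted(set(row[j] for row in X)):
            pred = [1 if row[j] > t else 0 for row in X]
            acc = sum(p == yi for p, yi in zip(pred, y)) / len(y)
            acc = max(acc, 1.0 - acc)  # allow either split polarity
            if acc > best_acc:
                best_acc, best_feat = acc, j
    return best_feat

def ensemble_feature_counts(X, y, n_rounds=200, seed=0):
    """Bagged stump selection: fit a stump on each bootstrap
    resample and count how often each feature is chosen.
    High counts flag the most informative features."""
    rng = random.Random(seed)
    counts = [0] * len(X[0])
    n = len(X)
    for _ in range(n_rounds):
        idx = [rng.randrange(n) for _ in range(n)]
        Xb = [X[i] for i in idx]
        yb = [y[i] for i in idx]
        counts[best_stump(Xb, yb)] += 1
    return counts

# Toy data: feature 0 separates the two classes, features 1-2 are noise.
rng = random.Random(42)
y = [i % 2 for i in range(100)]
X = [[rng.gauss(1.0 if cls else -1.0, 0.5),
      rng.gauss(0.0, 1.0),
      rng.gauss(0.0, 1.0)] for cls in y]

counts = ensemble_feature_counts(X, y)
print(counts)  # feature 0 dominates the selection counts
```

In practice, the same ranking is obtained from the built-in feature importances of gradient-boosted trees or random forests; the counting scheme here is only meant to make the mechanism explicit.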