Speaker
RANIT DAS
Description
Feature selection algorithms can be an important tool for AI explainability. If the performance of neural networks trained on low-level data can be reproduced by a small set of high-level features, we can hope to understand “what the machine learned”. We present a new algorithm that selects features by ranking their Distance Correlation (DisCo) values with truth labels. We apply this algorithm to the classification of boosted top quarks, using a set of 7,000 Energy Flow Polynomials (EFPs) as our feature space. We show that our method selects a small set of high-level features whose classification performance is comparable to that of state-of-the-art top taggers.
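To make the ranking idea concrete, here is a minimal Python sketch of scoring features by their distance correlation with truth labels and keeping the top few. It is an illustration under stated assumptions, not the algorithm presented in the talk: it uses the standard biased sample estimator of distance correlation and a plain one-pass ranking. The function names (`distance_correlation`, `rank_features_by_disco`) and the toy data are hypothetical and are not taken from the authors' code. The estimator builds n-by-n distance matrices, so it is only practical for modest sample sizes.

```python
import numpy as np


def _centered_distances(v):
    """Pairwise Euclidean distance matrix of v, double-centered."""
    v = np.asarray(v, dtype=float)
    v = v.reshape(len(v), -1)  # treat 1-D input as n samples of dimension 1
    d = np.sqrt(((v[:, None, :] - v[None, :, :]) ** 2).sum(axis=-1))
    return (
        d
        - d.mean(axis=0, keepdims=True)
        - d.mean(axis=1, keepdims=True)
        + d.mean()
    )


def distance_correlation(x, y):
    """Biased sample distance correlation between x and y (value in [0, 1])."""
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2 = (a * b).mean()                             # squared distance covariance
    norm = np.sqrt((a * a).mean() * (b * b).mean())    # product of distance std devs
    if norm == 0.0:
        return 0.0  # at least one input is constant
    return float(np.sqrt(max(dcov2, 0.0) / norm))


def rank_features_by_disco(features, labels, k):
    """Return indices of the k features with the highest DisCo to the labels.

    Assumption: a simple one-pass ranking, each feature scored independently.
    """
    scores = np.array(
        [distance_correlation(features[:, j], labels) for j in range(features.shape[1])]
    )
    top = np.argsort(scores)[::-1][:k]
    return top, scores


# Toy data (hypothetical): 5 features, only feature 2 depends on the label.
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=200)
features = rng.normal(size=(200, 5))
features[:, 2] += 2.0 * labels

top, scores = rank_features_by_disco(features, labels, k=2)
print(top, scores[top])
```

Scoring each feature independently does not account for redundancy between the selected features, so a practical selection procedure would likely need an additional step to handle correlated features.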