Description
Different evaluation metrics for binary classifiers are appropriate for different scientific domains, and even for different problems within the same domain. This presentation discusses the evaluation of binary classifiers in experimental high-energy physics (HEP), and in particular those used to discriminate signal events from background events. In the introductory part of the talk, the general properties of binary classifiers for HEP are analysed, and their similarities to and differences from those in other domains are pointed out. The rest of the presentation then focuses on the optimisation of event selection to minimise statistical errors in HEP parameter estimation, a problem that is best analysed in terms of the maximisation of Fisher information about the measured parameters. After describing a general formalism to derive evaluation metrics based on Fisher information, three more specific metrics are introduced: for the measurement of signal cross sections in counting experiments (FIP1) or in distribution fits (FIP2), and for the measurement of other parameters from distribution fits (FIP3). The FIP2 metric is particularly interesting because it can be derived from any ROC curve, provided that prevalence is also known. In addition to their relation to measurement errors when used as evaluation criteria (which makes them more interesting than the ROC AUC), a further advantage of Fisher information metrics is that they can also be used directly for training decision trees (instead of the Shannon entropy or Gini coefficient). Preliminary results based on the Python sklearn framework are presented.
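The abstract does not spell out the metric definitions, so the following is only a minimal sketch of the counting-experiment case under stated assumptions, not the definitions used in the talk. Assume a Poisson-distributed selected yield N = S + B with a perfectly known background B, signal efficiency ε (the ROC true positive rate) and purity ρ = S / (S + B). The relative variance of the measured cross section is then (S + B) / S² = 1 / (S_tot ε ρ), against 1 / S_tot for an ideal selection, so the product ε ρ can be read as the fraction of the available information that the selection retains. Whether this product coincides with FIP1 as defined in the talk is an assumption. The function names and the toy ROC points are hypothetical and serve only as illustration.

```python
def purity(tpr, fpr, prevalence):
    """Signal purity S / (S + B) of the selected sample at one ROC point.

    prevalence is the signal fraction before selection, S_tot / (S_tot + B_tot).
    """
    selected_signal = prevalence * tpr
    selected_background = (1.0 - prevalence) * fpr
    total = selected_signal + selected_background
    # Nothing selected: define the purity as 0 to avoid dividing by zero.
    return selected_signal / total if total > 0 else 0.0


def efficiency_times_purity(tpr, fpr, prevalence):
    """Fraction of the ideal information retained, under the assumptions above."""
    return tpr * purity(tpr, fpr, prevalence)


def best_working_point(roc_points, prevalence):
    """Return the (fpr, tpr) point that maximises efficiency times purity."""
    return max(
        roc_points,
        key=lambda point: efficiency_times_purity(point[1], point[0], prevalence),
    )


# Toy ROC curve as (fpr, tpr) pairs; illustrative values, not real data.
roc_points = [(0.0, 0.0), (0.01, 0.5), (0.1, 0.8), (0.5, 0.95), (1.0, 1.0)]

for prevalence in (0.1, 0.5):
    print(prevalence, best_working_point(roc_points, prevalence))
```

With these toy points the preferred working point moves from (0.01, 0.5) at a prevalence of 0.1 to (0.1, 0.8) at a prevalence of 0.5, which illustrates why a ROC curve alone, without the prevalence, does not determine such a metric.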