Description
In this talk I will present a recent strategy to perform a goodness-of-fit test via two-sample testing, powered by machine learning. This approach makes it possible to evaluate the discrepancy between a data sample of interest and a reference sample in an unbiased and statistically sound fashion. The model leverages the ability of classifiers to estimate the ratio of the data-generating densities in order to build a statistical test based on the Neyman-Pearson approach (arXiv:2305.14137). I will discuss the general framework and focus on an implementation based on kernel methods that is computationally efficient while retaining high flexibility (arXiv:2204.02317). Initially developed to perform model-independent searches for new physics with collider data, the method can be used effectively for other tasks such as online data quality monitoring (arXiv:2303.05413) and the evaluation of simulators and generative models.
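As a schematic illustration of the Neyman-Pearson construction (a simplified form for a fixed-size i.i.d. sample, not the exact test statistic defined in the cited papers, which work with an extended likelihood including the expected event yields), the data sample D is compared to the reference hypothesis through a log-likelihood-ratio statistic, with the trained classifier providing an estimate of the density ratio between the two hypotheses:

\[
t(\mathcal{D}) \;=\; 2\,\log \frac{\mathcal{L}(\mathcal{D}\mid H_1)}{\mathcal{L}(\mathcal{D}\mid H_0)}
\;=\; 2 \sum_{x \in \mathcal{D}} \log \frac{n(x\mid H_1)}{n(x\mid H_0)},
\]

where the learned model plays the role of an approximation to \(\log\,[\,n(x\mid H_1)/n(x\mid H_0)\,]\), estimated from the data and reference samples.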