Speaker
Description
In many domains of science, the likelihood ratio (LR) function is a fundamental ingredient for a variety of statistical methods such as inference, importance sampling, and classification. Neural-based LR estimation using probabilistic classification has therefore had a significant impact in these domains, providing a scalable method for determining an intractable LR from simulated datasets via the so-called ratio trick. Traditional machine learning approaches rely on the assumption that the underlying probability distribution is nonnegative, but in quantum mechanical systems it is possible to encounter events with negative probabilities. In high energy physics this is a significant problem when simulating proton-proton (pp) collisions using quantum field theory, because Monte Carlo simulation codes can introduce negatively weighted data.
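As a brief illustration of the ratio trick, consider a classifier trained with balanced classes to separate samples drawn from $p(x\mid\theta_1)$ (label $y=1$) from samples drawn from $p(x\mid\theta_0)$ (label $y=0$). The symbols $s$, $r$, $\theta_0$, and $\theta_1$ are notation assumed for this sketch, not taken from the abstract. For nonnegative densities, the minimizer of the binary cross-entropy loss is

$$
s^*(x) = \frac{p(x\mid\theta_1)}{p(x\mid\theta_0) + p(x\mid\theta_1)},
\qquad\text{so that}\qquad
r(x) \equiv \frac{p(x\mid\theta_1)}{p(x\mid\theta_0)} = \frac{s^*(x)}{1 - s^*(x)}.
$$

A trained network output $\hat{s}(x)$ approximating $s^*(x)$ therefore yields the estimate $\hat{r}(x) = \hat{s}(x)/(1-\hat{s}(x))$ without either density ever being evaluated explicitly.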
Two problems present themselves when training a neural likelihood ratio estimator with negatively weighted data. First, the variance of the mini-batch losses used during neural network parameter updates is systematically increased, thereby hindering the convergence of stochastic gradient descent (SGD) algorithms. Second, most classification and density (ratio) estimation loss functions constrain the neural LR estimates to the range $[0,\infty)$. Therefore, should negative densities prevail anywhere within the measurable space, the neural network would be incapable of expressing such behavior.
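Both problems can be made concrete in a simplified setting, assumed here purely for illustration: $N$ events with weights $w_i = \pm 1$, of which a fraction $f < 1/2$ is negative. The standard effective sample size is then, for large $N$,

$$
N_{\text{eff}} = \frac{\left(\sum_{i=1}^{N} w_i\right)^2}{\sum_{i=1}^{N} w_i^2} \approx \frac{\left[N(1-2f)\right]^2}{N} = N\,(1-2f)^2 ,
$$

so a negative fraction of $f = 0.25$ leaves roughly a quarter of the statistical power of an unweighted sample of the same size, and weighted mini-batch averages fluctuate correspondingly more. For the second problem, a classifier output $\hat{s}(x) \in (0,1)$ mapped through the usual relation gives

$$
\hat{r}(x) = \frac{\hat{s}(x)}{1-\hat{s}(x)} \in (0,\infty),
$$

which can never take a negative value, whatever the training data.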
This work will demonstrate two important advancements for LR estimation with negatively weighted data. First, a new loss function for binary classification is introduced that extends the neural-based ratio trick to quasiprobabilistic distributions. Second, signed probability spaces are used to decompose the likelihoods into signed mixture models. This decomposition reduces the overall LR estimation task to four nonnegative LR estimation sub-tasks, each with reduced loss variance during optimization relative to the overall task. Each nonnegative LR is estimated using a calibrated neural discriminative classifier, and the estimates are then combined via coefficients that are optionally optimized using the new loss function. The technique is demonstrated using di-Higgs production via gluon-gluon fusion in pp collisions at the Large Hadron Collider.
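One way such a decomposition can produce four nonnegative ratios is sketched here. The notation ($q_k$, $p_k^{\pm}$, $\alpha_k$, $\beta_k$, $r_{ab}$) is assumed for illustration, and the parametrization used in the work itself may differ. Write each quasiprobability density, for hypotheses $k \in \{0, 1\}$, as a signed mixture of two nonnegative, normalized densities, one describing the positively weighted events and one the negatively weighted events:

$$
q_k(x) = \alpha_k\, p_k^{+}(x) - \beta_k\, p_k^{-}(x),
\qquad \alpha_k,\ \beta_k \ge 0,
\qquad \alpha_k - \beta_k = 1 .
$$

Dividing each term of the numerator of $q_1/q_0$ by its own density, wherever $p_1^{\pm}(x) > 0$, gives

$$
\frac{q_1(x)}{q_0(x)}
= \frac{\alpha_1}{\alpha_0\, r_{++}(x) - \beta_0\, r_{-+}(x)}
- \frac{\beta_1}{\alpha_0\, r_{+-}(x) - \beta_0\, r_{--}(x)},
\qquad
r_{ab}(x) \equiv \frac{p_0^{a}(x)}{p_1^{b}(x)} \ge 0 ,
$$

with $a, b \in \{+, -\}$. Each $r_{ab}$ is an ordinary nonnegative LR that a standard classifier can estimate from unsigned samples, while the sign information is carried entirely by the mixture coefficients.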
| Would you like to be considered for an oral presentation? | Yes |
|---|---|