Geometric deep learning pipelines have achieved state-of-the-art performance in High-Energy and Nuclear Physics reconstruction tasks such as flavor tagging and particle tracking [1]. Starting from a point cloud of detector or particle-level measurements, a graph can be built in which the measurements are the nodes and the edges represent possible physics relationships between them. Depending on the size of the resulting input graph, a filtering stage may be needed to sparsify the graph connections. A Graph Neural Network then builds a latent representation of the input graph that can be used, for example, to predict whether two nodes (measurements) belong to the same particle, or to classify a node as noise. The graph may then be partitioned into particle-level subgraphs, and a regression step used to infer the particle properties. Evaluating the uncertainty of the overall pipeline is important to measure and increase the statistical significance of the final result. How do we measure the uncertainty of the predictions of a multistep pattern recognition pipeline? How do we know which step of the pipeline contributes the most to the prediction uncertainty, and how do we distinguish between irreducible uncertainties arising from the aleatoric nature of our input data (detector noise, multiple scattering, etc.) and epistemic uncertainties that we could reduce by using, for example, a larger model or more training data?
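To make the graph-construction step concrete, here is a minimal numpy sketch: hits become nodes, and candidate edges connect pairs of hits that pass a geometric criterion. The brute-force pairing and the plain distance threshold (`max_dist`) are illustrative stand-ins, not the actual construction used in a production tracking pipeline.

```python
import numpy as np

def build_candidate_graph(hits, max_dist=0.5):
    """Connect each pair of hits closer than max_dist.

    hits: (N, 3) array of spacepoint coordinates.
    Returns an (E, 2) array of node-index pairs. A real pipeline
    would use a geometry-aware, learned, or module-map-based
    criterion instead of this toy distance cut.
    """
    n = len(hits)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            if np.linalg.norm(hits[i] - hits[j]) < max_dist:
                edges.append((i, j))
    return np.array(edges)

# Toy point cloud: two nearby hits and one distant outlier.
hits = np.array([[0.0, 0.0, 0.0],
                 [0.1, 0.0, 0.0],
                 [5.0, 5.0, 5.0]])
edges = build_candidate_graph(hits)
print(edges)  # only the (0, 1) pair survives the distance cut
```

The downstream GNN would then score each candidate edge, and thresholding those scores yields the sparsified graph that gets partitioned into track candidates.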
We have developed an Uncertainty Quantification process for multistep pipelines to study these questions and applied it to the acorn particle tracking pipeline [2]. All our experiments are performed on the TrackML open dataset [3]. Using the Monte Carlo Dropout method, we measure the data and model uncertainties of the pipeline steps and study how they propagate down the pipeline and how they are impacted by the size of the training dataset and by the geometry and physical properties of the input data. We will show that for our case study, as the training dataset grows, the overall uncertainty becomes dominated by aleatoric uncertainty, indicating that we had sufficient data to train our chosen acorn model to its full potential. We show that the acorn pipeline yields high confidence in the track reconstruction and does not suffer from miscalibration of the GNN model.
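The aleatoric/epistemic split used with Monte Carlo Dropout can be sketched in a few lines: keeping dropout active at inference, the same input is passed through the network T times, and the predictive entropy of the averaged output decomposes into an expected per-pass entropy (aleatoric) plus a mutual-information term (epistemic). The sketch below applies this standard decomposition to an array of sampled class probabilities; it is a generic illustration, not the acorn pipeline's own implementation.

```python
import numpy as np

def decompose_uncertainty(probs):
    """Split predictive uncertainty into aleatoric and epistemic parts.

    probs: (T, C) array of class probabilities from T stochastic
    (dropout-enabled) forward passes on the same input.
    total entropy = aleatoric (expected entropy) + epistemic (mutual info)
    """
    eps = 1e-12  # guards log(0)
    mean_p = probs.mean(axis=0)
    total = -np.sum(mean_p * np.log(mean_p + eps))
    aleatoric = -np.mean(np.sum(probs * np.log(probs + eps), axis=1))
    epistemic = total - aleatoric
    return total, aleatoric, epistemic

# Passes that agree carry no epistemic uncertainty; passes that
# disagree (the model is unsure of itself) carry a lot.
agree = np.array([[0.9, 0.1]] * 10)
disagree = np.array([[0.9, 0.1], [0.1, 0.9]] * 5)
_, _, e_agree = decompose_uncertainty(agree)
_, _, e_disagree = decompose_uncertainty(disagree)
print(e_agree, e_disagree)
```

Tracking how these two terms evolve per pipeline stage and per training-set size is what lets one tell whether the residual uncertainty is reducible (epistemic) or inherent to the data (aleatoric).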
References:
[1] Graph Neural Networks in Particle Physics: Implementations, Innovations, and Challenges, arXiv:2203.12852
[2] acorn, GNN4ITkTeam
[3] TrackML particle tracking challenge dataset