15–18 Sept 2025
CEA Paris-Saclay
Europe/Paris timezone

Uncertainty Quantification in an ML Pattern Recognition Pipeline

17 Sept 2025, 15:30
30m
Amphithéâtre Claude Bloch (IPhT) (CEA Paris-Saclay)

Bât. 774 - Institut de Physique Théorique (IPhT), F-91190 Gif-sur-Yvette, France
Short talk: Deep Learning and Uncertainty Quantification

Speaker

Lukas Péron

Description

Geometric learning pipelines have achieved state-of-the-art performance in High-Energy and Nuclear Physics reconstruction tasks such as flavor tagging and particle tracking [1]. Starting from a point cloud of detector or particle-level measurements, a graph is built in which the measurements are nodes and the edges represent all possible physical relationships between them. Depending on the size of the resulting input graph, a filtering stage may be needed to sparsify the graph connections. A Graph Neural Network then builds a latent representation of the input graph that can be used, for example, to predict whether two nodes (measurements) belong to the same particle, or to classify a node as noise. The graph may then be partitioned into particle-level subgraphs, and a regression task used to infer the particle properties. Evaluating the uncertainty of the overall pipeline is important to measure and increase the statistical significance of the final result. How do we measure the uncertainty of the predictions of a multistep pattern recognition pipeline? How do we know which step of the pipeline contributes the most to the prediction uncertainty? And how do we distinguish between irreducible uncertainties arising from the aleatoric nature of our input data (detector noise, multiple scattering, etc.) and epistemic uncertainties that we could reduce by using, for example, a larger model or more training data?
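The graph-construction step described above can be sketched in a few lines. This is a deliberately minimal toy (plain NumPy, 2-D hits, a simple distance cut standing in for the physics-motivated edge construction and filtering stage); the names `build_edges` and `max_dist` are illustrative and not taken from the acorn code.

```python
import numpy as np

def build_edges(hits, max_dist=1.0):
    """Toy graph construction: connect every pair of hits closer than max_dist.

    hits: (N, 2) array of measurements (hypothetical 2-D detector layout).
    Returns an (E, 2) integer array of node-index pairs, i.e. the edge list
    that a downstream GNN would score as same-particle vs. not.
    """
    n = len(hits)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            if np.linalg.norm(hits[i] - hits[j]) < max_dist:
                edges.append((i, j))
    return np.array(edges)

# Toy point cloud: two nearby hits (plausibly the same track) and one far away.
hits = np.array([[0.0, 0.0], [0.5, 0.0], [5.0, 5.0]])
edges = build_edges(hits)
```

In a real pipeline the distance cut would be replaced by a learned or geometry-aware filter, precisely because connecting "all possible" pairs scales quadratically with the number of hits.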

We have developed an Uncertainty Quantification process for multistep pipelines to study these questions and applied it to the acorn particle tracking pipeline [2]. All our experiments use the TrackML open dataset [3]. Using the Monte Carlo Dropout method, we measure the data and model uncertainties of the pipeline steps, study how they propagate down the pipeline, and study how they are affected by the size of the training dataset and by the geometry and physical properties of the input data. We will show that, for our case study, the overall uncertainty becomes dominated by aleatoric uncertainty as the training dataset grows, indicating that we had sufficient data to train the chosen acorn model to its full potential. We also show that the acorn pipeline yields high confidence in the track reconstruction and does not suffer from miscalibration of the GNN model.
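The core of Monte Carlo Dropout is to keep dropout active at inference time and run several stochastic forward passes: the sample mean is the prediction and the sample variance is a proxy for the model (epistemic) uncertainty. The following is a minimal NumPy sketch of that idea with a toy two-layer network; the acorn pipeline applies the same principle to its GNN stages, not to a model like this, and all names here (`mc_forward`, `mc_dropout_predict`) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical fixed two-layer "model"; in practice these are trained weights.
W1 = rng.normal(size=(4, 8))
W2 = rng.normal(size=(8, 1))

def mc_forward(x, p=0.5):
    """One stochastic forward pass with dropout left ON at inference."""
    h = np.maximum(x @ W1, 0.0)        # ReLU hidden layer
    mask = rng.random(h.shape) > p     # randomly drop hidden units
    h = h * mask / (1.0 - p)           # inverted-dropout rescaling
    return (h @ W2).item()

def mc_dropout_predict(x, T=200):
    """T stochastic passes: mean = prediction, variance = epistemic proxy."""
    samples = np.array([mc_forward(x) for _ in range(T)])
    return samples.mean(), samples.var()

x = np.ones(4)
mean, var = mc_dropout_predict(x)
```

The spread across passes reflects only model uncertainty; separating out the aleatoric component additionally requires a data-noise estimate (e.g. a predicted variance head), which is the part that lets one ask whether more training data would still help.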

References:
[1] Graph Neural Networks in Particle Physics: Implementations, Innovations, and Challenges, arXiv:2203.12852
[2] acorn, GNN4ITkTeam
[3] TrackML particle tracking challenge dataset

Co-authors

Jay Chan (Lawrence Berkeley National Lab. (US)), Paolo Calafiura (Lawrence Berkeley National Lab. (US)), Xiangyang Ju (Lawrence Berkeley National Lab. (US))
