25–29 May 2026
Chulalongkorn University
Asia/Bangkok timezone

Toward a Foundation Model for Neutrino Physics: Self-distillation of Reusable Sensor-level Representations

27 May 2026, 14:57
18m
MHMK 201

Oral Presentation Track 3 - Offline data processing

Speaker

Sam Young

Description

Liquid argon time projection chambers (LArTPCs) provide dense, high-fidelity 3D measurements of particle interactions and underpin many current and future neutrino and rare-event experiments. Event reconstruction typically relies on complex, detector-specific pipelines built either from tens of hand-engineered pattern recognition algorithms or from cascades of task-specific neural networks that require extensive, well-calibrated simulation.

We introduce Panda, a training paradigm that learns reusable sensor-level representations directly from raw, unlabeled 3D TPC data. Panda combines a hierarchical sparse 3D encoder with a multi-view, prototype-based self-distillation objective. On a simulated dataset, we show that Panda substantially improves label efficiency and reconstruction quality, roughly matching the performance of the previous state-of-the-art semantic segmentation model while using 1,000$\times$ fewer labels. We also show that a single set-prediction head, 5% the size of the backbone and using no physical priors, can achieve particle identification comparable to that of state-of-the-art reconstruction tools when trained on frozen outputs from our pre-trained network.
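As a rough illustration of what a multi-view, prototype-based self-distillation objective can look like, the sketch below follows the widely used DINO-style recipe: a teacher's centered and sharpened prototype assignments for one view serve as targets for the student's assignments on a different view. This is an assumption made for illustration only; the abstract does not specify Panda's loss, and every name here (`prototype_scores`, `self_distillation_loss`, `update_center`), the temperatures, and the random arrays standing in for encoder outputs are hypothetical.

```python
import numpy as np

def softmax(x, temperature):
    # Numerically stable softmax over the last axis.
    z = x / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def log_softmax(x, temperature):
    # Numerically stable log-softmax over the last axis.
    z = x / temperature
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def prototype_scores(features, prototypes):
    # Cosine similarity between each feature vector and each prototype.
    f = features / np.linalg.norm(features, axis=-1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=-1, keepdims=True)
    return f @ p.T

def self_distillation_loss(student_views, teacher_views, center,
                           student_temp=0.1, teacher_temp=0.04):
    # Cross-entropy between teacher targets and student predictions,
    # averaged over all pairs of *different* views.
    total, pairs = 0.0, 0
    for i, t in enumerate(teacher_views):
        target = softmax(t - center, teacher_temp)  # centered, sharpened
        for j, s in enumerate(student_views):
            if i == j:
                continue
            total += -(target * log_softmax(s, student_temp)).sum(axis=-1).mean()
            pairs += 1
    return total / pairs

def update_center(center, teacher_views, momentum=0.9):
    # Running mean of teacher scores; centering discourages collapse
    # onto a single prototype.
    batch_mean = np.concatenate(teacher_views, axis=0).mean(axis=0)
    return momentum * center + (1.0 - momentum) * batch_mean

# Toy usage: random arrays stand in for per-point encoder features of
# two augmented views of the same event (4 points, 8 feature dims).
rng = np.random.default_rng(0)
prototypes = rng.normal(size=(16, 8))
view_features = [rng.normal(size=(4, 8)) for _ in range(2)]

# In practice the teacher would be a separate (e.g. momentum-averaged)
# network; here both branches share the same features for brevity.
student_views = [prototype_scores(f, prototypes) for f in view_features]
teacher_views = [prototype_scores(f, prototypes) for f in view_features]

center = np.zeros(16)
loss = self_distillation_loss(student_views, teacher_views, center)
center = update_center(center, teacher_views)
```

In a real training loop, only the student would receive gradients from this loss, and the teacher targets would be treated as constants.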

We introduce this model and training method as a step towards general-purpose, sensor-level foundation models for high-energy and nuclear physics.
