Description
Liquid argon time projection chambers (LArTPCs) provide dense, high-fidelity 3D measurements of particle interactions and underpin many current and future neutrino and rare-event experiments. Event reconstruction typically relies on complex, detector-specific pipelines built from dozens of hand-engineered pattern recognition algorithms or from cascades of task-specific neural networks, which require extensive, well-calibrated simulation.
We introduce Panda, a training paradigm that learns reusable sensor-level representations directly from raw, unlabeled 3D TPC data. Panda combines a hierarchical sparse 3D encoder with a multi-view, prototype-based self-distillation objective. On a simulated dataset, Panda substantially improves label efficiency and reconstruction quality, roughly matching the previous state-of-the-art semantic segmentation model while using 1,000$\times$ fewer labels. We also show that a single set-prediction head, 5% the size of the backbone and built with no physical priors, achieves particle identification comparable to state-of-the-art reconstruction tools when trained on frozen outputs from our pre-trained network.
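The abstract does not spell out the objective, so the following is only a generic sketch of what a prototype-based self-distillation loss between two views can look like, in the style of DINO-like methods. It is not Panda's actual implementation. All names (`prototype_distillation_loss`, `ema_update`, `center`), the temperatures, and the momentum value are illustrative assumptions, and random arrays stand in for the encoder's per-point embeddings.

```python
import numpy as np

def log_softmax(logits, temperature):
    # Numerically stable log-softmax over the prototype axis.
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def prototype_distillation_loss(student_emb, teacher_emb, prototypes, center,
                                t_student=0.1, t_teacher=0.04):
    """Cross-entropy between teacher and student prototype assignments.

    student_emb, teacher_emb: (N, D) embeddings of the same N points seen
        in two different augmented views (assumed to be already matched).
    prototypes: (K, D) learnable prototype vectors.
    center: (K,) running mean of teacher logits, subtracted so the teacher
        does not collapse onto a single prototype.
    Temperatures are assumed values, not taken from the source.
    """
    s_logits = student_emb @ prototypes.T          # (N, K)
    t_logits = teacher_emb @ prototypes.T - center  # (N, K), centered
    t_prob = np.exp(log_softmax(t_logits, t_teacher))  # sharpened targets
    s_logp = log_softmax(s_logits, t_student)
    return float(-(t_prob * s_logp).sum(axis=-1).mean())

def ema_update(teacher_params, student_params, momentum=0.996):
    # The teacher is typically an exponential moving average of the student.
    return momentum * teacher_params + (1.0 - momentum) * student_params

# Toy usage with random stand-ins for encoder outputs.
rng = np.random.default_rng(0)
student = rng.normal(size=(8, 16))
teacher = rng.normal(size=(8, 16))
protos = rng.normal(size=(32, 16))
loss = prototype_distillation_loss(student, teacher, protos, np.zeros(32))
```

In a setup like this, gradients would flow only through the student branch, while the teacher weights are refreshed with `ema_update` after each step.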
We present this model and training method as a step towards general-purpose, sensor-level foundation models for high-energy and nuclear physics.