Description
The Jiangmen Underground Neutrino Observatory (JUNO) is a multipurpose neutrino experiment designed to determine the neutrino mass ordering and to achieve high-precision measurements of neutrino oscillation parameters. Construction of the JUNO detector was completed at the end of 2024, followed by commissioning of the water phase and the subsequent liquid scintillator filling phase. Physics data taking began on 26 August 2025, and JUNO released its first oscillation results based on the initial 59.1 days of physics data.
This contribution reports on the design, deployment, and operational experience of the offline data processing system supporting JUNO's first-year commissioning and physics data taking. Detector data produced at 40 GB/s are reduced online to approximately 90 MB/s of byte-stream RAW data (a reduction of roughly a factor of 450) and transferred to the Tier-0 site via a dedicated high-bandwidth network. At Tier-0, an automated processing pipeline converts RAW data into a ROOT-based data model (RTRAW) and performs prompt event reconstruction. All data products are subsequently distributed to Tier-1 sites for large-scale reprocessing and analysis. RTRAW data are reprocessed with refined calibration constants and reconstruction algorithms to produce the final datasets used for physics analyses.
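The Tier-0 chain described above can be illustrated with a minimal sketch. All function and type names here are hypothetical placeholders (JUNO's production system is not shown); the sketch only captures the RAW → RTRAW → prompt-reconstruction sequence applied per file.

```python
from dataclasses import dataclass

@dataclass
class DataFile:
    # Minimal stand-in for a data product; "fmt" is one of "RAW", "RTRAW", "REC".
    name: str
    fmt: str

def convert_to_rtraw(raw: DataFile) -> DataFile:
    # Convert byte-stream RAW data into the ROOT-based RTRAW data model.
    assert raw.fmt == "RAW"
    return DataFile(raw.name.replace(".raw", ".rtraw"), "RTRAW")

def prompt_reconstruct(rtraw: DataFile) -> DataFile:
    # Run prompt event reconstruction on an RTRAW file.
    assert rtraw.fmt == "RTRAW"
    return DataFile(rtraw.name.replace(".rtraw", ".rec"), "REC")

def tier0_pipeline(raw_files):
    # Automated Tier-0 processing: each RAW file is converted to RTRAW and
    # promptly reconstructed; both products are retained for distribution
    # to Tier-1 sites.
    products = []
    for raw in raw_files:
        rtraw = convert_to_rtraw(raw)
        products.append(rtraw)
        products.append(prompt_reconstruct(rtraw))
    return products
```

A real pipeline would of course register products in a file catalog and trigger transfers, but the per-file stage ordering is the essential structure.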
Despite extensive data-challenge campaigns conducted before data taking, several nontrivial issues emerged during commissioning and early physics running. These include event sorting within the automated pipeline to support time-correlation analyses, dynamic reconstruction steering to accommodate multiple event types and algorithms, and heavy file-system pressure from large-scale concurrent analysis jobs. To address these issues, we developed and deployed optimized workflow orchestration, flexible reconstruction control, and compact analysis-oriented data formats that retain access to hit-level information.
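The event-sorting requirement can be sketched as follows. Assuming (as a simplification not stated in the source) that events emerge from several internally time-ordered processing streams, a k-way merge restores the single globally time-ordered sequence that time-correlation analyses, such as prompt-delayed coincidence searches, depend on:

```python
import heapq

def time_sorted(streams):
    # streams: iterables of (timestamp_ns, event_id) pairs, each already
    # sorted by timestamp. heapq.merge performs a lazy k-way merge, so the
    # full event sample never has to be held in memory at once.
    return list(heapq.merge(*streams, key=lambda ev: ev[0]))
```

This is only an illustration of the ordering problem; the actual pipeline-level solution deployed by JUNO operates on files and production jobs rather than in-memory event lists.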
Finally, plans for near-term evolution of the offline system are presented, including increased data aggregation to reduce file counts and the deployment of multi-threaded reconstruction algorithms to improve processing efficiency in production.