Description
The High Energy cosmic Radiation Detection facility (HERD) is a long-term space-based high-energy physics experiment onboard the China Space Station. It is expected to produce large, heterogeneous datasets, including flight data, simulation data, and multi-version reconstructed data. To support large-scale computing and long-term physics analysis efficiently, a unified data management and workflow system, HERD DOM (HERD Dataflow Management and Operation Monitoring), is under development.
This contribution presents the workflow-driven automation of simulation data production in HERD. A visual DAG-based workflow engine orchestrates the simulation tasks, providing an end-to-end automated pipeline that covers parameter configuration, job submission, distributed resource scheduling, data validation, distributed storage registration, and metadata cataloguing. The workflow system is tightly integrated with local cluster environments, and simulation data are managed through Rucio, ensuring full traceability, reproducibility, and scalability of the data production process.
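As an illustration of the DAG-based orchestration described above, the following is a minimal sketch using only the Python standard library, not the HERD DOM implementation. The stage names are taken from the pipeline steps listed in this abstract. The linear dependency chain, the placeholder handlers, and the `context` dictionary (including `n_events`) are assumptions made for the example. A real deployment would replace each handler with calls to the batch system and to Rucio.

```python
from graphlib import TopologicalSorter

# Each stage maps to the set of stages it depends on.
# Stage names follow the pipeline steps in the abstract; the strictly
# linear ordering is an assumption made for this sketch.
PIPELINE = {
    "parameter_configuration": set(),
    "job_submission": {"parameter_configuration"},
    "resource_scheduling": {"job_submission"},
    "data_validation": {"resource_scheduling"},
    "storage_registration": {"data_validation"},
    "metadata_cataloguing": {"storage_registration"},
}


def make_placeholder(stage):
    """Return a hypothetical handler that only records that the stage ran."""
    def handler(context):
        context["completed"].append(stage)
    return handler


def run_pipeline(dag, handlers, context):
    """Run every stage in an order that respects the DAG dependencies.

    TopologicalSorter raises graphlib.CycleError if the graph has a cycle,
    so an invalid workflow definition fails before any stage runs.
    """
    executed = []
    for stage in TopologicalSorter(dag).static_order():
        handlers[stage](context)
        executed.append(stage)
    return executed


# Hypothetical run: every stage uses a placeholder handler.
HANDLERS = {stage: make_placeholder(stage) for stage in PIPELINE}
context = {"completed": [], "config": {"n_events": 1000}}
executed = run_pipeline(PIPELINE, HANDLERS, context)
print(" -> ".join(executed))
```

Expressing the dependencies as data rather than as hard-coded call order is what allows a workflow engine to display the graph visually, retry individual stages, and add branches (for example, running validation and monitoring in parallel) without changing the runner.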
A prototype of the simulation workflow system has been deployed and is operating stably, supporting large-scale automated simulation production, job monitoring, and data management. The workflows for flight data processing, calibration data management, and integrated operational monitoring have been fully designed and will be progressively validated with real data in the next development stages. This work provides a practical solution for large-scale data processing in long-running space-based high-energy physics experiments.