Description
After several years of focused work, preparation for Data Release Production (DRP) of the Vera C. Rubin Observatory’s Legacy Survey of Space and Time (LSST) at multiple data facilities is taking shape. Rubin Observatory DRP involves both long, complex workflows composed of many short jobs and a smaller number of long jobs whose memory usage is sometimes unpredictably large. Both create scaling issues that must be addressed to meet the annual processing timeline.
This paper summarizes the infrastructure and services deployed at Rubin data facilities to support multi-site data processing. Rubin selected PanDA (Production and Distributed Analysis) to orchestrate its complex workflows and to manage its distributed workload. We describe the interface between the workflow/workload management system and Rubin’s campaign management system, as well as the associated analytics platform and the interface to the observatory’s data management system.
Rubin has already exercised this infrastructure to process simulated data as well as data from other observatories, and this paper summarizes the experience gained from those processing campaigns. Finally, the paper outlines future plans: giving the campaign management team a higher-level view of ongoing campaigns, supporting analysis of finished campaigns, and using PanDA to meet end users' need for batch processing within a “hybrid” cloud approach to data hosting.