Description
Large-scale scientific experiments, such as those in gravitational-wave (GW) science, produce extensive datasets that are often stored in isolated data lakes. The second-generation interferometers—LIGO, Virgo, and KAGRA—are part of an international scientific network, the International Gravitational-Wave Observatory Network (IGWN). A similar framework is envisaged for the third-generation interferometers: the Einstein Telescope (ET) in Europe and Cosmic Explorer (CE) in the United States. Data from the interferometers must be readily shared among scientists across all collaborations, enabling coincidence detection to distinguish genuine astrophysical signals from local noise and to achieve accurate sky localization. Data distribution, storage, and access should follow FAIR principles to streamline data analysis. Currently, both LIGO and Virgo use Rucio for data distribution, and ET is evaluating it as a Distributed Data Management (DDM) system.
In this context, two projects, MADDEN and ETAP, were funded by the first OSCARS (Open Science Clusters’ Action for Research and Society) Open Call. MADDEN (Multi-RI Access and Discovery of Data for Experiment Networking) focuses on enhancing Rucio functionality to better meet the requirements of the GW community. Within MADDEN, we will extend Rucio to support multi-RI (Research Infrastructure) data lakes, simplifying authentication and user management. We will present the design and implementation of a POSIX-like view of the Rucio catalogue in a multi-RI environment and provide support for advanced metadata queries. These capabilities will be showcased in and integrated into ETAP (Einstein Telescope Analysis Portal), which will provide a complete data analysis environment for ET, to be used in the upcoming ET Mock Data Challenges. In this contribution we will also report on our participation in the ESCAPE xRIDGE data challenge planned for early 2026.