Speaker
Martin Barisits
(CERN)
Description
The ATLAS Distributed Data Management system stores more than 140 PB of physics data across 100 sites worldwide. To cope with the anticipated ATLAS workload of the coming decade, Rucio, the next-generation data management system, has been developed. Replica management, one of the key aspects of the system, has to satisfy critical performance requirements in order to keep pace with the experiment's high rate of continuous data generation. The challenge lies in meeting these performance objectives while still giving users and applications a powerful toolkit to control their data workflows. In this work we present the concept, design and implementation of replica management in Rucio. Specifically, we introduce the workflows behind replication rules, their formal language definition, weighting and site selection. Furthermore, we present the subscription component, which lets users declare interest in data that has not been created yet. This contribution describes the architecture behind these components and their interfaces to other internal and external components, and shows the benefits of this system.
Author
Martin Barisits
(CERN)
Co-authors
Angelos Molfetas
(University of Sydney (AU))
Armin Nairz
(CERN)
Cedric Serfon
(CERN)
Graeme Andrew Stewart
(CERN)
Dr Luc Goossens
(CERN)
Mario Lassnig
(CERN)
Ralph Vigne
(University of Vienna (AT))
Thomas Beermann
(Bergische Universitaet Wuppertal (DE))
Vincent Garonne
(CERN)