The ATLAS Data Management system - Rucio: commissioning, migration and operational experiences

Apr 13, 2015, 3:15 PM
15m
B250 (B250)

Oral presentation
Track 4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
Track 4 Session

Speaker

Vincent Garonne (CERN)

Description

For more than 8 years, DQ2, the Distributed Data Management (DDM) system of ATLAS, has demonstrated very large-scale data management capabilities: more than 600M files and 160 petabytes spread worldwide across 130 sites, with accesses from 1,000 active users. However, the system does not scale for LHC Run 2, and a new DDM system called Rucio has been developed as DQ2's successor. Rucio is based on different concepts and offers functionality not provided by DQ2, which makes the migration from the old system to the new one a major challenge. The main issues are the large amount of data to move between the two systems, the number of users affected by the change, and the fact that the ATLAS Distributed Computing system, unlike the sub-detectors, must stay continuously up and running during the LHC long shutdown to ensure the continuity of analysis and Monte Carlo production. We detail the difficulties of this transition and present the steps that were taken to ensure a smooth and transparent transition from DQ2 to Rucio. We also discuss the new features of the Rucio system and the gains it brings.

Authors

Co-authors

Alessandro Di Girolamo (CERN)
David Cameron (University of Oslo (NO))
Dr Mario Lassnig (CERN)
Martin Barisits (CERN)
Ralph Vigne (University of Vienna (AT))
Thomas Beermann (Bergische Universitaet Wuppertal (DE))
Tomas Kouba (Acad. of Sciences of the Czech Rep. (CZ))
Wen Guan (University of Wisconsin (US))
