Speaker
Mario Lassnig
(CERN)
Description
The ATLAS Distributed Data Management system requires accounting of its contents at the metadata layer. This is a hard problem because of the large scale of the system and the high rate of concurrent modifications of data. The system must efficiently account for more than 80 PB of disk and tape storage holding upwards of 500 million files across 100 sites globally.
In this work, a generic accounting system is presented that is able to scale to the requirements of ATLAS. The design and architecture are presented, and three implementations are discussed: the reference implementation in Oracle RAC, and two alternative implementations in MongoDB and HBase. A strong emphasis is placed on the necessary design choices, such that the underlying data models are generally applicable to many kinds of accounting, reporting and monitoring. The evaluation then focuses on principal architectural differences, read-insert-update-delete performance, support for concurrent operations, deployment and operational effort, and possible means to calculate the actual accounting values based on metadata criteria. Finally, a recommendation is presented for the applicability of each implementation under different accounting use cases, as well as an overall recommendation for useful and required data models.
Author
ATLAS Collaboration
(ATLAS)
Co-authors
Gancho Dimitrov
(Brookhaven National Laboratory (US))
Lisa Azzurra Chinzer
(Universita e INFN (IT))
Luca Canali
(CERN)
Mario Lassnig
(CERN)
Vincent Garonne
(CERN)