Mario Lassnig (CERN)
The ATLAS Distributed Data Management system requires accounting of its contents at the metadata layer. This is a hard problem because of the large scale of the system and the high rate of concurrent modifications of data. The system must efficiently account for more than 80 PB of disk and tape storage holding upwards of 500 million files across 100 sites globally. In this work, a generic accounting system is presented that is able to scale to the requirements of ATLAS. The design and architecture are presented, and three implementations are discussed: the reference implementation in Oracle RAC, and two alternative implementations in MongoDB and HBase. A strong emphasis is placed on the necessary design choices, such that the underlying data models are generally applicable to many kinds of accounting, reporting and monitoring. The evaluation then focuses on principal architectural differences, read-insert-update-delete performance, support for concurrent operations, deployment and operational effort, and possible means to calculate the actual accounting values based on metadata criteria. Finally, a recommendation is presented for the applicability of each implementation under different accounting use cases, as well as an overall recommendation for useful and required data models.
Collaboration Atlas (Atlas)