Speaker
Martin Barisits
(Vienna University of Technology (AT))
Description
The ATLAS Distributed Data Management system stores more than 75 PB of physics data across
100 sites globally. Over 8 million files are transferred daily, with strongly varying usage
patterns. For performance and scalability reasons, it is imperative to adapt and improve
the data management system continuously. Future system modifications in hardware, software,
and policy therefore need to be evaluated to achieve good results and avoid unwanted side
effects. Due to the complexity of large-scale distributed systems, this evaluation process
is primarily based on expert knowledge, as conventional evaluation methods are inadequate.
However, this error-prone process lacks quantitative estimates and leads to inaccurate or
incorrect evaluations.
In this work we present a novel, full-scale simulation framework. This flow-level
simulator is able to accurately model the ATLAS Distributed Data Management system. The
design and architecture of the component-based software are presented and discussed. The
evaluation concentrates on the accuracy and scalability of the simulation framework.
Finally, selected use cases in which simulation could substantially benefit distributed
data management systems are presented and discussed.
Author
Collaboration Atlas
(Atlas)
Co-authors
Angelos Molfetas
(CERN)
Mario Lassnig
(CERN)
Martin Barisits
(Vienna University of Technology (AT))
Vincent Garonne
(CERN)