Speaker
Dr Tofigh Azemoon (Stanford Linear Accelerator Center)
Description
Petascale systems exist today and will become widespread in the
next few years. Such systems are inevitably very complex, highly distributed
and heterogeneous. Monitoring a petascale system in real time and
understanding its status at any given moment without impacting its
performance is a highly intricate task. Common approaches and off-the-shelf
tools are unusable, do not scale, or severely degrade the performance of
the servers being monitored. This talk will describe unobtrusive
monitoring software developed at Stanford Linear Accelerator Center (SLAC)
and currently deployed by the BaBar Experiment, which uses the xrootd file
access system to access its highly distributed petascale production data set.
The system facilitates central monitoring of all BaBar Tier A centers at SLAC.
The talk will describe the employed solutions, the lessons learned, and the
issues still to be addressed, and discuss the advantages of such a system in
predicting storage needs and understanding data access patterns. It will
further explain how the system can be deployed in other High Energy Physics
centers where the data servers may be shared by many experiments and run
under a different file access system.
Primary author
Dr Tofigh Azemoon (Stanford Linear Accelerator Center)
Co-authors
Mr Andrew Hanushevsky (Stanford Linear Accelerator Center)
Mr Jacek Becla (Stanford Linear Accelerator Center)
Mr Turri Massimiliano (Stanford Linear Accelerator Center)