Speaker
Dr Tony Chan (Brookhaven National Lab)
Description
In the LHC era, monitoring of a large-scale computing facility is evolving from a passive
role, tracking the health, availability and performance of the facility, to a more active
and automated one: restoring availability, updating software and acting as a
meta-scheduler for batch systems. This talk will discuss the
experiences of the RHIC and ATLAS U.S. Tier 1 Computing Facility at Brookhaven National
Lab in evaluating different monitoring software packages and how monitoring is being
used to improve efficiency and to integrate the facility with the Grid environment.
A monitoring model that links geographically dispersed regional computing facilities and
can be used to improve efficiency and throughput will also be presented.
Primary authors
Mr Alex Withers (Brookhaven National Lab)
Dr Bruce Gibbard (Brookhaven National Lab)
Mr Chris Hollowell (Brookhaven National Lab)
Mr Jason Smith (Brookhaven National Lab)
Mr John Hover (Brookhaven National Lab)
Dr Ofer Rind (Brookhaven National Lab)
Mr Rob Petkus (Brookhaven National Lab)
Dr Tom Throwe (Brookhaven National Lab)
Dr Tony Chan (Brookhaven National Lab)
Dr Xin Zhao (Brookhaven National Lab)
Mrs Zhenping Liu (Brookhaven National Lab)