Speaker
Mrs
Mona Aggarwal
(Imperial College London)
Description
The LHC Computing Grid (LCG) is an operational Grid currently running at 136 sites in
36 countries, offering its users access to nearly 14,000 CPUs and approximately 8 PB of storage [1].
Monitoring the state and performance of such a system is challenging but vital to
successful operation. The primary motivation for this research is therefore to
analyze LCG performance through a statistical analysis of the lifecycles of all jobs
submitted to it. In this paper we define metrics that describe typical job
lifecycles. Statistical analysis of these metrics gives insight into
the workload management characteristics of the LCG [2]. Finally, we show how
these metrics can be used to spot Grid failures by identifying statistical changes
in them over time.
[1] GridPP: UK Computing for Particle Physics, http://www.gridpp.ac.uk/
[2] Crosby P, Colling D, Waters D, Efficiency of resource brokering in grids for
high-energy physics computing, IEEE Transactions on Nuclear Science, 2004, Vol. 51,
pp. 884-891, ISSN 0018-9499
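As an illustration of the approach described above, the following is a minimal sketch, not the authors' implementation. It assumes each job record carries three hypothetical timestamps (submitted, started, finished), derives two lifecycle metrics from them (wait time and run time), and flags time windows whose mean metric value departs from a baseline window by more than a chosen number of standard errors. The names JobRecord, wait_time, run_time and shifted_windows, the window size and the threshold are all assumptions made for this example.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class JobRecord:
    # Hypothetical lifecycle timestamps, in seconds; real Grid logs
    # will typically record more states than these three.
    submitted: float
    started: float
    finished: float


def wait_time(job: JobRecord) -> float:
    """Time the job spent between submission and start of execution."""
    return job.started - job.submitted


def run_time(job: JobRecord) -> float:
    """Time the job spent executing."""
    return job.finished - job.started


def shifted_windows(values, window, threshold=3.0):
    """Return start indices of windows whose mean differs from the baseline.

    The first `window` values form the baseline. Each later, non-overlapping
    window is flagged when its mean lies more than `threshold` standard
    errors away from the baseline mean.
    """
    if window < 2 or len(values) < 2 * window:
        raise ValueError("need at least two windows of two or more values")
    baseline = values[:window]
    mu = mean(baseline)
    std_err = stdev(baseline) / window ** 0.5
    flagged = []
    for start in range(window, len(values) - window + 1, window):
        window_mean = mean(values[start:start + window])
        if std_err > 0 and abs(window_mean - mu) / std_err > threshold:
            flagged.append(start)
    return flagged
```

Applied to a series of wait times ordered by submission time, a flagged window would mark a period in which jobs waited markedly longer (or shorter) than during the baseline, which is the kind of statistical change that could point to a Grid failure. A fixed baseline and a simple mean comparison are used here only for brevity; other change-detection tests could be substituted.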
Primary authors
Dr
Barry MacEvoy
(Imperial College London)
Dr
David Colling
(Imperial College London)
Mrs
Mona Aggarwal
(Imperial College London)
Dr
Olivier van der Aa
(Imperial College London)