Speaker
Mark Mitchell
(University of Glasgow)
Description
Monitoring a grid cluster (or any reasonably scaled piece of IT infrastructure) is a key element in running that site robustly and consistently. Several factors matter when selecting a monitoring framework, including ease of use, reliability, and data input and output. It is critical that data can be drawn from different instrumentation packages and collected in the framework to give a uniform view of the running of a site. It is also very useful to support different views and transformations of this data so that it can be manipulated for different purposes, some of which may be unknown at the time of installation.
In this context, we first present the findings of an investigation of the Graphite monitoring framework and its use at the ScotGrid Glasgow site. In particular, we examine the messaging system used by the framework and the means of extracting data from different tools, including the existing Ganglia framework, which is in use at many sites, as well as adapting and parsing data streams from external monitoring frameworks and websites. We also look at the different views in which the data can be presented to serve different use cases. Finally, we report on the installation and maintenance of the framework from a system manager's perspective.
This work is relevant to site managers and anyone interested in high-level, adaptable site monitoring.
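As an illustration of the kind of data input discussed above, the following is a minimal sketch of feeding a sample to Graphite through Carbon's plaintext protocol, in which each sample is a single line of metric path, value, and Unix timestamp. It is not taken from the work described here: the host name, the metric path, and the function names `format_metric` and `send_metrics` are hypothetical, and port 2003 is assumed to be the Carbon plaintext listener (its usual default).

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Render one sample as a Carbon plaintext line: path, value, timestamp, newline."""
    if timestamp is None:
        timestamp = time.time()
    # Carbon expects whole seconds since the Unix epoch.
    return f"{path} {value} {int(timestamp)}\n"


def send_metrics(lines, host="graphite.example.org", port=2003, timeout=5.0):
    """Send already-formatted lines to a Carbon listener over TCP.

    The host name is a placeholder (assumption); 2003 is assumed to be the
    plaintext port configured on the Carbon side.
    """
    payload = "".join(lines).encode("ascii")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)


if __name__ == "__main__":
    # Hypothetical metric path for a worker node's one-minute load average.
    line = format_metric("site.cluster.node001.load_one", 0.42, 1700000000)
    print(line, end="")
```

Keeping the formatting step separate from the network step means that the same line format can be produced by any adapter (for example, one that parses output from another monitoring tool) and checked without a running Graphite instance.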
Primary author
David Crooks
(University of Glasgow (GB))
Co-authors
Prof. David Britton
(University of Glasgow (GB))
Gareth Roy
(U)
Mark Mitchell
(University of Glasgow)
Dr Samuel Cadellin Skipsey
Stuart Purdie
(University of Glasgow)