Speaker
G. Cancio
(CERN)
Description
This paper describes the evolution of fabric management at CERN's T0/T1 Computing
Center, from the selection and adoption of prototypes produced by the European
DataGrid (EDG) project[1] to enhancements made to them.
In the last year of the EDG project, developers and service managers have been
working to understand and solve operational and scalability issues.
CERN has adopted and strengthened Quattor[2], EDG's installation and configuration
management toolsuite, for managing all Linux clusters and servers in the Computing
Center, replacing existing legacy management systems. Enhancements to the original
prototype include a redundant and scalable server architecture using proxy
technology, as well as plug-in components for configuring system and LHC computing
services.
CERN now coordinates the maintenance of Quattor, making it available to other sites.
Lemon[3], the EDG fabric monitoring framework, has been progressively deployed onto
all managed Linux nodes. We have developed sensors that instrument fabric nodes and
provide complete performance and exception monitoring information.
Performance visualization displays and interfaces to the existing alarm system have
also been provided.
LEAF[4], the LHC-Era Automated Fabric toolset, comprises the State Management
System, a tool for issuing high-level configuration commands to sets of nodes in
both hardware and service management use cases, and the Hardware Management
System, a tool for administering hardware workflows and for visualizing and
locating equipment.
Finally, we describe the issues currently being addressed and the developments
planned for the future.
Primary authors
Mr D. Front
(Weizmann Institute of Science, Israel)
D. Waldron
(CERN)
G. Cancio
(CERN)
H. Meinhard
(CERN-IT)
J. van Eldik
(CERN)
M. Siket
(CERN)
M. Stepniewski
(CERN)
P. Poznanski
(CERN)
S. Chapeland
(CERN)
T. Kleinwort
(CERN)
T. Smith
(CERN)
V. Bahyl
(CERN)
V. Lefebure
(CERN)
W. Tomlin
(CERN)