Sep 2 – 9, 2007
Victoria, Canada
Please book accommodation as soon as possible.

Monitoring a WLCG Tier-1 computing facility aiming at a reliable 24/7 service

Sep 3, 2007, 8:00 AM
10h 10m
Victoria, Canada

Board: 92
Type: poster
Track: Computer facilities, production grids and networking
Session: Poster 1

Speaker

Dr Andreas Heiss (Forschungszentrum Karlsruhe)

Description

Within the Worldwide LHC Computing Grid (WLCG), a Tier-1 centre such as the German GridKa computing facility has to provide significant CPU and storage resources, as well as several Grid services, at a high level of quality. GridKa currently supports all four LHC experiments (ALICE, ATLAS, CMS and LHCb) as well as four non-LHC high energy physics experiments, and is about to significantly extend its services to other communities within the German Grid initiative D-Grid. A sophisticated monitoring model is essential to ensure that all virtual organisations (VOs) can use the resources simultaneously, and to sustain both the import of data from CERN and the distribution of data to the associated Tier-2 sites. We present the GridKa monitoring concept, which is based on the Ganglia and Nagios systems combined with additional tools for monitoring Grid services and infrastructure. Because of the complex dependencies among the large number of monitored hosts and services, a clear, easy-to-use 'dashboard' showing a summarized view of the monitoring information is an essential tool. During the day, this dashboard gives a quick overview of the status and performance of services; during nights and weekends, when an automatic alarm notification is sent, it will be the first source of information for a deeper problem analysis.
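
The idea of a summarized dashboard view can be illustrated with a minimal sketch. It is not the GridKa implementation, which the abstract does not describe in detail. It assumes check results arrive as (group, host, state) tuples using the standard Nagios plugin exit codes, and rolls them up to one worst-case state per service group. The function name `summarize`, the severity ordering, and the group and host names are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Standard Nagios plugin exit codes: 0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN.
STATE_NAMES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}

# Severity order used for the roll-up (an assumption of this sketch):
# CRITICAL outranks UNKNOWN, which outranks WARNING, which outranks OK.
SEVERITY = {0: 0, 1: 1, 3: 2, 2: 3}


def summarize(checks):
    """Roll up (group, host, state) results to one worst-case state per group."""
    worst = {}
    failing = defaultdict(list)
    for group, host, state in checks:
        # Keep the most severe state seen so far for this group.
        if group not in worst or SEVERITY[state] > SEVERITY[worst[group]]:
            worst[group] = state
        # Remember every host that is not OK, for the drill-down view.
        if state != 0:
            failing[group].append(host)
    return {
        group: {"state": STATE_NAMES[state], "failing_hosts": sorted(failing[group])}
        for group, state in worst.items()
    }


if __name__ == "__main__":
    # Hypothetical service groups and host names, for illustration only.
    sample = [
        ("storage", "se01", 0),
        ("storage", "se02", 2),
        ("compute", "wn001", 0),
        ("compute", "wn002", 1),
        ("transfer", "fts01", 0),
    ]
    for group, info in sorted(summarize(sample).items()):
        print(group, info["state"], info["failing_hosts"])
```

Reducing each group to its worst state keeps the top-level view small even with many hosts, while the list of failing hosts gives a starting point for deeper analysis after an alarm.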

Primary author

Dr Andreas Heiss (Forschungszentrum Karlsruhe)

Co-authors

Mr Axel Jaeger (Forschungszentrum Karlsruhe)
Mr Bernhard Verstege (Forschungszentrum Karlsruhe)
Mr Bruno Hoeft (Forschungszentrum Karlsruhe)
Dr Holger Marten (Forschungszentrum Karlsruhe)
