2-9 September 2007
Victoria, Canada
Europe/Zurich timezone

Monitoring a WLCG Tier-1 computing facility aiming at a reliable 24/7 service

Presented by Dr. Andreas HEISS on 3 Sep 2007 from 08:00 to 08:20
Type: poster
Session: Poster 1
Track: Computer facilities, production grids and networking
Board #: 92

Content

Within the Worldwide LHC Computing Grid (WLCG), a Tier-1 centre such as the German GridKa computing facility has to provide significant CPU and storage resources, as well as several Grid services, at a high level of quality. GridKa currently supports all four LHC experiments (ALICE, ATLAS, CMS and LHCb) as well as four non-LHC high-energy physics experiments, and is about to significantly extend its services to other communities within the German Grid initiative D-Grid. To ensure that all virtual organisations (VOs) can use the resources simultaneously, and that data is persistently imported from CERN and distributed to the associated Tier-2 sites, a sophisticated monitoring model is essential. We present the GridKa monitoring concept, which is based on the Ganglia and Nagios systems combined with additional tools to monitor Grid services and infrastructure. Because of the complex dependencies between the large number of monitored hosts and services, a clear and simple-to-use 'dashboard' showing a summarized view of the monitoring information is an essential tool. This dashboard gives a quick overview of the status and performance of services during the day, and will be the first source of information for deeper problem analysis when an automatic alarm notification is sent during nights and weekends.
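
The idea of condensing many host and service checks into one summarized dashboard view can be illustrated with a minimal sketch. This is not the GridKa implementation, which the abstract does not describe in detail. It only assumes the standard Nagios plugin return codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The host names, the service group names and the severity ordering used for the roll-up are hypothetical choices made for illustration.

```python
from dataclasses import dataclass

# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATE_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}

# Assumed severity ranking for the roll-up (not taken from the source):
# CRITICAL outranks UNKNOWN, which outranks WARNING, which outranks OK.
SEVERITY = {OK: 0, WARNING: 1, UNKNOWN: 2, CRITICAL: 3}


@dataclass(frozen=True)
class CheckResult:
    host: str      # hypothetical host name
    service: str   # hypothetical service group, e.g. "storage"
    state: int     # Nagios-style return code


def summarize(results):
    """Return {service group: worst state seen on any host in that group}."""
    summary = {}
    for result in results:
        worst = summary.get(result.service, OK)
        if SEVERITY[result.state] > SEVERITY[worst]:
            worst = result.state
        summary[result.service] = worst
    return summary


def render(summary):
    """Format the summary as one plain-text line per service group."""
    return [f"{service:<12} {STATE_NAMES[state]}"
            for service, state in sorted(summary.items())]


if __name__ == "__main__":
    checks = [
        CheckResult("wn001", "batch", OK),
        CheckResult("wn002", "batch", WARNING),
        CheckResult("se01", "storage", CRITICAL),
        CheckResult("ce01", "grid-ce", OK),
    ]
    for line in render(summarize(checks)):
        print(line)
```

Taking the worst state per group is only one possible policy. A real dashboard would also have to account for the dependencies between services that the abstract mentions, which this sketch leaves out.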

