22–26 Sept 2008
Harbiye Askeri Museum
Europe/Zurich timezone

Session

Nagios for Site Monitoring Tutorial

24 Sept 2008, 11:00
Harbiye Askeri Museum

Istanbul

Description

A monitoring system enables grid site administrators to track the usage of site resources and to receive alarms when services fail. As such, it is essential for improving the availability and reliability of large-scale grid infrastructure.

A site-level grid services monitoring prototype based on the Nagios fabric monitoring system was developed within the EGEE-II project. Development continues within the Operations Automation Team in the EGEE-III project. The prototype enables sites to receive instant notification in case of host and service failures, and provides them with results from global monitoring systems such as SAM and the ENOC DownCollector.

The main aim of this session is to give an overview of the Nagios-based site-level monitoring prototype and to demonstrate its installation on a live grid site. The first part consists of presentations describing the general Nagios monitoring framework and the specific components of the site-level monitoring prototype. In the second part, a practical installation of the prototype will be demonstrated.

  1. James Casey (CERN)
    24/09/2008, 11:00
    This talk presents a multi-level monitoring framework based on Nagios. The architecture consists of site-level Nagios instances and ROC-level Nagios instances that gather results and monitor all sites in their regions.
  2. Ronald Starink (Unknown)
    24/09/2008, 11:30
    Nagios is an open source framework for monitoring network hosts and services, with the purpose of failure detection and automatic service recovery. This talk presents the main functionality of the Nagios framework, with an emphasis on practical issues and useful how-tos.
  3. Emir Imamagic (Unknown)
    24/09/2008, 12:00
    A site-level grid services monitoring prototype based on the Nagios fabric monitoring system was developed within the CE ROC in EGEE-II, and was used as a core component for the WLCG Grid Service Monitoring working group activities. The prototype enables sites to receive instant notification in case of host and service failures, and provides them with results from global monitoring systems...