21-25 September 2009
Hotel Barcelo Sants
Europe/Zurich timezone

Monitoring the reliability of MPI support on the EGEE Infrastructure

Sep 22, 2009, 12:15 PM
10m
Sarrià (Hotel Barcelo Sants)

Barcelona

Speaker

Mr Paschalis Korosoglou (AUTH)

Abstract

Over the past year our helpdesk has handled a large number of trouble tickets concerning problems with the various MPI implementations on the Grid infrastructure. In response to these requests, and in order to maintain a continuous view of MPI support on the infrastructure, we have developed and deployed a set of MPI probes that test the installation of MPI flavours at Grid sites. With these probes we test the mpich-1 implementation of MPI-1 along with the mpich-2 and openMPI implementations of the MPI-2 standard. The probes are integrated with the Nagios monitoring framework. We have currently enabled Nagios MPI monitoring on the HellasGrid infrastructure and are in the process of introducing our probes to the wider EGEE monitoring infrastructure in collaboration with the OAT. At the EGEE09 conference we will present this tool together with statistics on the robustness of MPI support on the infrastructure.
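
As a rough illustration of how per-flavour probe results could be reported to Nagios, the sketch below maps pass/fail outcomes to the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). It is not the probes' actual code: the `summarize` function, its input dictionary, the message wording, and the policy of WARNING for partial failure and CRITICAL for total failure are all assumptions made for this example.

```python
from typing import Dict, Tuple

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def summarize(results: Dict[str, bool]) -> Tuple[int, str]:
    """Map per-flavour test outcomes to a Nagios status and message.

    `results` is a hypothetical input: MPI flavour name -> whether its
    test job succeeded. The severity policy below is an assumption.
    """
    if not results:
        return UNKNOWN, "MPI UNKNOWN - no flavours tested"

    # Sort failed flavour names so the message is deterministic.
    failed = sorted(name for name, passed in results.items() if not passed)

    if not failed:
        return OK, "MPI OK - all %d flavours passed" % len(results)
    if len(failed) < len(results):
        # Some flavours work, some do not.
        return WARNING, "MPI WARNING - failed: " + ", ".join(failed)
    # Every tested flavour failed.
    return CRITICAL, "MPI CRITICAL - failed: " + ", ".join(failed)
```

A wrapper script would print the returned message and exit with the returned status, which is how Nagios reads a plugin's result.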
