21–27 Mar 2009
Prague
Europe/Prague timezone

A Grid Job Monitoring System

23 Mar 2009, 08:00
1h

Prague Congress Centre, 5. května 65, 140 00 Prague 4, Czech Republic
Board: Monday 075
Type: poster
Track: Distributed Processing and Analysis
Session: Poster session

Speaker

Dr Sanjay Padhi (UCSD)

Description

This paper presents a web-based job monitoring framework for individual Grid sites that lets users follow their jobs in detail in quasi-real time. The framework consists of three independent components: (a) a set of sensors that run on the site Computing Element (CE) and worker nodes and update a database, (b) a simple yet extensible web services framework, and (c) an Ajax-powered web interface whose look-and-feel and controls resemble those of a desktop application. The framework supports LSF, Condor and PBS-like batch systems. This is the first such monitoring system in which an X509-authenticated web interface can be accessed seamlessly by both end users and site administrators. A site administrator has access to all available information, whereas a user can view only the jobs of the Virtual Organizations (VOs) to which he or she belongs. The design supports several deployment scenarios. A site running a supported batch system may deploy the system as a whole, or adapt and reuse its existing sensors with our web services components. A site may even build the web server independently and use only the Ajax-powered web interface. Finally, the system is being used to monitor a glideinWMS instance, which broadens its scope significantly by allowing it to monitor jobs across multiple sites.
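
The access rule described above (administrators see everything, users see only the jobs of their own VOs) can be illustrated with a minimal sketch. It is not the framework's actual code: the `jobs` table layout, the `Identity` class and the function names are all hypothetical, and an in-memory SQLite database stands in for whatever database the sensors really update.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Identity:
    """Caller identity, as it might be derived from an X509 certificate (hypothetical)."""
    dn: str
    vos: frozenset
    is_admin: bool = False


def open_db():
    """Create an in-memory database with an assumed, simplified jobs table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE jobs ("
        "job_id TEXT PRIMARY KEY, owner TEXT, vo TEXT, batch TEXT, status TEXT)"
    )
    return conn


def record_job(conn, job_id, owner, vo, batch, status):
    """What a sensor might do: insert a job row, or overwrite it on a status change."""
    conn.execute(
        "INSERT OR REPLACE INTO jobs VALUES (?, ?, ?, ?, ?)",
        (job_id, owner, vo, batch, status),
    )


def visible_jobs(conn, who):
    """Return (job_id, vo, status) rows the caller is allowed to see."""
    if who.is_admin:
        # Site administrators see every job.
        return conn.execute(
            "SELECT job_id, vo, status FROM jobs ORDER BY job_id"
        ).fetchall()
    if not who.vos:
        # A user with no VO membership sees nothing.
        return []
    # Ordinary users are restricted to jobs of the VOs they belong to.
    marks = ",".join("?" for _ in who.vos)
    return conn.execute(
        f"SELECT job_id, vo, status FROM jobs WHERE vo IN ({marks}) ORDER BY job_id",
        sorted(who.vos),
    ).fetchall()
```

In a real deployment the identity would come from the authenticated web session rather than being constructed by hand, and the filter would sit in the web services layer between the database and the Ajax interface.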

Primary authors

Dr Sanjay Padhi (UCSD)
Subir Sarkar (Sezione dell' INFN, Pisa)
