Speaker
Dr
Sanjay Padhi
(UCSD)
Description
This paper presents a web-based Job Monitoring framework for individual Grid sites that lets users follow their jobs in detail in quasi-real time. The framework consists of several independent components: (a) a set of sensors that run on the site CE and worker nodes and update a database, (b) a simple yet extensible web services framework, and (c) an Ajax-powered web interface with the look, feel, and control of a desktop application. The monitoring framework supports LSF, Condor, and PBS-like batch systems.
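As a rough illustration of component (a), the sketch below shows how a sensor might write job snapshots into a database. It is not the framework's actual code: the SQLite backend, the `jobs` table layout, and the function names `init_db` and `record_jobs` are all assumptions made for this example, and the step that queries the batch system is left out.

```python
import sqlite3


def init_db(conn):
    """Create the (hypothetical) jobs table if it does not exist yet."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS jobs (
               job_id  TEXT PRIMARY KEY,
               owner   TEXT,
               vo      TEXT,
               status  TEXT,
               updated REAL
           )"""
    )


def record_jobs(conn, jobs, now):
    """Insert or overwrite one row per job, stamped with the sample time.

    `jobs` is a list of dicts with keys job_id, owner, vo and status,
    as a sensor might produce after parsing batch system output.
    """
    conn.executemany(
        "INSERT OR REPLACE INTO jobs (job_id, owner, vo, status, updated) "
        "VALUES (?, ?, ?, ?, ?)",
        [(j["job_id"], j["owner"], j["vo"], j["status"], now) for j in jobs],
    )
    conn.commit()
```

Keying the table on the job identifier means each sensor pass overwrites the previous state of a job, so the web services layer always reads the latest snapshot.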
This is the first such monitoring system in which an X509-authenticated web interface can be accessed seamlessly by both end users and site administrators. A site administrator has access to all available information, whereas a user can view only the jobs of the Virtual Organizations (VOs) they belong to.
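The visibility rule described above can be sketched as a simple filter. This is a minimal sketch, not the framework's implementation: the `Job` record, the `visible_jobs` function, and the idea that the caller's role and VO memberships have already been derived from the X509 credentials are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    # Hypothetical job record; field names are illustrative only.
    job_id: str
    owner: str
    vo: str
    status: str


def visible_jobs(jobs, is_admin, user_vos):
    """Return the jobs a caller may see.

    Site administrators see every job; other users see only jobs
    that belong to one of their Virtual Organizations.
    """
    if is_admin:
        return list(jobs)
    allowed = set(user_vos)
    return [job for job in jobs if job.vo in allowed]
```

For example, a user who belongs only to a VO named `cms` would receive the `cms` jobs from a mixed list, while an administrator would receive the full list unchanged.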
The monitoring framework design supports several deployment scenarios. A site running a supported batch system may deploy the system as a whole, or it may adapt its existing site sensors and reuse them with our web services components. A site may even prefer to build the web server independently and use only the Ajax-powered web interface.
Finally, the system is being used to monitor a glideinWMS instance. This broadens its scope significantly, from a single site to jobs running across multiple sites.
Authors
Dr
Sanjay Padhi
(UCSD)
Subir Sarkar
(Sezione dell' INFN, Pisa)