SAMGrid Monitoring Service and its Integration with MonALisa
Presented by A. LYON on 29 Sep 2004 from 10:00 to 10:00
Session: Poster Session 2
Track: Track 4 - Distributed Computing Services
Board #: 38
The SAMGrid team is in the process of implementing a monitoring and information service, which fulfills several important roles in the operation of the SAMGrid system and will replace the first generation of monitoring tools in the current deployments. The first-generation tools are in general based on text logfiles and represent solutions that are neither scalable nor maintainable. The roles of the monitoring and information service are: 1) providing diagnostics for troubleshooting the operation of SAMGrid services; 2) providing support for monitoring at the level of user jobs; 3) providing runtime support for local configuration and other information which currently must be stored centrally (thus moving the system toward greater autonomy for the SAM station services, which include cache management and job management services); 4) providing intelligent collection of statistics in order to enable performance monitoring and tuning. The architecture of this service is quite flexible, permitting input from any instrumented SAM application or service. It will allow multiple storage backends for archiving of (possibly) filtered monitoring events, as well as real-time information displays and an active notification service for alarm conditions. This service will be able to export, in a configurable manner, information to higher-level Grid monitoring services, such as MonALisa. We describe our experience to date with using a prototype version together with MonALisa.
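As an illustration only, the event flow described in the abstract (instrumented services publishing events, several backends each archiving a filtered subset, and active notification on alarm conditions) could be sketched roughly as follows. Every name here (MonitoringEvent, Backend, EventRouter, the severity levels) is hypothetical and chosen for this sketch; none of it is taken from the SAMGrid code or from the MonALisa API.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class MonitoringEvent:
    """Hypothetical event emitted by an instrumented service."""
    source: str    # assumed: name of the emitting service, e.g. "station"
    severity: str  # assumed levels: "info", "warning", "error"
    message: str


@dataclass
class Backend:
    """Hypothetical storage backend that archives only events passing its filter."""
    name: str
    accepts: Callable[[MonitoringEvent], bool]
    stored: List[MonitoringEvent] = field(default_factory=list)

    def offer(self, event: MonitoringEvent) -> None:
        # Archive the event only if this backend's filter accepts it.
        if self.accepts(event):
            self.stored.append(event)


class EventRouter:
    """Fans each published event out to all backends and raises alarms."""

    def __init__(self,
                 backends: List[Backend],
                 alarm_handlers: List[Callable[[MonitoringEvent], None]]) -> None:
        self.backends = backends
        self.alarm_handlers = alarm_handlers

    def publish(self, event: MonitoringEvent) -> None:
        for backend in self.backends:
            backend.offer(event)
        # Assumption for this sketch: an "error" event is an alarm condition.
        if event.severity == "error":
            for handler in self.alarm_handlers:
                handler(event)
```

In such a design, an exporter to a higher-level Grid monitoring service would be one more backend with its own filter, which is one way to keep the export configurable.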