27 September 2004 to 1 October 2004
Interlaken, Switzerland
Europe/Zurich timezone

MonALISA: An Agent Based, Dynamic Service System to Monitor, Control and Optimize Grid based Applications.

30 Sept 2004, 16:30
20m
Theatersaal (Interlaken, Switzerland)

oral presentation Track 4 - Distributed Computing Services

Speaker

I. Legrand (CALTECH)

Description

The MonALISA (MONitoring Agents in A Large Integrated Services Architecture) system is a scalable Dynamic Distributed Services Architecture based on the mobile code paradigm. An essential part of managing a global system such as the Grid is a monitoring system that can track the many site facilities, networks, and all the tasks in progress, in real time. MonALISA is designed to easily integrate existing monitoring tools and procedures and to provide this information in a dynamic, self-describing way to any other services or clients. The monitoring information gathered is essential for developing higher-level services that provide decision support, and eventually some degree of automated decisions, to help maintain and optimize workflow through the Grid. MonALISA is an ensemble of autonomous, multi-threaded, agent-based subsystems which are registered as dynamic services and are able to collaborate and cooperate in performing a wide range of monitoring, data processing and control tasks in large-scale distributed applications. We also present the development of specialized higher-level services, implemented as distributed mobile agents in the MonALISA framework, to control and globally optimize tasks such as grid scheduling, real-time data streaming or effective file replication. The system is currently used to monitor several large-scale systems and provides detailed information for computing nodes, LAN and WAN network components, job execution and application-specific parameters. This distributed system has proved to be reliable and able to correctly handle connectivity problems, and is running around the clock on more than 120 sites.
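The pattern described above (pluggable monitoring modules feeding a service that pushes self-describing results to interested clients) can be illustrated with a minimal sketch. This is not MonALISA's actual API: `Result`, `MonitoringService`, `add_module`, `subscribe`, and `collect_once` are hypothetical names chosen for illustration, and the sketch runs in a single process rather than as distributed, registered services.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Result:
    """One self-describing measurement: it carries its own naming metadata."""
    site: str
    node: str
    parameter: str
    value: float
    timestamp: float


# Hypothetical types: a module returns fresh measurements when called;
# a subscriber is a (predicate, callback) pair supplied by a client.
Module = Callable[[], List[Result]]
Predicate = Callable[[Result], bool]
Callback = Callable[[Result], None]


class MonitoringService:
    """Illustrative site-level service, not the real MonALISA service."""

    def __init__(self, site: str) -> None:
        self.site = site
        self._modules: Dict[str, Module] = {}
        self._subscribers: List[Tuple[Predicate, Callback]] = []

    def add_module(self, name: str, module: Module) -> None:
        # Wrap an existing monitoring tool as a callable and plug it in.
        self._modules[name] = module

    def subscribe(self, predicate: Predicate, callback: Callback) -> None:
        # Clients state what they want; matching results are pushed to them.
        self._subscribers.append((predicate, callback))

    def collect_once(self) -> int:
        """Run every module once and return how many results were delivered."""
        delivered = 0
        for module in self._modules.values():
            try:
                results = module()
            except Exception:
                continue  # one failing module must not stop the others
            for result in results:
                for predicate, callback in self._subscribers:
                    if predicate(result):
                        callback(result)
                        delivered += 1
        return delivered


def fake_load_module() -> List[Result]:
    # Stand-in for a real probe; the values are made up for the example.
    return [
        Result("site-a", "node01", "load1", 0.42, 0.0),
        Result("site-a", "node02", "load1", 1.37, 0.0),
    ]


def broken_module() -> List[Result]:
    raise RuntimeError("probe unreachable")


service = MonitoringService("site-a")
service.add_module("load", fake_load_module)
service.add_module("broken", broken_module)

received: List[Result] = []
service.subscribe(lambda r: r.parameter == "load1" and r.value > 1.0,
                  received.append)
delivered = service.collect_once()
```

Here the client subscribes with a predicate instead of polling, so only the `node02` measurement reaches it, and the failing module is skipped without interrupting collection. A real deployment would add service registration and discovery, threading, and network transport, none of which this sketch attempts.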

Primary authors

C. Grigoras (Polytechnic University Bucharest); C. Cirstoiu (Polytechnic University Bucharest); C. Dobre (Polytechnic University Bucharest); H. Newman (Caltech); I. Legrand (CALTECH); M. Toarta (Polytechnic University Bucharest); R. Voicu (CERN)
