13–17 Feb 2006
Tata Institute of Fundamental Research
Europe/Zurich timezone

MonALISA: A Distributed Service for Monitoring, Control and Global Optimization

15 Feb 2006, 09:00
9h 10m
Tata Institute of Fundamental Research

Homi Bhabha Road Mumbai 400005 India
poster Grid middleware and e-Infrastructure operation Poster

Speaker

Dr Iosif Legrand (CALTECH)

Description

The MonALISA (Monitoring Agents in A Large Integrated Services Architecture) system provides a distributed service for monitoring, control and global optimization of complex grid systems and networks for high energy physics and many other fields of data-intensive science. It is based on an ensemble of autonomous, multi-threaded, agent-based subsystems that are registered as dynamic services and can collaborate in performing a wide range of monitoring and decision tasks in large-scale distributed applications. MonALISA’s services can also be discovered and used by other services or clients that require such information. An essential part of managing global-scale systems such as grids is a monitoring system that can track the many site facilities, networks, and tasks in progress, all in real time. The monitoring information gathered is also essential for developing the higher-level services and components of the Grid system that provide decision support, and some degree of automated decision-making, to help maintain and optimize workflow through the Grid. The MonALISA software architecture simplifies the construction, operation and administration of complex systems by: (1) allowing registered services to interact in a dynamic and robust way; (2) allowing the system to adapt when devices or services are added or removed, with no user intervention; (3) providing mechanisms for services to register and describe themselves, so that services can intercommunicate and use other services without prior knowledge of their detailed implementation. MonALISA’s flexible access to any monitoring information, and its support for alarm triggers and agents able to take immediate action in response to abnormal system behavior, are being used to help manage and improve the working efficiency of the site facilities and of the overall Grid system being monitored.
These management and global optimization functions are performed by higher-level agent-based services. Current applications of MonALISA’s higher-level services include optimized dynamic routing for different types of applications and distributed job scheduling across a large set of grid facilities. MonALISA is currently used around the clock in several major grid projects and has proven to be both highly scalable and reliable. More than 250 services are running at sites around the world, collecting information about computing facilities (more than 15 000 nodes), local and wide area network traffic, and the state and progress of the many thousands of grid jobs executing at any one time. It is also used to collect accounting information in grid systems.
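
The register, describe, discover and alarm-trigger ideas described above can be illustrated with a small, self-contained sketch. It is not the MonALISA API: the `Registry`, `ServiceDescription` and `AlarmTrigger` classes, their fields, and the site and metric names are all hypothetical, chosen only to show how a client might find services by their self-description and how a trigger might react to an abnormal value.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ServiceDescription:
    """Self-description a service publishes when it registers (hypothetical fields)."""
    name: str
    site: str
    metrics: List[str]


@dataclass
class AlarmTrigger:
    """Calls `action` whenever `metric` is reported above `threshold`."""
    metric: str
    threshold: float
    action: Callable[[str, float], None]


class Registry:
    """Toy in-memory lookup service; an illustration, not the MonALISA API."""

    def __init__(self) -> None:
        self._services: Dict[str, ServiceDescription] = {}
        self._triggers: List[AlarmTrigger] = []

    def register(self, desc: ServiceDescription) -> None:
        self._services[desc.name] = desc

    def unregister(self, name: str) -> None:
        self._services.pop(name, None)

    def discover(self, metric: str) -> List[ServiceDescription]:
        """Find services by what they describe, not by how they are implemented."""
        return [d for d in self._services.values() if metric in d.metrics]

    def add_trigger(self, trigger: AlarmTrigger) -> None:
        self._triggers.append(trigger)

    def report(self, service: str, metric: str, value: float) -> None:
        """Pass one measurement through every registered alarm trigger."""
        for t in self._triggers:
            if t.metric == metric and value > t.threshold:
                t.action(service, value)


registry = Registry()
registry.register(ServiceDescription("farm-a", "site-1", ["cpu_load", "net_in"]))
registry.register(ServiceDescription("farm-b", "site-2", ["cpu_load"]))

alarms: List[str] = []
registry.add_trigger(
    AlarmTrigger("cpu_load", 0.9, lambda s, v: alarms.append(f"{s}: cpu_load={v}"))
)

registry.report("farm-a", "cpu_load", 0.95)  # above threshold, the action runs
registry.report("farm-b", "cpu_load", 0.40)  # below threshold, ignored
registry.unregister("farm-b")                # later lookups simply no longer see it
```

In this sketch, clients only ever call `discover`, so adding or removing a service changes what they find without any reconfiguration on their side. A real deployment would replace the in-memory dictionary with a networked lookup service.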

Primary author

Dr Iosif Legrand (CALTECH)

Co-authors

Adrian Muraru (Polytechnic University of Bucharest)
Alexandru Costan (Polytechnic University of Bucharest)
Catalin Cirstoiu (CERN)
Ciprian Dobre (Polytechnic University of Bucharest)
Costin Grigoras (Polytechnic University of Bucharest)
Prof. Harvey Newman (CALTECH)
Lucian Musat (Polytechnic University of Bucharest)
Mihaela Toarta (Polytechnic University of Bucharest)
