Speaker
Dr
Iosif Legrand
(CALTECH)
Description
The MonALISA (Monitoring Agents in A Large Integrated Services Architecture) system
provides a distributed service for monitoring, control and global optimization of
complex grid systems and networks for high energy physics, and many other fields of
data-intensive science. It is based on an ensemble of autonomous multi-threaded,
agent-based subsystems which are registered as dynamic services and are able to
collaborate and cooperate in performing a wide range of monitoring and decision tasks
in large-scale distributed applications. MonALISA’s services can also be
discovered and used by other services or clients that require such information.
An essential part of managing global-scale systems such as grids is a monitoring
system that can track the many site facilities, networks, and tasks in progress,
all in real time. The monitoring information gathered is also essential for
developing the required higher-level services and the components of the Grid
system that provide decision support, and some degree of automated decision-making,
to help maintain and optimize workflow through the Grid.
The MonALISA software architecture simplifies the construction, operation and
administration of complex systems by: (1) allowing registered services to interact in
a dynamic and robust way; (2) allowing the system to adapt when devices or services
are added or removed, with no user intervention; (3) providing mechanisms for
services to register and describe themselves, so that services can intercommunicate
and use other services without prior knowledge of the services' detailed implementation.
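The three properties listed above (dynamic interaction, adaptation when services come and go, and self-description) can be illustrated with a small in-memory registry. This is only a sketch under stated assumptions: the names `ServiceDescription`, `Registry`, `subscribe`, `register`, `unregister` and `lookup` are hypothetical and are not MonALISA's actual API, and a real deployment would use a networked lookup service rather than a local dictionary.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class ServiceDescription:
    """Self-description published by a service at registration (hypothetical)."""
    name: str
    kind: str  # e.g. "monitoring" or "scheduler"


class Registry:
    """Toy registry: services register themselves, clients find them by kind."""

    def __init__(self) -> None:
        self._services: Dict[str, ServiceDescription] = {}
        self._listeners: List[Callable[[str, ServiceDescription], None]] = []

    def subscribe(self, listener: Callable[[str, ServiceDescription], None]) -> None:
        # Listeners are told about every addition and removal, so clients
        # can adapt without user intervention.
        self._listeners.append(listener)

    def register(self, desc: ServiceDescription) -> None:
        self._services[desc.name] = desc
        self._notify("added", desc)

    def unregister(self, name: str) -> None:
        desc = self._services.pop(name, None)
        if desc is not None:
            self._notify("removed", desc)

    def lookup(self, kind: str) -> List[ServiceDescription]:
        # Clients match on the published description only; they need no
        # knowledge of how the service is implemented.
        return [d for d in self._services.values() if d.kind == kind]

    def _notify(self, event: str, desc: ServiceDescription) -> None:
        for listener in self._listeners:
            listener(event, desc)


# Usage: two monitoring services appear, then one disappears.
registry = Registry()
events = []
registry.subscribe(lambda event, desc: events.append((event, desc.name)))
registry.register(ServiceDescription("site-a-monitor", "monitoring"))
registry.register(ServiceDescription("site-b-monitor", "monitoring"))
registry.unregister("site-a-monitor")
found = [d.name for d in registry.lookup("monitoring")]
```

After the removal, a lookup for monitoring services returns only the remaining one, and the subscribed client has seen each change as it happened.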
MonALISA’s flexible access to any monitoring information and its support for alarm
triggers and agents able to take immediate actions in response to abnormal system
behavior are being used to help manage and improve the working efficiency of the
site facilities and of the overall Grid system being monitored. These management and
global optimization functions are performed by higher-level agent-based services.
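As a rough illustration of an alarm trigger that reacts to abnormal behavior, the sketch below fires an action when a monitored value first crosses a threshold and re-arms once the value returns to normal. The class `AlarmTrigger`, the parameter name `cpu_load`, the threshold and the node names are all assumptions made for this example; they do not describe MonALISA's own trigger mechanism.

```python
from typing import Callable, List, Set, Tuple


class AlarmTrigger:
    """Hypothetical threshold trigger: acts once per excursion, per source."""

    def __init__(
        self,
        parameter: str,
        threshold: float,
        action: Callable[[str, str, float], None],
    ) -> None:
        self.parameter = parameter
        self.threshold = threshold
        self.action = action
        self._active: Set[str] = set()  # sources currently in alarm state

    def observe(self, source: str, parameter: str, value: float) -> None:
        if parameter != self.parameter:
            return  # this trigger watches a single parameter
        if value > self.threshold:
            if source not in self._active:
                # First sample above the threshold: take the action once.
                self._active.add(source)
                self.action(source, parameter, value)
        else:
            # Back to normal: re-arm the trigger for this source.
            self._active.discard(source)


# Usage: feed a stream of (source, parameter, value) samples to the trigger.
actions: List[Tuple[str, float]] = []
trigger = AlarmTrigger("cpu_load", 0.9, lambda s, p, v: actions.append((s, v)))
samples = [
    ("node01", "cpu_load", 0.50),
    ("node01", "cpu_load", 0.95),  # fires for node01
    ("node01", "cpu_load", 0.97),  # still in alarm, no repeat
    ("node02", "cpu_load", 0.99),  # fires for node02
    ("node01", "cpu_load", 0.40),  # node01 re-arms
    ("node01", "cpu_load", 0.92),  # fires again for node01
]
for sample in samples:
    trigger.observe(*sample)
```

Tracking the alarm state per source keeps one noisy node from masking, or repeatedly re-reporting, a problem on another.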
Current applications of MonALISA’s higher level services include optimized dynamic
routing of different types of applications, and distributed job scheduling, among a
large set of grid facilities. MonALISA is currently used around the clock in several
major grid projects, and has proven to be both highly scalable and reliable.
More than 250 services are running at sites around the world, collecting information
about computing facilities (more than 15 000 nodes), local and wide area network
traffic, and the state and progress of the many thousands of grid jobs executing at
any one time. It is also used to collect accounting information in grid systems.
Primary author
Dr
Iosif Legrand
(CALTECH)
Co-authors
Adrian Muraru
(Polytechnic University of Bucharest)
Alexandru Costan
(Polytechnic University of Bucharest)
Catalin Cirstoiu
(CERN)
Ciprian Dobre
(Polytechnic University of Bucharest)
Costin Grigoras
(Polytechnic University of Bucharest)
Prof.
Harvey Newman
(CALTECH)
Lucian Musat
(Polytechnic University of Bucharest)
Mihaela Toarta
(Polytechnic University of Bucharest)