Speaker
E. Efstathiadis
(BROOKHAVEN NATIONAL LABORATORY)
Description
As a PPDG cross-team joint project, we proposed to study, develop,
implement and evaluate a set of tools that allow Meta-Schedulers to
take advantage of consistent information (such as the information needed
for complex decision-making mechanisms) across both local and Grid
Resource Management Systems (RMS).
We will present and define the requirements and schema by which one
can consistently provide queue attributes for the most common batch
systems (PBS, LSF, Condor, SGE, etc.). We evaluate the best scalable
and lightweight approach to access the monitored parameters from a
client perspective and, in particular, the feasibility of
accessing real-time and aggregate information using the MonALISA
monitoring framework. Client programs are envisioned to function in a
non-centralized, fault-tolerant fashion. Inherent delays as well as
scalability issues of each approach (when deployed at a large number
of sites) will be discussed.
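As an illustration of what a consistent set of queue attributes could look like from a client's point of view, the following is a minimal sketch and not the schema proposed in this work. The QueueInfo fields, the FIELD_MAPS table and the raw key names are all hypothetical placeholders; they do not reproduce the actual output of PBS, LSF or any other batch system tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueueInfo:
    """Hypothetical batch-system-neutral description of one queue."""
    site: str
    batch_system: str
    queue: str
    running_jobs: int
    pending_jobs: int


# Hypothetical mapping from each batch system's raw keys to the common
# attribute names. The raw keys are placeholders, not real tool output.
FIELD_MAPS = {
    "PBS": {"queue": "name", "running_jobs": "run", "pending_jobs": "queued"},
    "LSF": {"queue": "queue_name", "running_jobs": "running", "pending_jobs": "pending"},
}


def normalize(site, batch_system, raw):
    """Translate one raw queue record into the common QueueInfo form."""
    mapping = FIELD_MAPS[batch_system]
    return QueueInfo(
        site=site,
        batch_system=batch_system,
        queue=str(raw[mapping["queue"]]),
        running_jobs=int(raw[mapping["running_jobs"]]),
        pending_jobs=int(raw[mapping["pending_jobs"]]),
    )


def least_loaded(queues):
    """Toy decision rule: pick the queue with the fewest pending jobs."""
    return min(queues, key=lambda q: q.pending_jobs)


if __name__ == "__main__":
    records = [
        normalize("siteA", "PBS", {"name": "long", "run": "10", "queued": "5"}),
        normalize("siteB", "LSF", {"queue_name": "normal", "running": 3, "pending": 2}),
    ]
    print(least_loaded(records))
```

Once every site reports the same attribute names, a Meta-Scheduler can compare queues without knowing which batch system sits behind each one; the selection rule here is deliberately simplistic.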
The MonALISA monitoring framework, being an ensemble of autonomous,
multi-threaded, agent-based systems which are registered as dynamic
services and are able to collaborate and cooperate in performing a
wide range of monitoring tasks in large-scale distributed
applications, is a natural choice for such a project. MonALISA is
designed to easily integrate existing monitoring tools and procedures
and provide information in a dynamic self-describing way to any other
service or client. We intend to demonstrate the usefulness of this
consistent approach for queue monitoring by implementing a monitoring
agent within the STAR Unified Meta-Scheduler (SUMS) framework.
We believe that such developments could greatly benefit Grid
laboratory efforts such as Grid3+ and the Open Science Grid (OSG).
Primary authors
E. Efstathiadis
(BROOKHAVEN NATIONAL LABORATORY)
I. Legrand
(California Institute of Technology)
L. Hajdu
(BNL)