9–11 May 2007
Manchester, United Kingdom
Europe/Zurich timezone

The gLite Workload Management System

10 May 2007, 09:00
20m
Manchester, United Kingdom

oral presentation Workflow

Speaker

Mr Alessandro Maraschini (DATAMAT)

With a forward look to future evolution, discuss the issues you have encountered (or that you expect) in using the EGEE infrastructure. Wherever possible, point out the limitations experienced (both in terms of existing services and missing functionality).

The future developments of the gLite WMS will be focused on improving its portability and usability. Dependencies on external software will be reduced on the client side (User Interface) to improve portability. Improved logging and error reporting, as well as an improved monitoring system, will make the service easier to maintain and use. Reducing the resources needed by the WMS processes will increase stability and throughput.

Describe the added value of the Grid for the scientific/technical activity you (plan to) do on the Grid. This should include the scale of the activity and of the potential user community and the relevance for other scientific or business applications

The gLite Workload Management System (WMS) is a collection of components providing a service responsible for the distribution and management of tasks across resources available on a Grid. Its main purpose is to accept a job execution request from a client, find appropriate resources to satisfy it and follow the job until completion.
Several kinds of jobs (or aggregates of jobs) are allowed:
Normal - a simple batch job
MPICH - a parallel application to be run on the nodes of a cluster using the MPICH implementation
Interactive - a job whose standard streams are forwarded to the submitting client so that the user can interact with it
Directed Acyclic Graph (DAG) - a set of jobs where the input, output or execution of one or more jobs may depend on one or more other jobs
Parametric - allows submission of a large number of jobs by specifying a parameterized description
Collection - a possibly large number of independent jobs that can be specified within a single JDL description
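
To make the DAG job type more concrete, the following is a minimal Python sketch of how dependencies between jobs constrain execution order. It is not the WMS implementation: the job names and the `execution_waves` helper are hypothetical, and the ordering uses Python's standard `graphlib` module (Python 3.9 or later) rather than any gLite component.

```python
from graphlib import TopologicalSorter

# Hypothetical DAG: each key is a job name, each value is the set of jobs
# that must finish before it can start. Names are illustrative only.
dependencies = {
    "preprocess": set(),
    "simulate_a": {"preprocess"},
    "simulate_b": {"preprocess"},
    "merge": {"simulate_a", "simulate_b"},
}


def execution_waves(deps):
    """Group jobs into waves; jobs in the same wave do not depend on each other."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # jobs whose dependencies are all done
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave as finished
    return waves


waves = execution_waves(dependencies)
for number, wave in enumerate(waves, start=1):
    print(f"wave {number}: {', '.join(wave)}")
```

Jobs listed in the same wave could in principle run concurrently on different resources, while a job in a later wave has to wait for the jobs it depends on.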

Report on the experience (or the proposed activity). It would be very important to mention key services which are essential for the success of your activity on the EGEE infrastructure.

Intense testing and constant bug fixing have been carried out over the last months in order to improve the job submission rate and service stability. Several new functionalities have been tested and adopted so far (20% faster submission achieved):
The WMProxy user front-end implements an interoperable Web service interface, allowing users to implement their own language-independent client, to adopt the provided multi-language APIs (C++, Java and Python), or to use the C++-based WMS command-line User Interface.
Integration of the Service Discovery functionality within the UI provides the user with a new set of possible service endpoints by querying external databases, without the need for manual reconfiguration.
Automatic archiving and compression of job sandbox files, along with the possibility of making jobs share the same sandbox, has dramatically reduced network traffic.
The Job Perusal functionality allows users to monitor the actual output files being produced during the job lifecycle.
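
As a rough illustration of why archiving, compressing and sharing a sandbox can reduce network traffic, the sketch below packs a made-up sandbox into a gzip-compressed tar archive in memory and compares sizes. The file names, contents and the `pack_sandbox` helper are assumptions for illustration; the archive format actually used by the WMS is not described here and may differ.

```python
import io
import tarfile

# Hypothetical sandbox contents: file name -> bytes. Real sandboxes hold the
# user's scripts and input files; these are invented for illustration.
sandbox = {
    "run.sh": b"#!/bin/sh\n./analyse input.dat\n",
    "input.dat": b"0.0 1.0 2.0 3.0\n" * 5000,
}


def pack_sandbox(files):
    """Return a gzip-compressed tar archive of the given files as bytes."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


archive_bytes = pack_sandbox(sandbox)
raw_size = sum(len(data) for data in sandbox.values())

# Three jobs that each ship their own uncompressed copy, versus three jobs
# sharing a single compressed archive.
jobs = 3
print("separate, uncompressed:", jobs * raw_size, "bytes")
print("shared, compressed:    ", len(archive_bytes), "bytes")
```

The saving depends entirely on how compressible the files are and on how many jobs share the sandbox; the repetitive input used here compresses far better than typical binary data would.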

Describe the scientific/technical community and the scientific/technical activity using (planning to use) the EGEE infrastructure. A high-level description is needed (neither a detailed specialist report nor a list of references).

The EGEE-II Joint Research Activity 1 (JRA1) is responsible for the re-engineering of the gLite middleware. One of the fundamental functionalities provided is a system for the submission of jobs and the management of the workload. The institutes involved in this activity are INFN, Datamat and CESNET. The challenge and originality of the activity lie in selecting, potentially re-engineering and integrating a set of reliable production-quality services.

Authors

Mr Alessandro Maraschini (DATAMAT) Mr Fabrizio Pacini (DATAMAT) Mr Francesco Giacomini (INFN)

Co-authors

Mr Marco Pappalardo (INFN) Mr Massimo Sgaravatto (INFN) Mr Salvo Monforte (INFN)
