Speaker
Prof.
Ludek Matyska
(CESNET, z.s.p.o.)
Description
The Logging and Bookkeeping (LB) service is responsible for keeping track of jobs
within a complex Grid environment. Without such a service, users are
unable to find out what happened to their lost jobs and Grid administrators
are not able to improve the infrastructure. The LB service developed
within the EGEE project provides a distributed, scalable solution able to
deal with hundreds of thousands of jobs on large Grids. However, to provide
the necessary scalability and to avoid slowing down the processing of jobs
within the middleware, it is based on a non-blocking asynchronous model.
This means that the order in which events sent to LB by individual parts of
the middleware (user interface, scheduler, computing element, ...) arrive is not
guaranteed. While dealing with such out-of-order events, the LB may
provide information that looks inconsistent with the knowledge the user has
from some other source (e.g. an independent notification about the
job state). The lecture will reveal the internal design of LB, and we will
discuss how the LB results (i.e. the job state) should be interpreted.
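
To illustrate why out-of-order delivery can make a reported job state look stale, here is a minimal sketch in Python. It is not the LB implementation: the Event fields, the integer sequence number, the state names and the JobStateTracker class are all hypothetical, chosen only to show one simple way of deriving a state from events that arrive in arbitrary order.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Event:
    """One logged event (all fields are illustrative assumptions)."""
    source: str  # middleware component that sent the event
    seq: int     # hypothetical sequence number assigned by the sender
    state: str   # job state implied by this event


class JobStateTracker:
    """Keeps the state implied by the highest-sequence event seen so far."""

    def __init__(self) -> None:
        self.latest: Optional[Event] = None

    def receive(self, event: Event) -> None:
        # A late event with a lower sequence number must not roll the
        # reported state back, so it is ignored for state computation.
        if self.latest is None or event.seq > self.latest.seq:
            self.latest = event

    @property
    def state(self) -> str:
        return self.latest.state if self.latest else "UNKNOWN"


def replay(events: List[Event]) -> List[str]:
    """Return the state a query would see after each arriving event."""
    tracker = JobStateTracker()
    seen = []
    for event in events:
        tracker.receive(event)
        seen.append(tracker.state)
    return seen


# Arrival order differs from the order in which the events were produced:
# the RUNNING event (seq 4) is delayed and arrives after DONE (seq 5).
arrivals = [
    Event("user interface", 1, "SUBMITTED"),
    Event("scheduler", 3, "SCHEDULED"),
    Event("scheduler", 2, "WAITING"),
    Event("computing element", 5, "DONE"),
    Event("computing element", 4, "RUNNING"),
]
states = replay(arrivals)
```

In this sketch, a user who learned from another source that the job had started running would still see SCHEDULED until a later event arrives, which is the kind of apparent inconsistency described above.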
While LB deals with active jobs only, Job Provenance (JP) is
designed to store information about all jobs that run on a
Grid indefinitely. All the relevant information needed to re-submit the job in the
same environment is stored, including the computing environment
specification. Users can annotate stored records, providing yet another
metadata layer useful e.g. for job grouping and data mining over the JP.
We will provide basic information about the JP and its use, and we are looking for
feedback to help improve it.
Authors
Dr
Ales Krenek
(CESNET, z.s.p.o.)
Prof.
Ludek Matyska
(CESNET, z.s.p.o.)
Co-authors
Mr
Daniel Kouril
(CESNET, z.s.p.o.)
Mr
Jan Pospisil
(CESNET, z.s.p.o.)
Mr
Jiri Sitera
(CESNET, z.s.p.o.)
Mr
Michal Vocu
(CESNET, z.s.p.o.)
Mr
Milos Mulac
(CESNET, z.s.p.o.)
Mr
Miroslav Ruda
(CESNET, z.s.p.o.)
Mr
Zdenek Salvet
(CESNET, z.s.p.o.)