Speaker
A. Anjum
(NIIT)
Description
In the context of the interactive Grid-Enabled Analysis Environment
(GAE), physicists want bi-directional interaction with the jobs
they submit. In one direction, monitoring information about the
job, and hence a “progress bar”, should be provided to them. In the other
direction, physicists should be able to control their jobs. Before
submission, they may direct the job to some specified resource or
computing element. Before execution, its parameters may be changed or
it may be moved to another location. During execution, it should be
possible to fetch its intermediate results or to move it to another
location. Also, physicists should be able to kill, restart, hold and
resume their jobs.
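The control operations listed above can be pictured as a small state machine. The following is only an illustrative sketch, not the service described in the paper: the state names, the transition table, and the `Job` class are all assumptions introduced here.

```python
from enum import Enum


class JobState(Enum):
    SUBMITTED = "submitted"
    RUNNING = "running"
    HELD = "held"
    KILLED = "killed"
    DONE = "done"


# Hypothetical (state, command) -> next-state table; not taken from the paper.
TRANSITIONS = {
    (JobState.SUBMITTED, "start"): JobState.RUNNING,
    (JobState.SUBMITTED, "kill"): JobState.KILLED,
    (JobState.RUNNING, "hold"): JobState.HELD,
    (JobState.RUNNING, "kill"): JobState.KILLED,
    (JobState.RUNNING, "finish"): JobState.DONE,
    (JobState.HELD, "resume"): JobState.RUNNING,
    (JobState.HELD, "kill"): JobState.KILLED,
    (JobState.KILLED, "restart"): JobState.SUBMITTED,
    (JobState.DONE, "restart"): JobState.SUBMITTED,
}


class Job:
    """Toy job record: where it runs, what state it is in, how far it got."""

    def __init__(self, resource):
        self.resource = resource
        self.state = JobState.SUBMITTED
        self.progress = 0.0  # fraction complete, feeds the "progress bar"

    def apply(self, command):
        """Apply a control command, rejecting ones invalid in the current state."""
        key = (self.state, command)
        if key not in TRANSITIONS:
            raise ValueError(f"cannot {command} a job that is {self.state.value}")
        self.state = TRANSITIONS[key]
        if command == "restart":
            self.progress = 0.0  # a restarted job begins again from scratch
        return self.state

    def move(self, resource):
        """Redirect the job to another resource unless it has already ended."""
        if self.state in (JobState.KILLED, JobState.DONE):
            raise ValueError("cannot move a job that has ended")
        self.resource = resource
```

In this sketch, a command that is not valid in the current state (for example, resuming a job that is not held) is rejected rather than silently ignored, so the user always gets feedback on a control request.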
Interactive job execution requires the user to choose, at each step,
between alternative application components, files, or
locations. A dead end may therefore be reached where no solution can be
found, which requires backtracking to undo an earlier
choice. A further requirement is reliable and optimal execution of the job.
The Grid should make some decisions about job execution itself to help
achieve reliable and optimal execution. Reliability can be
achieved with a job recovery mechanism: when a job on the Grid fails,
the recovery mechanism should resubmit it on either the same
resource or a different one. Checkpointing the job keeps
resource usage low when recovering it from failure.
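As a rough illustration of the recovery idea, the sketch below retries a failed job on the same resource and then falls back to the next one, resuming from the last checkpoint instead of repeating completed steps. The function name, its parameters, and the step-based job model are assumptions made for this example and do not come from the paper.

```python
def run_with_recovery(run_step, n_steps, resources, retries_per_resource=1):
    """Run steps 0..n_steps-1, resubmitting after failures.

    run_step(resource, step) is a hypothetical callable that raises
    RuntimeError when the step fails on that resource.
    Returns (resource that finished the job, list of recorded failures).
    """
    checkpoint = 0  # index of the next step that still has to run
    failures = []
    for resource in resources:
        # One initial attempt plus the allowed retries on the same resource.
        for _attempt in range(retries_per_resource + 1):
            try:
                for step in range(checkpoint, n_steps):
                    run_step(resource, step)
                    checkpoint = step + 1  # save progress after each step
                return resource, failures
            except RuntimeError as err:
                failures.append((resource, checkpoint, str(err)))
    raise RuntimeError("job failed on every available resource")
```

Because the checkpoint survives a failure, a resubmitted job only executes the steps that have not yet completed, which is the sense in which checkpointing saves resources during recovery.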
This paper describes the architecture and design of an autonomous Grid
service that fulfills the requirements stated above for
interactivity in Grid-enabled data analysis.
Primary authors
A. Ali
(NIIT)
A. Anjum
(NIIT)
C. Steenberg
(CALIFORNIA INSTITUTE OF TECHNOLOGY)
F. van Lingen
(CALIFORNIA INSTITUTE OF TECHNOLOGY)
H. Newman
(CALIFORNIA INSTITUTE OF TECHNOLOGY)
I. Willers
(CERN)
J. Bunn
(CALIFORNIA INSTITUTE OF TECHNOLOGY)
M. A. Zafar
(NIIT)
M. Thomas
(CALIFORNIA INSTITUTE OF TECHNOLOGY)
R. Cavanaugh
(University of Florida)
R. McClatchey
(UWE)