Speaker
Dr
Graeme A Stewart
(University of Glasgow)
Description
Data management has proved to be one of the hardest jobs to do in the grid
environment. In particular, file replication has suffered from transport
failures, client disconnections, duplication of current transfers and the resultant
server saturation.
To address these problems, the Globus and gLite grid middlewares offer new services
which improve the resilience and robustness of file replication on the grid. gLite
has the File Transfer Service (FTS) and Globus offers Reliable File Transfer (RFT).
Both of these middleware components offer clients a web service interface to which
they can submit a request to copy a file from one grid storage element to another.
Clients can then return to the web service to query the status of their requested
transfer, while the services schedule, load balance and retry failed transfers among
the received requests.
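As an illustration only, the submit-then-poll pattern described above could be sketched as follows. This is a minimal in-memory toy, not the FTS or RFT interface: the names ToyTransferService, submit, status and run_pending, the state strings, and the retry limit are all hypothetical, and the actual copy operation is supplied by the caller.

```python
import itertools
from dataclasses import dataclass


@dataclass
class Transfer:
    source: str
    destination: str
    state: str = "SUBMITTED"  # hypothetical states: SUBMITTED, DONE, FAILED
    attempts: int = 0


class ToyTransferService:
    """Hypothetical stand-in for a transfer service; not a real grid API."""

    def __init__(self, copy_func, max_attempts=3):
        self._copy = copy_func          # caller-supplied copy operation
        self._max_attempts = max_attempts
        self._jobs = {}
        self._ids = itertools.count(1)

    def submit(self, source, destination):
        # Return the existing job if the same transfer is still pending,
        # so that repeated client requests do not create duplicates.
        for job_id, job in self._jobs.items():
            same = (job.source, job.destination) == (source, destination)
            if same and job.state == "SUBMITTED":
                return job_id
        job_id = next(self._ids)
        self._jobs[job_id] = Transfer(source, destination)
        return job_id

    def status(self, job_id):
        # Clients come back later and poll for the state of their request.
        return self._jobs[job_id].state

    def run_pending(self):
        # One scheduling pass: try every pending transfer once and
        # give up on a transfer after max_attempts failures.
        for job in self._jobs.values():
            if job.state != "SUBMITTED":
                continue
            job.attempts += 1
            try:
                self._copy(job.source, job.destination)
            except OSError:
                if job.attempts >= self._max_attempts:
                    job.state = "FAILED"
            else:
                job.state = "DONE"
```

The client only holds a job identifier, so a client disconnection does not interrupt the transfer, and the retry policy lives in the service rather than in each client.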
In this paper we compare these two services, examining:
a) Architecture and features offered to clients and grid infrastructure providers.
b) Robustness under load: e.g., when large numbers of clients attempt to connect in a
short time or large numbers of transfers are scheduled at once.
c) Behaviour under common failure conditions: loss of network connectivity, failure
of the backend database, sudden client disconnections.
Lessons learned in the deployment of gLite FTS during LCG Service Challenge 3 are
also discussed.
Finally, further development of higher level data management services, including
interaction with catalogs in the gLite File Placement Service and the Globus Data
Replication Service, is considered.
Author
Dr
Graeme A Stewart
(University of Glasgow)
Co-author
Dr
Gavin McCance
(CERN)