Speaker
Dr Graeme Stewart (University of Glasgow)
Description
When operational, the Large Hadron Collider experiments at CERN will
collect tens of petabytes of physics data per year. The Worldwide LHC
Computing Grid (WLCG) will distribute this data to over two hundred
Tier-1 and Tier-2 computing centres, enabling particle physicists
around the globe to access the data for analysis. Different middleware
solutions exist for effective management of storage systems at
collaborating institutes. Two of these have been widely deployed at
Tier-2 sites: the Disk Pool Manager (DPM) from EGEE and dCache, a
joint project between DESY and FNAL. Two distinct access patterns are
envisaged for these systems. The first involves bulk transfer of data
between different Grid storage elements using protocols such as
GridFTP. The second relates to how physics analysis jobs will read the
data while running on Grid computing resources. Such jobs require a
POSIX-like interface to the storage so that individual physics events
can be extracted. Both DPM and dCache have their own protocols for
POSIX access (rfio and gsidcap, respectively), and it is essential that
these scale with the available computing resources in order to meet
the demands of physics analysis in the LHC era. In this paper we study
the performance of these protocols as a function of the number of
clients simultaneously reading data from the storage system. We
investigate server kernel tuning to optimise the performance of
LAN access. We also consider the performance of these protocols from
the point of view of real ATLAS and CMS analysis jobs.
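
The abstract does not describe the measurement procedure itself; purely as an illustration of the kind of client-scaling test discussed above, the sketch below launches N concurrent reader processes against a set of files and reports aggregate read throughput. The file list, block size, and plain POSIX open/read calls are assumptions for the sketch; in the actual study the readers would access the storage through the rfio or gsidcap client libraries rather than local paths.

```python
#!/usr/bin/env python
"""Illustrative client-scaling read benchmark (not the procedure used in the paper).

Spawns N concurrent reader processes, each streaming one file sequentially,
and prints the aggregate throughput. Paths and parameters are placeholders.
"""
import multiprocessing
import sys
import time

BLOCK_SIZE = 1024 * 1024  # 1 MiB reads; an arbitrary choice for the sketch


def read_file(path):
    """Read one file sequentially and return the number of bytes read."""
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(BLOCK_SIZE)
            if not chunk:
                break
            total += len(chunk)
    return total


def run_test(paths, n_clients):
    """Read the given files with n_clients parallel processes; return MB/s."""
    start = time.time()
    with multiprocessing.Pool(processes=n_clients) as pool:
        bytes_read = sum(pool.map(read_file, paths))
    elapsed = time.time() - start
    return bytes_read / elapsed / 1e6


if __name__ == "__main__":
    # Usage: scaling_test.py file1 file2 ...  (one file per simulated client)
    files = sys.argv[1:]
    for n in (1, 2, 4, 8, 16, 32):
        subset = files[:n]
        if len(subset) < n:
            break
        print("%2d clients: %.1f MB/s aggregate" % (n, run_test(subset, n)))
```

In a real measurement one would also drop the server page cache between runs and vary TCP buffer limits (for example net.core.rmem_max and net.ipv4.tcp_rmem) to expose the effect of the kernel tuning mentioned above; the specific parameters studied are not given in this abstract.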
Primary authors
Mr Andrew Elwell (University of Glasgow)
Dr Graeme Stewart (University of Glasgow)
Dr Greig Cowan (University of Edinburgh)
Dr Paul Millar (University of Glasgow)