Dr Graeme Stewart (University of Glasgow)
When operational, the Large Hadron Collider experiments at CERN will collect tens of petabytes of physics data per year. The Worldwide LHC Computing Grid (WLCG) will distribute this data to over two hundred Tier-1 and Tier-2 computing centres, enabling particle physicists around the globe to access the data for analysis. Different middleware solutions exist for effective management of storage systems at collaborating institutes. Two of these have been widely deployed at Tier-2 sites: the Disk Pool Manager (DPM) from EGEE and dCache, a joint project between DESY and FNAL. Two distinct access patterns are envisaged for these systems. The first involves bulk transfer of data between different Grid storage elements using protocols such as GridFTP. The second relates to how physics analysis jobs will read the data while running on Grid computing resources. Such jobs require a POSIX-like interface to the storage so that individual physics events can be extracted. Both DPM and dCache have their own protocols for POSIX access (rfio and gsidcap, respectively) and it is essential that these scale with the available computing resources in order to meet the demands of physics analysis in the LHC era. In this paper we study the performance of these protocols as a function of the number of clients that are simultaneously reading data from the storage. We investigate server kernel tuning to optimise the performance of LAN access. We also consider the performance of these protocols from the point of view of real ATLAS and CMS analysis jobs.