Preparation for WLCG production from a Tier-1 viewpoint

The GridPP Tier-1 Centre at RAL is one of 10 Tier-1 centres worldwide preparing for the start of LHC data taking in late 2007. It is expected to provide a reliable grid-based computing service running thousands of simultaneous batch jobs with access to a multi-petabyte CASTOR-managed disk storage pool and tape silo, and will support the ATLAS, CMS and LHCb experiments as well as many other experiments already taking or analysing data. The RAL Tier-1 is already well advanced towards readiness for LHC data taking. We describe some of the reliability and performance issues encountered with the various generations of storage hardware in use at RAL, and how these problems were addressed. We describe the networking challenges of moving large volumes of data into and out of the Tier-1 storage systems, and between systems within the Tier-1, together with the changes made to accommodate the expected data volumes. Finally, we describe the scalability and reliability issues encountered with the grid services and the strategies used to minimise the impact of problems, including increasing the number of service hosts, splitting services across several hosts, and moving services to more resilient hardware.

Primary authors

Dr Andrew Sansum (STFC/RAL); Mr Martin Bly (STFC/RAL)

