Simon William Fayer (Imperial College Sci., Tech. & Med. (GB)); Stuart Wakefield (Imperial College Sci., Tech. & Med. (GB))
Reading and writing data on a disk-based, high-capacity storage system has long been a troublesome task. Disks handle sequential reads and writes well, but when the two are interleaved performance drops off rapidly because of the time required to move the disk's read-write head(s) to a different position. An obvious solution is to replace the disks with an alternative storage technology, such as solid-state devices, which have no such mechanical limitations; in most applications, however, this is prohibitively expensive. The problem is commonly seen at computing facilities where new data need to be stored while old data are being processed from the same storage, such as WLCG grid sites. In the WLCG case it will only become more prominent as the LHC luminosity increases, creating larger data sets. In this paper we explore the possibility of introducing a fast write cache in front of the storage system to buffer inbound data. This cache allows writes to be coalesced into larger, more efficient blocks before being committed to the primary storage, and allows the commit to be postponed until the primary storage is sufficiently quiescent. We demonstrate that this is a viable solution, using a real WLCG site as an example deployment. Finally, we discuss the steps required to tune the surrounding infrastructure, such as the computer network and the storage metadata server, in order to sustain high write rates to the cache and to allow the data to be flushed to the bulk storage successfully.
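The coalescing and deferred-flush behaviour described above can be illustrated with a minimal Python sketch. It is a toy model only, not the implementation deployed at the site: the class name `CoalescingWriteCache`, the `block_size` parameter, and the `backend_is_quiescent` flag are all hypothetical, and a plain list stands in for the primary storage.

```python
from dataclasses import dataclass, field


@dataclass
class CoalescingWriteCache:
    """Toy write-back cache: buffers small writes and commits them as large blocks."""

    block_size: int  # hypothetical target size of a coalesced block, in bytes
    # Stand-in for the primary (disk) storage: one list entry per backend write.
    backend_writes: list = field(default_factory=list, init=False)
    _pending: bytearray = field(default_factory=bytearray, init=False)

    def write(self, data: bytes) -> None:
        # Inbound data only touches the fast cache; no backend I/O happens here.
        self._pending.extend(data)

    def flush(self, backend_is_quiescent: bool, force: bool = False) -> int:
        """Commit buffered data in block_size chunks; return the number of backend writes."""
        if not (backend_is_quiescent or force):
            return 0  # postpone: the primary storage is assumed busy serving reads
        issued = 0
        # Emit full blocks; a partial trailing block stays cached unless forced out.
        while len(self._pending) >= self.block_size or (force and self._pending):
            block = bytes(self._pending[: self.block_size])
            del self._pending[: self.block_size]
            self.backend_writes.append(block)
            issued += 1
        return issued


cache = CoalescingWriteCache(block_size=8)
for chunk in (b"abc", b"def", b"ghi", b"jkl"):
    cache.write(chunk)                       # four small inbound writes

deferred = cache.flush(backend_is_quiescent=False)   # storage busy: nothing committed
committed = cache.flush(backend_is_quiescent=True)   # storage idle: full blocks committed
drained = cache.flush(backend_is_quiescent=True, force=True)  # push out the remainder
```

In this sketch four 3-byte writes reach the backend as one 8-byte block plus a forced 4-byte remainder, so the backend sees two writes instead of four. How quiescence is actually detected, and what block size is efficient, depend on the storage system and are not specified here.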