The STFC CASTOR tape service is responsible for the management of over 80PB of data, including 45PB generated by the LHC experiments for the RAL Tier-1. In the last few years there have been several disruptive changes that have necessitated, or are necessitating, significant changes to the service. At the end of 2016, Oracle, which provided the tape libraries, drives and media, announced that it was leaving the tape market. In 2017, the Echo (Tier-1 disk) storage service entered production and disk-only storage migrated away from CASTOR. Also in 2017, CERN, which provides support for CASTOR, started to test its replacement for CASTOR, called CTA.
Since October 2018, a new shared CASTOR instance has been in production. This instance is a major simplification compared with the previous four instances. In this paper we describe the setup and performance of this instance, which includes two sets of failure-tolerant management nodes that ensure improved reliability and a single unified tape cache that has displayed increased access rates to tape data compared to the previous separate tape cache pools.
In March 2019, a new Spectra Logic tape robot was delivered to RAL. It uses both LTO and IBM media. This paper will describe the tests that were carried out on this system, which include multiple sets of dense and sparse tape reads to assess the throughput performance of the library for various use cases.
Finally, this paper will describe the ongoing work exploring possible new, non-SRM tape management systems that will eventually replace CASTOR.