Mr Simon Liu (TRIUMF)
We describe in this paper the design and implementation of Tapeguy, a high-performance, non-proprietary Hierarchical Storage Management (HSM) system that is interfaced to dCache for efficient tertiary storage operations. The system has been successfully deployed at the Canadian Tier-1 Centre at TRIUMF. The ATLAS experiment will collect a very large amount of data (approximately 3.5 petabytes each year). An efficient HSM system will play a crucial role in the success of the ATLAS Computing Model, which is driven by intensive, large-scale data analysis activities performed around the clock on the Worldwide LHC Computing Grid infrastructure. Tapeguy is written in Perl. It controls and manages data and tape libraries. Its architecture is scalable and includes Dataset Writing control, a Readback Queuing mechanism, and I/O tape drive load balancing, as well as on-demand allocation of resources. A central MySQL database records metadata for every file and transaction (for audit and performance evaluation), as well as an inventory of library elements. Tapeguy Dataset Writing groups files that are close in time and of similar type. Optional dataset path control dynamically allocates tape families and assigns tapes to them. Tape flushing is triggered by one of several strategies: elapsed time, a threshold, or an external callback mechanism. Tapeguy Readback Queuing reorders all read requests using a 'scan algorithm', avoiding unnecessary tape loading and unloading. Implementation of priorities will guarantee timely file delivery to all clients.
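The Readback Queuing idea can be illustrated with a minimal sketch. This is not Tapeguy's code (Tapeguy is Perl-based and its internals are not given here); it is a Python illustration under stated assumptions: each read request is assumed to carry a tape label and a position on that tape, and the names `ReadRequest`, `reorder_reads`, and `count_mounts` are hypothetical. Requests are grouped by tape so that each tape is mounted once, and the files on a tape are read in a single front-to-back sweep.

```python
from collections import defaultdict
from typing import NamedTuple


class ReadRequest(NamedTuple):
    """A pending read; all field names are illustrative assumptions."""
    file_id: str
    tape: str      # label of the tape holding the file
    position: int  # position of the file on that tape


def reorder_reads(requests, mounted_tape=None):
    """Group reads by tape, then order each group by position on tape."""
    by_tape = defaultdict(list)
    for req in requests:
        by_tape[req.tape].append(req)

    # Serve the already-mounted tape first (no extra mount needed),
    # then the tapes with the most pending requests.
    tape_order = sorted(
        by_tape,
        key=lambda t: (t != mounted_tape, -len(by_tape[t]), t),
    )

    ordered = []
    for tape in tape_order:
        # One sweep along the tape, front to back.
        ordered.extend(sorted(by_tape[tape], key=lambda r: r.position))
    return ordered


def count_mounts(requests, mounted_tape=None):
    """Count how many tape mounts a given request order would need."""
    mounts = 0
    current = mounted_tape
    for req in requests:
        if req.tape != current:
            mounts += 1
            current = req.tape
    return mounts


if __name__ == "__main__":
    pending = [
        ReadRequest("a", "T2", 50),
        ReadRequest("b", "T1", 30),
        ReadRequest("c", "T2", 10),
        ReadRequest("d", "T1", 5),
        ReadRequest("e", "T2", 20),
    ]
    print("mounts in arrival order:", count_mounts(pending))
    print("mounts after reordering:", count_mounts(reorder_reads(pending)))
```

In this sketch the arrival order alternates between two tapes, so serving it as-is would require a mount for every request, while the reordered queue needs one mount per tape. Priorities, which the abstract mentions, are left out; one plausible extension would be to add a priority field to the tape sort key.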
Presentation type: oral