Agenda for the storage group EVO Meeting 29th July 2008
=======================================================

Present:
  Greig Cowan (chair and minutes)
  Brian Davies
  John Bland
  Matt Doidge
  Winnie Lacesso
  Ewan McMahon
  Elena Korolkova
  Peter Love

Apologies:
  Jens Jensen

0. Review of actions

See below.

1. Site round-up. What problems have you seen in the last week?

- http://www.gridpp.ac.uk/wiki/GridPP_storage_availability_monitoring

Oxford: DPM is draining 3 production pools onto other empty pools which have the new back planes (replaced by Viglen). dpm-drain is incredibly slow: 2TB of data has taken almost 1 week to move from the 3 servers to the 3 new servers. Once complete, Viglen will return to replace the back planes on the remaining machines.

GC notes that we understand why dpm-drain is so slow:
  1. rfio is used to perform the transfer, so every transfer requires the GSI handshake.
  2. The connection to the DB times out after some short length of time, meaning that dpm-drain has to be constantly restarted.

Once this work is complete, Oxford will have ~90TB usable storage.

Apart from this, no other sites had anything to report. There did not appear to be any other major problems (other than the certificates, see below).

2. Space tokens

BD will circulate details later today concerning the allocations that ATLAS are expecting for the ATLASPRODDISK space token. They will be doing this in units of 0.5TB. Size will be site dependent. Thanks to all sites who have already enabled this. You will be able to use dpm-updatespace to increase the size of the token. Remember to specify the lifetime that you want when updating (since there is a bug in dpm-updatespace).

ACTION: Brian to circulate space token details to sites.

3. Certificates

UCL-Central, UCL-HEP, Cambridge and Durham were all hit by the certificate upgrade process last week.
Sites need to remember that when using DPM, the head node certificates have to be updated at /etc/grid-security and copies (with correct names and permissions) made at /etc/grid-security/dpmmgr. YAIM will do this for you.

4. DPM-xrootd

GC reported on his ongoing investigations into using xroot to access files on DPM. This is being done with the production system at Edinburgh and some of GC's analysis jobs (so it reflects real-world usage). The problem with support for >2GB files has been fixed. GC has now stumbled across what appears to be a memory leak in the xroot client libraries. The same analysis job completes successfully when using rfio to access files, but is killed off by the Edinburgh batch system when using xroot for the data access. GC is in contact with the developers to try and understand the root of the problem.

5. New version of DPM admin toolkit available

v1.2.0-4 is now available in the sys-man yum repository. This contains a new tool, dpm-listspaces, from Michel Jouvin. It is a replacement for dpm-qryconf and can also be used with the --gip option to act as the new DPM information provider. It will eventually move into the DPM release. Sites should familiarise themselves with its usage. Sites have installed this and are using it.

6. AOB

BD reported problems at the Tier-1 due to the CASTOR DB upgrade. FTS for ATLAS is not working, so T2 sites will be seeing a build-up of files. This came about as part of the move to using Oracle RAC as the backend. The LHCb move went fine; the database for ATLAS is much larger than that for LHCb.

T2s should see the usage of ATLASMCDISK increasing. Soon ATLAS will move over to ATLASPRODDISK.
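
Note on item 3 (certificates): for sites that do not rerun YAIM, the copy step can be sketched roughly as below. This is only an illustration, not the YAIM implementation. The file names hostcert.pem, hostkey.pem, dpmcert.pem and dpmkey.pem and the permission bits are assumptions that should be checked against the DPM admin documentation, and the function name is hypothetical. The sketch does not change file ownership, which normally also has to be set to the DPM service account (as root).

```python
import shutil
from pathlib import Path


def install_dpm_cert_copies(grid_security: Path, subdir: str = "dpmmgr") -> dict:
    """Copy the host certificate and key into <grid_security>/<subdir>.

    Returns a mapping of installed file name -> permission bits applied.
    """
    # Assumed mapping: source name -> (destination name, permission bits).
    mapping = {
        "hostcert.pem": ("dpmcert.pem", 0o644),
        "hostkey.pem": ("dpmkey.pem", 0o400),
    }
    target_dir = grid_security / subdir
    target_dir.mkdir(mode=0o755, exist_ok=True)

    installed = {}
    for src_name, (dst_name, mode) in mapping.items():
        dst = target_dir / dst_name
        if dst.exists():
            # An old read-only copy must be made writable before overwriting.
            dst.chmod(0o600)
        shutil.copyfile(grid_security / src_name, dst)
        dst.chmod(mode)
        installed[dst_name] = mode
    # Ownership is deliberately left untouched here; on a real head node
    # the copies would also need to be owned by the DPM service account.
    return installed
```

On a head node this would be called as install_dpm_cert_copies(Path("/etc/grid-security")) after the new host certificate and key are in place. Running it against a scratch directory first is a cheap way to check the resulting names and modes.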
========================================================================

ACTIONS

Actions (correct list this time):

ID   Date        Action                                                          Owner         Priority  Status
237  17/10/2007  Test and stress test DPM on Lustre                              Greig/Andrew  Low       Open
247  12/12/2007  Circulate "usable storage" for discussion                       Jens          Med       Open
263  6/2/2008    Investigate publishing role acbrs for CASTOR                    Jens          Low       Open
267  6/2/2008    Blog item about SRM2 (protocol) work                            Jens          Med       Open
276  5/2/2008    Further benchmarking tests to compare performance of xfs        Andrew/Greig  Low       Open
277  16/7/2008   Peter to update wiki with Lancaster's success story                                     Closed
278  16/7/2008   Circulate Andy Pickford's notes to mailing list (this was done)                         Closed

NEW ACTION
----------

279  30/7/2008   Brian to circulate space token details to sites.                                        Open