Minutes of the storage EVO conf, 18 June 2008

Present:
Glasgow: Andrew
Lancaster: Matt, Peter
Edinburgh: Greig
Bristol: Winnie
Oxford: Ewan
Sheffield: Elena
Imperial: Duncan
RAL Storage: Jens (chair+mins)

Apologies:
RAL Storage: Brian

0. No actions review this week, again.

1. There is a push to change the interpretation of the GLUE schema in WLCG, to ensure that "all" storage resources are published and not just those reachable via space token descriptions. The discussion is currently taking place on a closed list, with dCache and CASTOR represented; it is not clear who is representing DPM, and apparently nobody is representing the other implementations. Jens happens to be on this list representing the CASTOR information provider, so he will keep the group informed, as well as the NGS people working on the accounting. Jens is continuing to digest the information (still early days) and to figure out what it means for CASTOR.

2. At the CCRC review last week, the UK looked good. There were a few software and OPN glitches, but generally it was OK. dCache had a large number of versions; DPM had a more stable recommended version. It seemed to go OK with both implementations, though. There was a problem with GSIDCAP, which always seemed to pick the highest available interface on multihomed hosts. There was a question about CASTOR to StoRM transfers, regarding block sizes.

There was also a CASTOR meeting last week. The only relevant issue we know of from this meeting is that there is a push to have GSIRFIO for CASTOR at CERN. RAL is unlikely to deploy this version. Greig suggested CASTOR RFIO with GSI would be compatible with DPM's.

3. Software issues

dCache at Manchester. Brian had suggested running dCache with half the pools and doing replication outside dCache to the other ones. Andrew suggested using iSCSI with multipath between machines - dCache would see a single instance of the file but it would be replicated. The file would still be transferred through a single door, so there would be no benefit in the redundancy at the dCache level.

DPM (pools) on SL5. This would be good for Lancaster, because driver support for new hardware looks better in SL5. However, currently there is no support for DPM on SL5. Portability problems are believed to be with library versions and GSI/VOMS. While there have been suggestions to move away from VDT and use a core OpenSSL, OpenSSL support for GSI proxies has to be set at compile time and usually is not in the OS version [however, perhaps it could be in SL.] We'll have to keep watching this.

Bristol (Jon) provided a wealth of information for Lancaster about storage purchasing; it is still relatively specialised, so Peter is also interested in what other people have. Who has the biggest storage per disk server? Glasgow had ~20TB. Oxford had loosely specified it in the tender, aiming for about 10 TB. Andrew suggests talking to Mike Kenyon if you're attending the hepsysman.

We discussed how best to share this information and how much could be made public in general. Whoever did it last is considered to be the expert. In general, the best way to share the information is to raise it at the meeting and then share it by private email: thus, the information is available to anyone interested, but does not go public and does not expose potentially sensitive information.

Actions - we should review them soon.

193  7/3/2007    Document RFIO testing in Wiki for (DPM) site metric       Greig         Med   Open
215  27/6/2007   Report on DPM on Lustre                                   Greig         Med   Open
237  17/10/2007  Test and stress test DPM on Lustre                        Greig/Andrew  Med   Open
247  12/12/2007  Circulate "usable storage" for discussion                 Jens          Med   Open
251  9/1/2008    Report on Italian chap RFIO stress testing                Greig         Low   Open
260  30/1/2008   Follow up with LondonT2 then mgmt re QMUL                 Jens          Med   Open
261  30/1/2008   Follow up with Sergey re Manchester upgrade               Greig         Med   Open
262  6/2/2008    Document gSOAP and CGSI problem                           Jens          Med   Open
263  6/2/2008    Investigate publishing role acbrs for CASTOR              Jens          Med   Open
267  6/2/2008    Blog item about SRM2 (protocol) work                      Jens          Med   Open
272  27/2/2008   Investigate displaying SRM versions on monitoring page    Greig         Med   Open
273  27/2/2008   Forward details on how to publish 2 close SEs to Duncan   Greig         Med   Open
274  5/2/2008    Find out why dpm-drain is so slow                         Greig         Low   Open
275  5/2/2008    investigate DPM database cleaning                         Greig         Low   Open
276  5/2/2008    Further benchmarking tests to compare performance of xfs  Andrew/Greig  Low   Open
278  26/3/2008   Run job timeout tests against dCache (and others..?)      Greig         Med   Open
280  26/3/2008   Document in wiki how to check for draining DPM            Andrew        Med   Open
282  2/4/2008    Raise SL4 JFS (non)support at HEPiX                       Greig         Med   Open
283  2/4/2008    Report on GLUE storage work                               Jens          Med   Open
284  16/4/2008   Put hardware recommendations in wiki                      Greig         Med   Open
285  16/4/2008   Send information about Köln workshop to list              Greg          High  Open
286  16/4/2008   Ponder how to make use of hardware people expertise       ALL           Low   Open
287  14/5/2008   Send metrics information to list                          Jens/Andrew   Med   Open

Chat window

[09:53:31] Andrew Elwell joined
[09:55:55] Matthew Doidge joined
[09:57:19] Ewan Mac Mahon joined
[09:57:39] Peter Love joined
[09:57:51] Andrew Elwell Hi folks - grabbing a coffee - biab
[09:58:12] Jens Jensen Coffee - mmmm
[10:02:35] Winnie Lacesso joined
[10:02:45] Duncan Rand joined
[10:04:10] Greig Cowan joined
[10:04:11] Winnie Lacesso Jens, can you repeat that last bit about accounting please?
[10:04:26] Greig Cowan hi all
[10:04:51] Elena Korolkova joined
[10:05:36] Andrew Elwell nup - I assume the accounting will trickle into APEL / storage accounting pages eventually?
[10:11:03] Andrew Elwell http://indico.cern.ch/conferenceOtherViews.py?view=standard&confId=23563
[10:12:48] Andrew Elwell http://gridpp-ops.blogspot.com/2008/06/ccrc-post-mortem-workshop-day1-am.html
[10:22:19] Andrew Elwell *but* you've gotta hack around with iscsi too
[10:26:05] Andrew Elwell yeah - don\
[10:26:23] Andrew Elwell don't hold your breath was the conclusion
[10:28:13] Ewan Mac Mahon We're got about the same sort of setup at Oxford.
[10:28:46] Andrew Elwell we've just prepared the glasgow tender docs - we're just asking for disk servers and compute nodes no servers or installation tools
[10:33:07] Andrew Elwell left
[10:33:14] Elena Korolkova left
[10:33:17] Peter Love left
[10:33:18] Winnie Lacesso left
[10:33:19] Greig Cowan left
[10:33:22] Duncan Rand left
[10:33:30] Ewan Mac Mahon left