Minutes of the storage phone (well, EVO) conf, 13 Aug 2008

Present:
Edinburgh: Greig
Lancaster: Matt
Oxford: Ewan
Liverpool: John
Sheffield: Elena
Bristol: Winnie
RAL Storage: Brian, Jens (chair+mins)

-1. On behalf of GridPP, Jens thanked the group for work done in GridPP2+, referring to email sent by Dave Britton.

0. Actions were discussed. Apparently synchronisation has once again confused the spreadsheet? See actions list below. Investigate keeping actions in Savannah, so everyone can edit them (ACTION Jens).

2. SE scanning incident update.

Winnie had reported ceasefire but what about other sites? Sites should report whether they are still seeing incidents (ACTION).

3. Upcoming HEPiX quick discussion

Apart from Greig's work with the filesystems subgroup, there is no other work planned for HEPiX. Also the deadline for submission of abstracts has now passed.

4. Other issues - round table

Edinburgh: shutting down dCache, finally. Run DPM with xrootd support.
Lancaster: SCSI problems. Need to upgrade pool nodes but they have no floppy.
Oxford: All quiet on the Grid.

LOST AUDIO - it went quiet on EVO, too. Jens and Brian could hear each other which is not trivial with Jens not at RAL.

Bristol: CMS filling up all space. Issues like ATLAS, not deleted properly - "proper" deletion should free space.

As we had lost audio (and were running out of time so no time to reconnect), we had to close the meeting.

5. Hardware corner?

6. AOB

Actions

No.  Date  Action  Owner  Priority  Status
193  07/03/2007  Document RFIO testing in Wiki for (DPM) site metric  Greig  Med  Open
     Reassigned Graeme->Greig 11.07.07
215  27/06/2007  Report on DPM on Lustre  Greig  Med  Open
237  17/10/2007  Test and stress test DPM on Lustre  Greig/Andrew  Low  Open
247  12/12/2007  Circulate "usable storage" for discussion  Jens  Med  Open
     This was closed?
251  09/01/2008  Report on Italian chap RFIO stress testing  Greig  Low  Open
     That was closed.
261  30/01/2008  Follow up with Sergey re Manchester upgrade  Greig  Med  Open
     Ongoing
263  06/02/2008  Investigate publishing role ACBRs for CASTOR  Jens  Low  Open
     No news.
267  06/02/2008  Blog item about SRM2 (protocol) work  Jens  Med  Open
272  27/02/2008  Investigate displaying SRM versions on monitoring page  Greig  Med  Open
     This was closed?
273  27/02/2008  Forward details on how to publish 2 close SEs to Duncan  Greig  Med  Open
     This one was closed.
274  05/02/2008  Find out why dpm-drain is so slow  Greig  Low  Open
     Also closed?
275  05/02/2008  Investigate DPM database cleaning  Greig  Low  Open
     Closed.
276  05/02/2008  Further benchmarking tests to compare performance of xfs  Andrew/Greig  Low  Open
278  26/03/2008  Run job timeout tests against dCache (and others..?)  Greig  Med  Open
282  02/04/2008  Raise SL4 JFS (non)support at HEPiX  Greig  Med  Open
284  16/04/2008  Put hardware recommendations in wiki  Greig  Med  Open
289  09/07/2008  Report how to monitor disks (RAID) - to tbsupport  Peter/John  Med  Open
     An action related to this one was closed.
291  16/07/2008  Update dual homing information for DPM on wiki  Peter  Med  Open
     Closed?
292  30/07/2008  Brian to circulate space token details to sites.  Brian  Med  Open
     This was done, closed.
293  06/08/2008  Matt to email TB-SUPPORT about SCSI problems.  Matt  Med  Open
     Also done, closed.
294  06/08/2008  Investigate discrepancy in space token monitoring tools.  Greig  Med  Open
     The dCache information provider does not publish the GlueChunkKey correctly, which makes it hard to match the sites, VOs, and SEs together. The GlueChunkKey is the GLUE gadget that links objects together, like a pointer.
295  06/08/2008  Contact sites to ensure they are set up properly prior to tomorrow's ATLAS 100% data transfer  Brian  High  Open
     Space tokens are missing or are not set up (see also chat log). The first YES/NO in the list is whether the token is reported by Greig's script, and the second is whether it appears in Greig's monitoring.
     Monitoring should now be updated to make it consistent. The tools have access only within the site. The second step is to check whether enough resources are allocated for each space token. Greig is working on updated monitoring tools.

CHAT LOG

[09:12:31] Greig Cowan joined
[09:13:27] Greig Cowan hello?
[09:47:13] Matthew Doidge joined
[09:56:32] Ewan Mac Mahon joined
[09:57:07] Brian Davies joined
[09:57:36] John Bland joined
[09:58:42] Elena Korolkova joined
[09:59:21] Winnie Lacesso joined
[10:04:18] Ewan Mac Mahon So do we add an action for doing that?
[10:05:38] Winnie Lacesso But no real resolution yet to the SCSI problems.
[10:07:44] Brian Davies
UKI-SOUTHGRID-RALPP SOUTH NO YES PRODDISK DCACHE
UKI-SOUTHGRID-RALPP SOUTH NO YES MCDISK DCACHE
UKI-SOUTHGRID-RALPP SOUTH NO YES DATADISK DCACHE
UKI-SCOTGRID-ECDF SCOT NO YES DATADISK DCACHE
UKI-NORTHGRID-LIV-HEP NORTH NO YES USERDISK DCACHE
UKI-NORTHGRID-LIV-HEP NORTH NO YES PRODDISK DCACHE
UKI-NORTHGRID-LIV-HEP NORTH NO YES MCDISK DCACHE
UKI-NORTHGRID-LIV-HEP NORTH NO YES LOCALGROUP DCACHE
UKI-NORTHGRID-LIV-HEP NORTH NO YES GROUP DCACHE
UKI-NORTHGRID-LIV-HEP NORTH NO YES DATADISK DCACHE
UKI-LT2-RHUL LT2 NO YES PRODDISK DPM
UKI-LT2-RHUL LT2 NO YES MCDISK DPM
UKI-LT2-RHUL LT2 NO YES DATADISK DPM
UKI-SOUTHGRID-RALPP SOUTH NO NO USERDISK DCACHE
UKI-SOUTHGRID-RALPP SOUTH NO NO LOCALGROUP DCACHE
UKI-SOUTHGRID-RALPP SOUTH NO NO GROUP DCACHE
UKI-SOUTHGRID-OX-HEP SOUTH NO NO USERDISK DPM
UKI-SOUTHGRID-OX-HEP SOUTH NO NO LOCALGROUP DPM
UKI-SOUTHGRID-OX-HEP SOUTH NO NO GROUP DPM
UKI-SCOTGRID-ECDF SCOT NO NO PRODDISK DCACHE
UKI-NORTHGRID-MAN-HEP NORTH NO NO USERDISK DCACHE
UKI-NORTHGRID-MAN-HEP NORTH NO NO PRODDISK DCACHE
UKI-NORTHGRID-MAN-HEP NORTH NO NO MCDISK DCACHE
UKI-NORTHGRID-MAN-HEP NORTH NO NO LOCALGROUP DCACHE
UKI-NORTHGRID-MAN-HEP NORTH NO NO GROUP DCACHE
UKI-NORTHGRID-MAN-HEP NORTH NO NO DATADISK DCACHE
UKI-LT2-RHUL LT2 NO NO USERDISK DPM
UKI-LT2-RHUL LT2 NO NO LOCALGROUP DPM
UKI-LT2-RHUL LT2 NO NO GROUP DPM
UKI-LT2-QMUL LT2 NO NO USERDISK DPM
UKI-LT2-QMUL LT2 NO NO PRODDISK DPM
UKI-LT2-QMUL LT2 NO NO MCDISK DPM
UKI-LT2-QMUL LT2 NO NO LOCALGROUP DPM
UKI-LT2-QMUL LT2 NO NO GROUP DPM
UKI-LT2-QMUL LT2 NO NO DATADISK DPM
[10:11:08] Elena Korolkova left
[10:13:39] Greig Cowan http://wn3.epcc.ed.ac.uk/srm/xml/srm_token_space
[10:14:08] Elena Korolkova joined
[10:20:15] Greig Cowan ewan, if you use my DPM monitoting, you might find this plot usefulll to add
[10:20:19] Greig Cowan http://wn3.epcc.ed.ac.uk/srm/xml/srm_token_space_voview?token=.*&site=UKI-SOUTHGRID-OX-HEP
[10:25:38] Greig Cowan similarly, for elena....
[10:25:53] Greig Cowan http://wn3.epcc.ed.ac.uk/srm/xml/srm_token_space_voview?token=.*&site=UKI-NORTHGRID-SHEF-HEP
[10:26:11] Brian Davies ecdf does not look happy though
[10:26:23] Brian Davies wrong emoticon
[10:26:29] Winnie Lacesso Although it is stable for hours/days/weeks at a time. And then not.
[10:27:08] Jens Jensen Oops audio problem?
[10:27:39] Brian Davies can people here RAL
[10:27:48] Winnie Lacesso I hear Jens only.
[10:27:49] Brian Davies or each other?
[10:28:41] Winnie Lacesso CMS filled up storage
[10:28:56] Winnie Lacesso Then had to remove some of it, & asked GC for DPM "du" which he built!!!!
[10:29:08] Winnie Lacesso CMS says THANKS!!
[10:30:22] Jens Jensen Out of time now - thanks very much everyone
[10:30:37] Jens Jensen Sorry to lose audio to many of you...?
[10:30:42] Winnie Lacesso left
[10:30:45] Jens Jensen See you next week
[10:31:06] Brian Davies left
[10:32:46] John Bland left
[10:33:31] Elena Korolkova left
[10:33:34] Ewan Mac Mahon Bye.
[10:33:42] Matthew Doidge left
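
NOTE ON THE TOKEN LIST

The token list Brian pasted into the chat at 10:07:44 can be summarised per site with a few lines of Python. This is only a rough sketch. The column meanings (site, region, first YES/NO for the script, second YES/NO for the monitoring, token name, SE type) are assumed from the note on action 295. The function names parse_rows and missing_everywhere are made up for illustration and are not part of Greig's tools. Only four rows from the list are included here.

```python
from collections import defaultdict

# Four rows copied from the chat log. Assumed columns:
# site, region, reported by script, shown in monitoring, token, SE type.
ROWS = """\
UKI-SOUTHGRID-RALPP SOUTH NO YES PRODDISK DCACHE
UKI-LT2-RHUL LT2 NO YES DATADISK DPM
UKI-SOUTHGRID-OX-HEP SOUTH NO NO USERDISK DPM
UKI-LT2-QMUL LT2 NO NO DATADISK DPM
"""


def parse_rows(text):
    """Split each six-field line into a dict; skip anything else."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6:
            continue  # not a token row (e.g. a chat timestamp line)
        site, region, script, monitoring, token, se_type = fields
        rows.append({
            "site": site,
            "region": region,
            "in_script": script == "YES",
            "in_monitoring": monitoring == "YES",
            "token": token,
            "se_type": se_type,
        })
    return rows


def missing_everywhere(rows):
    """Map site -> tokens that are NO in both YES/NO columns."""
    missing = defaultdict(list)
    for row in rows:
        if not row["in_script"] and not row["in_monitoring"]:
            missing[row["site"]].append(row["token"])
    return dict(missing)


if __name__ == "__main__":
    for site, tokens in sorted(missing_everywhere(parse_rows(ROWS)).items()):
        print(site, ", ".join(tokens))
```

Pasting the full list into ROWS gives the sites that need chasing first, i.e. those whose tokens show NO in both columns.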