Deployment team

Europe/London
EVO - GridPP Deployment team meeting


Jeremy Coles
Description
- This is the weekly DTEAM meeting.
- The intention is to run the meeting in EVO: http://evo.caltech.edu/evoGate/. Join the meeting in the GridPP Community area.
- The phone bridge number is +44 (0)161 306 6802 (CERN number +41 22 76 71400). The phone bridge ID is 44709 with code: 4880.
Minutes
    • 11:00 11:20
      Experiment problems/issues 20m
      Review of weekly issues by experiment/VO
      - LHCb
        Dear site managers,
        Just to remind you that it has been agreed some time ago that the SRM name of the xrootd protocol would be xroot (and not root as wrongly used at some sites). Thanks for fixing it in case your site doesn't comply with this rule. In the absence of a standard protocol name, it is very difficult to make tests (note that "root" on Castor is the name for the rootd protocol and not xrootd ;-)
        Thanks! Cheers, Philippe

        To support what Philippe is asking, please check page 32 of the following document (March 2009): https://twiki.cern.ch/twiki/pub/LCG/WLCGCommonComputingReadinessChallenges/WLCG_GlueSchemaUsage-1.8.pdf
        Please note that the control *endpoint* for an xroot storage system can be of the form:
        [x]root://[user@]host[:port][,host[:port]…]/
        as specified on page 31. However, the protocol type must be "xroot" in the case of an xroot storage system.
        Flavia
      - CMS
      - ATLAS
      - Other
      -- camont: I'm going to be submitting Grid jobs to process 10k+ PDF files for our iLexIR friends over the next week or two. I don't think that the data-transfer activity will be very high, but if it ever reaches a level that causes problems please let me know.
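      As a rough illustration of the LHCb xroot naming point above, the following Python sketch checks a published protocol type and control endpoint against the form quoted from the document. The function names, the regular expression and the example hostnames are hypothetical and not part of any grid middleware; allowing a path after the trailing "/" is also an assumption.

```python
import re

# Hypothetical pattern for one host with an optional port, e.g. "se.example.org:1094".
HOSTPORT = r"[A-Za-z0-9.-]+(?::\d+)?"

# Assumed reading of the quoted form: [x]root://[user@]host[:port][,host[:port]...]/
ENDPOINT_RE = re.compile(
    rf"^x?root://(?:[^@/\s]+@)?{HOSTPORT}(?:,{HOSTPORT})*/"
)


def is_xroot_endpoint(url: str) -> bool:
    """Return True if url starts with the quoted endpoint form (scheme may be root or xroot)."""
    return ENDPOINT_RE.match(url) is not None


def protocol_type_ok(protocol_type: str) -> bool:
    """The published protocol type must be exactly "xroot", whatever the endpoint scheme."""
    return protocol_type == "xroot"


if __name__ == "__main__":
    # Example values only; these hosts do not refer to real sites.
    print(is_xroot_endpoint("root://user@se1.example.org,se2.example.org:1094/"))
    print(protocol_type_ok("root"))
```

      The two checks are kept separate because the endpoint may legitimately begin with either root:// or xroot://, while the protocol type itself has only one accepted value.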
    • 11:20 11:30
      ROC update 10m
      ROC update ***************
      - Update from on-duty
      -- RHUL suspended while in extended downtime
      -- Testing of Nagios based dashboard
      From the EGEE ops meeting: http://indico.cern.ch/conferenceDisplay.py?confId=86070
      - Main thing to note is switch to Nagios.
      From the site reports: These have been stopped!
      Tier-1 update ***************
      WLCG update *****************
      - Next GDB is in Amsterdam on 24th March: http://indico.cern.ch/conferenceDisplay.py?confId=84636. Need to agree schedule: http://www.gridpp.ac.uk/wiki/GDB_reports
      Ticket status ***************
      https://gus.fzk.de/download/escalationreports/roc/html/20100222_EscalationReport_ROCs.html
      50491 - on hold. CMS transfers IC-RHUL. Probably jumbo frames issue. Opened in July 09*******.
      53349 - on hold. Bristol. Publishing vast amount of storage. Opened in November*****.
      53598 - ATLAS T1. On hold. Channel load change request. wait for data to test?***
      53834 - on hold. ECDF old CE. Waiting on second (new) CE?*
    • 11:30 11:40
      APEL status check 10m
      - APEL is back: "After our integrity checks, we can guarantee that 99.8% of the data have been restored, and we have no reason to think that the remaining 0.2% have been lost. We however advise all sites to check their data on the accounting portal and report any inconsistencies through a GGUS ticket."
      Updates can be tracked via http://goc.grid.sinica.edu.tw/gocwiki/ApelIssues-Jan_Feb_2010
      For GridPP sites: http://www3.egee.cesga.es/gridsite/accounting/CESGA/egee_view.php
      RAL Tier-1 has a gap in the last week of January; Oxford and Cambridge are missing that week plus the start of February; Manchester is missing a large section; and Sheffield, Durham and ECDF seem to have stopped in January.
    • 11:40 11:48
      Priorities & availability 8m
      - Summary of AOD discussion (plans)
      - https://twiki.cern.ch/twiki/bin/viewfile/LCG/SamMbReports?filename=Tier 2_Reliab_201001.pdf
      The (reliability:availability) figures are: London (96%:95%); NorthGrid (91%:91%); ScotGrid (97%:97%) and SouthGrid (92%:72%). The SouthGrid figures are due to Oxford cooling problems (late December until mid-January) and RAL-PPD, which has had several outages due to building power supply interventions and a reconfiguration of dCache nodes.
    • 11:48 11:58
      CREAM/SCAS/glexec status 10m
      -
    • 11:58 12:03
      Actions 5m
      See http://www.gridpp.ac.uk/wiki/Deployment_Team_Action_items
    • 12:03 12:04
      AOB 1m
      - Security
      - Topics for GridPP24