Deployment team

Europe/Zurich
EVO - GridPP Deployment team meeting

Jeremy Coles
Description
- This is the weekly DTEAM meeting.
- The intention is to run the meeting in EVO: http://evo.caltech.edu/evoGate/. Join the meeting in the GridPP Community area.
- The phone bridge number is +41 22 76 71400. The phone bridge ID is 353540 with code: 4880.
Minutes
    • 11:00 11:20
      Experiment problems/issues 20m
      Review of weekly issues by experiment/VO
      - LHCb
      - CMS
      - ATLAS
      - Other
        -- UKQCD (following up with Steve T on resource accounting)
        -- File persistency discussion
    • 11:20 11:35
      COD work 15m
      - Outcomes of Lyon meeting (review of meeting content and discussion): http://indico.cern.ch/conferenceDisplay.py?confId=34516
      - Training (what is needed/available)
      - Tier-2 plans to take on this task
      - Grid Ireland involvement
      - Timetable for task engagement
    • 11:35 11:45
      ROC update 10m
      ops update ***************
      - A new dedicated instance of the SAM database for the PPS has been set up (fixes an emerging incompatibility).
      - CREAM CEs are now at the CNAF and FZK PPS sites. Publishing these into the production BDII and involving the experiments starts from today. [More information and links: https://twiki.cern.ch/twiki/bin/view/LCG/OpsMeetingPps]
      - PPS (3.1) update 31 will be available soon. It includes glexec installation and configuration patches. 3.0 update 50 went to PPS on 23rd June and contains a gLite-FTA fix.
      - Do we have experience with multi-valued LCG_GFAL_INFOSYS? (CE) Suggestion for extending the SAM RM test timeout.
      - CE noted that GGUS:37754 indicates that SE downtime is correctly handled by SAM but not in the GridView visualisation.
      - DECH: Do generic Linux or Torque/Maui configurations or tools exist to prevent use of site resources for DoS (e.g. a memory fork bomb on a gLite-WN)?
      - IN2P3? suffered from >30,000 ATLAS jobs hitting the site. Unclear whether this was internal submission and whether jobs were sucked in or targeted. [Aside: Job load spikes seem to be becoming a regular Grid feature!]
      - T1s have now deployed FTM.
      - The IN2P3 gsidcap file access problem (GGUS:36625) was traced to a screw-up in the global GSI environment with multiple connections to the same gsidcap door.
      - Reminder of baseline storage versions: https://twiki.cern.ch/twiki/bin/view/LCG/GSSDCCRCBaseVersions
      - EGI workshop taking place at CERN: http://www.eu-egi.eu/workshop/jun08
      WLCG update *****************
      - There is a GDB next week: http://indico.cern.ch/conferenceDisplay.py?confId=20231
      Ticket status ***************
      - https://gus.fzk.de/download/escalationreports/roc/html/20080630_EscalationReport_ROCs.html
    • 11:45 11:55
      Quarterly reports 10m
      - The "new" template.
      - Reports are expected within 2 weeks of the end of the quarter.
      - Reports now include metrics for completion.
      - The resource figures (including disk usage) must be completed, since we still do not trust the published information [can we get the history now?].
      Q108 T2 resource delivery
      Quarterly report template
    • 11:55 12:05
      Actions review 10m
    • 12:05 12:10
      Topics to revisit 5m
      - gstat publishing. A small group is being formed.
      - Wiki/web page updates (see for example http://www.gridpp.ac.uk/deployment/contact.html). Admin task!
      - Completion of the GridPP-NGS site status information in http://www.gridpp.ac.uk/wiki/Working_with_NGS
      - Regional Nagios monitoring (ScotGrid have progressed; who else is moving forward with it?)
      - Collecting site queue/fairshare information
      - Reminder for sites to add comments to http://www.gridpp.ac.uk/wiki/SAM_availability:_October_2007_-_May_2008
      - Look at the Site Readiness Review reports
      - Comment on the EGEE SLDs (feedback so far from SouthGrid and NorthGrid. London: David was happy, but what about others in the T2?)
      - "We need to audit T2 sites to understand how many concurrent transfers each can cope with. This requires details of how many servers are available and how the pools are allocated between the VOs."
      - 080630: The first public version of the Operations Automation Strategy (MSA1.1) is now in EDMS at https://edms.cern.ch/document/927171/1
    • 12:10 12:15
      AOB 5m