Deployment team

Europe/Zurich
EVO - GridPP Deployment team meeting

Jeremy Coles
Description
- This is the weekly DTEAM meeting.
- The intention is to run the meeting in EVO: http://evo.caltech.edu/evoGate/. Join the meeting in the GridPP Community area.
Minutes
    • 11:00 11:15
      Experiment problems/issues 15m
      Review of weekly issues by experiment/VO: LHCb, CMS, ATLAS, other
    • 11:15 11:20
      Infrastructure changes & problems 5m
      "Following discussion with Steve, and PMB where John wanted to know why the latest CE at RAL wasn't being used in monitoring, we suggest that we need to have regular PMB update (from yourself) on CE, RB and SRM changes during the week - most recent example was the CE change at RAL but there will be others... Could you discuss/raise this as a standing item at DTeam meetings - sites must inform you if these locations change. This is independent of those services going down. Info. should flow to Steve and John, at least, for the SAM/accounting monitoring.." OR is this just introducing a manual mechanism to cover a lack of flexibility in the current grid middleware? How can we do this better?? Accounting: Oxford; Glasgow; IC-LeSC; UCL-Central? ATLAS tests: Made more sense yesterday! SL SAM (ops) summary: UCL-HEP; Manchester; Oxford SAM (ops) recent: Errors - Oxford, Manchester and RHUL.
    • 11:20 11:35
      ROC update 15m
      EGEE07
      *********
      - Graeme put a quick summary in the ScotGrid blog: http://scotgrid.blogspot.com/
      - Mingchao has information on the security discussions.

      Things to note/discuss:
      - YAIM 4 is coming. Discussed in the context of support (http://indico.cern.ch/sessionDisplay.py?sessionId=13&slotId=0&confId=18714#2007-10-02)
      - gLite WMS is out in production.
      - Lots of views exchanged on source RPMs. Information given about the build process for gLite (look at http://indico.cern.ch/sessionDisplay.py?sessionId=8&slotId=0&confId=18714#2007-10-01)
      - No decision made about the future direction(s) of the PPS, but it has been shown to be catching bugs.
      - A proposal for an EGI is being built upon, with use cases and analysis taking place over the next 4-6 months. Malcolm Atkinson is the UK representative.
      - ISSeG training and OSCT training (any feedback?)
      - Site monitoring: talk on prototype grid service monitoring and new GridMaps. Views on Nagios.
      - Progress on SLAs discussed (actually SLDs)
      - New tools for VO managers and issues they face
      - SA1 & JRA1 meeting: things to improve inc. load balancing, WMS monitoring, logging, uniformity of interfaces and admin management (comments from Alessandra?).
      - Some details about the CREAM CE

      Ticket status
      ***************

      EGEE ops meeting items of interest
      ******************************************
      * gLite 3.0.2 PPS Update 40 released to PPS. This release contains:
        * R-GMA fixes (Bug #17323)
        * APEL Update (glite-apel_R_2_0_17)
        * YAIM 4.0.0 for the 3.0 repository
        * lcg-vomscerts-4.6.0 adds cert for US-ATLAS server (Synch to production)
        * Addition of lcg-version to WN and UI
        * Fix to avoid LB client crash when unknown events are returned by server
        * Re-branded GIP that includes improved LDIF parsing
      * Missing RPM for stand-alone LB: The glite-lb-client RPM is not installed by the glite-LB metapackage, although it IS installed by the glite-WMSLB metapackage. Therefore, any site running a stand-alone LB should, when applying the latest update, manually download and install the glite-lb-client RPM from the gLite repository. This package contains the glite-lb-purge.cron cron job used to clean up the MySQL database.
      * SAM unavailability: from 02.10.2007 16:30 to 03.10.2007 12:00. Reason: database problem. Issue understood and resolved [ROC CERN]. This will have an impact on availability if a site was failing just before the outage.
      * France asked about any plans to stop using grid-mapfile on grid nodes. None available.
      * Removal of WMS Network Server: glite-job-submit is no longer a valid command! Many months ago the EGEE TCG decided that the Network Server would be obsoleted in the gLite 3.1 WMS. A consequence of this is that the glite-job-submit commands no longer work. Instead, glite-wms-job-submit should be used.
      * PIC reported: We have corrected, as suggested by ATLAS, the information published in the VOView. The problem was that the dynamic-scheduler was being configured to map special groups to FQANs, while publishing of these FQANs was turned off in the machine. As reported by Jeff Templon, the special groups have been configured to be invisible. We have turned them back on, by configuring the dynamic scheduler (by hand) to map all VO special groups to the generic VO.

      Some UK sites still appear as problematic: http://voatlas01.cern.ch/atlas/data/VOViewProblem.log
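
As an illustration of the glite-job-submit removal noted above, the following is a minimal sketch of how a site or user might look for submission scripts that still call the obsolete command. It is not an official gLite tool: the function name `find_obsolete_calls` and the sample script are hypothetical, and only the single rename stated in these minutes is assumed. The sketch reports matching lines rather than rewriting them, since the options accepted by the two commands may differ.

```python
import re

# Assumption: only the rename stated in the minutes is covered here.
OBSOLETE = "glite-job-submit"
REPLACEMENT = "glite-wms-job-submit"

# Match the obsolete name only as a whole command word, so that
# longer names containing hyphens or word characters are not flagged.
_PATTERN = re.compile(r"(?<![\w-])" + re.escape(OBSOLETE) + r"(?![\w-])")


def find_obsolete_calls(script_text):
    """Return (line_number, line) pairs that invoke the obsolete command."""
    hits = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        if _PATTERN.search(line):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    # Hypothetical script content, for illustration only.
    sample = "#!/bin/sh\nglite-job-submit test.jdl\nglite-wms-job-submit test.jdl\n"
    for number, line in find_obsolete_calls(sample):
        print(f"line {number}: '{line}' should use {REPLACEMENT}")
```

Each reported line would then need checking by hand before switching to glite-wms-job-submit.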
    • 11:35 11:45
      What should data management & site testing focus on now? 10m
      - Graeme led the early work (http://www.gridpp.ac.uk/wiki/Service_Challenge_Transfer_Tests), which has been taken over by Andrew.
      - What would be useful for the sites in this area?
    • 11:45 11:55
      Preparing for the HEPSYSMAN meeting / monitoring workshop 10m
      - Ideas for the agenda
      - Current planning
      - Registration?
    • 11:55 12:05
      glexec and pilot jobs 10m
      - Our position ahead of the GDB discussion
      - https://edms.cern.ch/document/855383/1

      Note there are some other policy documents going through:
      - Site Operations Policy: https://edms.cern.ch/document/726129
      - Grid Security Policy (now approved): https://edms.cern.ch/document/428008/4
      - VO Operations Policy (GDB tomorrow): https://edms.cern.ch/document/853968/1

      JSPG meetings are here: http://indico.cern.ch/categoryDisplay.py?categId=68
      JSPG website is here: http://proj-lcg-security.web.cern.ch/
      Latest policy docs links: http://proj-lcg-security.web.cern.ch/proj-lcg-security/documents.html
    • 12:05 12:10
      AOB 5m
      - VO specific SAM tests: https://twiki.cern.ch/twiki/bin/view/LCG/SAMVOSpecificTests
      - New view of resources: http://gridmap.cern.ch/gm/
      - Tier-2 quarterly reports are due *this* week
      - There is a pre-GDB today (http://indico.cern.ch/conferenceDisplay.py?confId=9810) and a GDB on Wednesday (http://indico.cern.ch/conferenceDisplay.py?confId=8488)
      - Actions: http://www.gridpp.ac.uk/wiki/Deployment_Team_Action_items