Deployment team

Europe/Zurich
EVO - GridPP Deployment team meeting

Jeremy Coles
Description
- This is the weekly DTEAM meeting.
- The intention is to run the meeting in EVO: http://evo.caltech.edu/evoGate/. Join the meeting in the GridPP Community area. The phone bridge number is +41 22 76 71400, the meeting ID is 124243 and the passcode is 4880.
    • 1
      Experiment problems/issues
      Review of weekly issues by experiment/VO:
      - LHCb
      - CMS
      - ATLAS
      - Other
        -- supernemo are still not enabled at very many sites. This was discussed at yesterday's PMB. The PMB would like to strongly encourage sites to support the work of supernemo. Please could the T2Cs pass this message on to all sites.
    • 2
      Updates
      ROC manager update
      *************************
      The meeting on 27th November was cancelled. The next meeting is on 11th December. Only the standing items are currently on the agenda (http://indico.cern.ch/conferenceDisplay.py?confId=23751). Do we have anything to put forward?

      Ticket status
      ***************
      There are several urgent tickets waiting:
      - Ticket: 2205, Priority: urgent, Status: in progress, Title: LFC failure on gw-1.ccc.ucl.ac.uk (UKI-LT2-UCL-CENTRAL)
      - Ticket: 2206, Priority: urgent, Status: in progress, Title: LFC failure on pc91.hep.ucl.ac.uk (UKI-LT2-UCL-HEP)
      - Ticket: 2229, Priority: urgent, Status: waiting for reply, Title: Problems retrieving atlas data from UKI-LT2-UCL-CENTRAL
      - Ticket: 2241, Priority: urgent, Status: in progress, Title: Ancient LFC plugin needs upgraded at UKI-LT2-QMUL

      Ops meeting update
      *********************
      Yesterday's meeting (http://indico.cern.ch/conferenceDisplay.py?confId=23801) failed to go ahead because of problems with the telephone system. Points to note:
      - gLite3.1.0-PPS-UPDATE10 was released to PPS. Results of the pre-deployment tests are here: http://www.cern.ch/pps/index.php?dir=./release/testreports/
      - Release of gLite3.1 Update07 is expected in production sometime this week and includes:
        # jobWrapper tests - new version with no R-GMA dependencies
        # glite-VOMS_mysql metapackage for gLite 3.1 and SL(C)4
        # glite-VOMS_oracle metapackage for gLite 3.1 and SL(C)4
        # Bug fixes for UI and WN
      - Looking for a user community interested in trying out the newly released postgres-based version of AMGA, and for a site to do pre-deployment tests of AMGA. Anyone interested?
      - Some SGE accounting issues have arisen in Germany and may be of relevance:
        https://gus.fzk.de/ws/ticket_info.php?ticket=29426
        https://gus.fzk.de/ws/ticket_info.php?ticket=29550
      - Of interest... "CERN-PROD: Submission storm due to WMS bug. affecting CMS. This started on Tuesday evening went on until Thursday evening, and overloaded both the batch system and the CEs hosting the jobs. Due to this CERN hosted more than 30k GRID jobs for quite some time, and we passed a limit on the maximum number of jobs allowed in the batch system. This limit was increased from 50k to 75k to allow new submissions."
      - [Russia] It seems that some users try to submit jobs to sites bypassing the RB/WMS system, directly using CE job submission APIs or globus tools. What should we do about this (i.e. don't care, encourage, prohibit in some way)?
      - BNL-LCG2 saw several problems with Panda monitoring machines crashing.
    • 3
      Site review
      - What's happening with the RBs at RAL? Upgrade? See http://hepwww.ph.qmul.ac.uk/~lloyd/gridpp/rbtest.html. Phenogrid have recently complained about the RAL RBs compared with Glasgow's. What are the main factors behind the difference (load?)
      - The ATLAS tests today look rather RED, no doubt due to a test problem.
      - SAM monthly: The monthly average figures for November are now on the website (http://www.gridpp.ac.uk/wiki/SAM_availability:_Monthly_summary_table). Generally this shows ongoing improvement, and the PMB wanted to congratulate the DTEAM for this result.
      - SAM recent: Sheffield has improved. Cambridge was stable but now appears to oscillate. Several LT2 sites are struggling: UCL (HEP & CENTRAL), QMUL and RHUL.
    • 4
      Security
      - Incident procedure
        -- Request sent to DTEAM but only 1 reply
        -- http://www.gridpp.ac.uk/deployment/security/inchand/index.html
      - Some feedback on parts of the Service Availability Workshop (described by Jamie as one of the best workshops so far) and general discussions at CERN last week.
    • 5
      Team activities
      - Opportunity to let everyone know what is happening in your area!
    • 6
      Actions review
    • 7
      AOB
      - There is a pre-GDB today: http://indico.cern.ch/conferenceDisplay.py?confId=20248
      - There is a GDB at CERN tomorrow: http://indico.cern.ch/conferenceDisplay.py?confId=8508