WLCG-OSG-EGEE Operations meeting
Date/Time: Thursday, 5 October 2006 - 16:00 (Europe/Zurich)
Location: VRVS (plane room) (28-R-15)
Chairperson:
Description: grid-operations-meeting@cern.ch
Weekly OSG, EGEE, WLCG infrastructure coordination meeting.
We discuss the weekly running of the production grid infrastructure based on weekly reports from the attendees. The reported issues are discussed, assigned to the relevant teams, followed up, and escalated when needed. The meeting is also the forum for the sites to get a summary of the weekly WLCG activities and plans.
Attendees:
  • OSG operations team
  • EGEE operations team
  • EGEE ROC managers
  • WLCG coordination representatives
  • WLCG Tier-1 representatives
  • other site representatives (optional)
  • GGUS representatives
  • VO representatives
  • VRVS "plane" room will be available 15:30 until 18:00 CET

    Material: actionlist link minutes link
    Thursday, 5 October 2006 16:00 ->17:25 WLCG-OSG-EGEE Operations Meeting (28-R-15)    

     
     16:00
    Feedback on last meeting's minutes (5')   Minutes link    
     16:05
    EGEE Items (20')    
    • Grid-Operator-on-Duty handover (5')
      From Russia ROC (backup: Italy) to CERN ROC (backup: DECH ROC)
      Tickets:
      Open 55
      Closed 30
      2-mail 13
      Modified 48
      All 146

      Notes:
      1. No information was available on the enabling of SFT for PPS.
      2. The dashboard was very unstable and has not refreshed since Fri, 29 Sep 2006 16:16:19 +0200.
     
    • Job priorities WG (10')
      Summary of the Job Priorities WG recommendations and deployment plans
    Jeff Templon, Dietrich Liko  
    • Move to the new version of FCR (5')
      As part of the migration to the new version of FCR, VOs should be reminded to apply their settings in the new version, as the old one will be phased out by 6 October 2006. This is especially important because of the 'dteam' => 'ops' change (currently most VOs don't have a Critical Test set defined for 'ops').
    • VOs need to check that their settings in the new FCR tool are correct
    • Owners of top-level BDIIs which use FCR need to use the new LDIF
    Judit Novak  
     
    • Summary of the status of the request to allow users to pass arguments to the underlying LRMS (5')
    Alessandra Forti  
    • Savannah bugs to follow up (5')
    • bugs 17738 and 15746 (both GFAL): work will start in around 4 weeks' time. The delay is because SRM 2.2 work has to be completed first. Feedback should be given if this is too long (with justification)
      bug #17738: GFAL info system timeout too low
      bug #15746: GFAL should optimize LDAP queries

    • bug 19878: work is currently due to start at beginning of December. Feedback should be given if this is too long (with justification)
      bug #15878: DNs with "." are not properly handled
     
    • EGEE issues coming from ROC reports (15')
      Reports were not received from these ROCs: AP, SWE, UKI

      1. Item 1 (NE ROC): A major concern for the Netherlands is the possible drop of support for VOMS-enabled Pre-WS GRAM on the gLite-CE. A number of the VOs that we support use Nimrod to submit jobs, which works on Pre-WS (VOMS-enabled) GRAM, at least as long as Globus packages are in their toolkit. Also see the remarks made for the SARA-MATRIX site.


      2. Item 2 (SEE ROC): 1) AEGIS: Yet again, a non-official and invalid SFT, sent to our site by Rafal Lichwala from the SFT Admin Tool on 27-09-2006 10:51, is present in our CIC daily report. While we don't mind having any regular jobs sent to our site through supported VOs, the CIC daily report should not contain such an SFT failure. To make matters worse, this SFT failure is triplicated.


      3. Item 3 (Italy ROC): We were not able to reproduce the errors on the SFT tests for this day, marked as critical (CT), as a 'dteam' user. What is the procedure to test as 'ops'? Should we ask to become members of the 'ops' VO?
        (UKI ROC) The problem of the OPS failure with 3rd party replication is being investigated. It seems this is a very limited problem, affecting only this VO and only lxn1183.cern.ch as a remote SE. As the site has no one in the OPS VO to aid with testing, it is very hard to debug this. We suggest that at least one support person in each ROC be a member of the OPS VO to help sites with problems like this.


     
     16:25
    OSG Items (5')    
    • Item 1 (5')
     
     16:30
    WLCG Items (35')    
    Harry Renshall  
    • WLCG-related issues coming from experiment VOs and Tier-1/Tier-2 reports (15')
      Reports were not received from:
      > Tier-1 sites: Taiwan, FNAL, PIC, TRIUMF, BNL, Sara/NIKHEF
      > VOs: No VO submitted a report

     
     17:05
    Review of action items (15')   actionlist link    
     17:20
    AOB (5')    
    • Item 1: change of day of operations meeting, back to Mondays at 16:00 (5')