WLCG-OSG-EGEE Operations meeting

28-R-15 (CERN conferencing service (joining details below))



John Shade (CERN)
Weekly OSG, EGEE, WLCG infrastructure coordination meeting.
We discuss the weekly running of the production grid infrastructure based on weekly reports from the attendees. The reported issues are discussed, assigned to the relevant teams, followed up, and escalated when needed. The meeting is also the forum for the sites to get a summary of the weekly WLCG activities and plans.
  • OSG operations team
  • EGEE operations team
  • EGEE ROC managers
  • WLCG coordination representatives
  • WLCG Tier-1 representatives
  • other site representatives (optional)
  • GGUS representatives
  • VO representatives
  • To dial in to the conference:
    a. Dial +41227676000
    b. Enter access code 0148141

    AND click HERE
    (Please specify your name & affiliation in the web-interface)

    Click here for minutes of all meetings

    Click here for the List of Actions

      • 4:00 PM - 4:20 PM
        EGEE Items 20m
        • Central Grid-Operator-on-Duty (c-COD) handover
          From France to Central Europe
          Handover Log:
          One ticket (GGUS #51458) for ROC_SE has been open for more than one month. I have sent ROC_SE a reminder to check whether the problem can be solved.

          ROC_AP has two APEL tickets that have been open for more than one month. Work is in progress for MY_MIMOS-GC-01. For TW-FTT, ROC_AP sent a reminder and will escalate to the last step if there is no answer.

          Question about the APEL test: we cannot take a site out of production because of an APEL problem that has been unsolved for more than one month. How should we handle such tickets?

        • PPS Report & Issues
          Please find Issues from EGEE ROCs and general info in:
        • gLite Release News
        • EGEE issues coming from ROC reports
          Only Asia-Pacific hadn't validated their reports by 14:00 (time zone problem?). Thanks for the recent improvements in this area, even though the reports remain very quiet!

          From DECH:

          • FZK [INFO]: On the 14th of October DE-KIT will run a test of the LHCOPN backup link infrastructure. We expect this intervention to be completely transparent. The link test will start at 9:00 am (CEST).
          • FZK [AT RISK]: Planned AT RISK intervention on 20-10-2009, 8:00 - 10:00 UTC. Due to the application of an Oracle patch, GridKa/DE-KIT's LHCb 3D/LFC database is at risk.

          From ROC_LA (Latin America):

          • Thanks from Renato Santana to everybody who helped with the creation of ROC_LA (although it seems to be from last week).
        • Grid Service Interventions
          Link to CIC Portal (broadcasts/news), scheduled downtimes (GOCDB) and CERN IT Status Board
          Please consult the URLs above for details.

        • Miscellaneous 15m
          • Ratification of simplified Intervention Procedures https://edms.cern.ch/document/1032984
          • Sites that have a reliability of less than 50% during three consecutive months will be suspended, and will have to go through the certification process again.
          • Downtimes longer than one month should be exceptional, should be approved beforehand by the corresponding ROC, and this body should be notified.
          • Reminder for sites to move to WMS 3.2 (available in the gLite repository). This must be done by the end of October!
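
The suspension rule listed above (reliability below 50% during three consecutive months) can be illustrated with a minimal sketch. The function name, the constants, and the representation of reliability as fractions between 0 and 1 are assumptions made for illustration only; this is not taken from the procedures document or from any existing monitoring tool.

```python
from typing import Sequence

# Values taken from the rule as stated in the agenda item above.
RELIABILITY_THRESHOLD = 0.50  # "less than 50%"
CONSECUTIVE_MONTHS = 3        # "three consecutive months"


def should_suspend(monthly_reliability: Sequence[float]) -> bool:
    """Return True if any three consecutive months fall below the threshold.

    monthly_reliability is a hypothetical input: one reliability value per
    month, in chronological order, expressed as a fraction (0.0 to 1.0).
    """
    run = 0  # length of the current streak of months below the threshold
    for value in monthly_reliability:
        run = run + 1 if value < RELIABILITY_THRESHOLD else 0
        if run >= CONSECUTIVE_MONTHS:
            return True
    return False
```

Under this reading, a month at exactly 50% does not count towards the streak, since the rule says "less than 50%"; how boundary cases are actually handled is for the procedure to define.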
      • 4:30 PM - 4:35 PM
        Review of Action Items 5m
      • 4:35 PM - 4:40 PM
        AOB 5m