EGEE Operations meeting

28-R-15 (CERN conferencing service (joining details below))



Antonio Retico
Weekly EGEE infrastructure coordination meeting.
We discuss the weekly running of the production grid infrastructure, based on weekly reports from the attendees. Reported issues are discussed, assigned to the relevant teams, followed up, and escalated when needed. The meeting is also the forum where sites get a summary of the weekly WLCG activities and plans.
  • EGEE operations team
  • EGEE ROC managers
  • site representatives (optional)
  • GGUS representatives
  • VO representatives
  • To dial in to the conference:
    a. Dial +41227676000
    b. Enter access code 0148141

    AND click HERE
    (Please specify your name & affiliation in the web-interface)

    Click here for minutes of all meetings

    Click here for the List of Actions

      • 4:00 PM - 4:20 PM
        EGEE Items 20m
        • Central Grid-Operator-on-Duty (C-COD) handover
          From Central Europe to Northern Europe
          Handover Log:
          RODs are highly unresponsive to C-COD requests. Even after the third reminder there was no reply. Unresponsive RODs: Canada, CERN, Asia Pacific.
          CERN apologises for the unresponsiveness: the reported tickets are now being handled.

          ROD CANADA
          Two tickets not handled:
          • CA-SCINET-T2 - GGUS 54764 Expiration date 2010-01-21
          • CA-ALBERTA-WESTGRID-T2 - GGUS 54707 Expiration date 2010-01-19
          Four reminders sent. No reply from ROD.
        • Pilot Services Report & Issues
          Info about active pilot services at:

        • gLite Release News
        • EGEE issues coming from ROC reports

          No major issues raised by any ROCs this week

          6 of 14 ROCs had not submitted their report by 2:40.

        • Fixing MPI sites (from the MPI WG) 15m
          Update received from Isabel Campos (MPI Task Force), summarised here.
          Last week (4th Feb) the share of MPI sites passing the SAM tests rose above the 2/3 threshold.
          Today the situation is confirmed (67 out of 94 CEs, about 71%, passing the SAM tests).
          Isabel suggested waiting another week so that the share stays above 2/3 in a more stable way, and producing some graphs of the historical SAM data to show the stability of the service.
          Documentation for MPI Support in EGEE:

          More information about errors in the MPI knowledge DB:

        • Instances of out of date services in the grid 15m
          Attached is a list of service instances that are "out-of-date" according to the "list of supported service versions" wiki page.
          This list will be published every fourth week, and the ROCs will be given one week to react (either to upgrade the services at the sites or to explain why not).
          From the current list, please disregard instances reported running version . That is a known issue in the info provider and is currently being fixed.
          The next list will be published by the 22nd of February 2010.
          List of out-of-date instances
      • 4:30 PM - 4:35 PM
        Review of Action Items 5m
      • 4:35 PM - 4:40 PM
        AOB 5m