WLCG-OSG-EGEE Operations meeting

Europe/Zurich
28-R-15 (CERN conferencing service (joining details below))

Nick Thackray
Description
grid-operations-meeting@cern.ch
Weekly OSG, EGEE, WLCG infrastructure coordination meeting.
We discuss the weekly running of the production grid infrastructure based on weekly reports from the attendees. The reported issues are discussed, assigned to the relevant teams, followed up and escalated when needed. The meeting is also the forum for the sites to get a summary of the weekly WLCG activities and plans.
Attendees:
  • OSG operations team
  • EGEE operations team
  • EGEE ROC managers
  • WLCG coordination representatives
  • WLCG Tier-1 representatives
  • other site representatives (optional)
  • GGUS representatives
  • VO representatives
  • To dial in to the conference:
    a. Dial +41227676000
    b. Enter access code 0157610

    NB: Reports were not received in advance of the meeting from:

  • ROCs: All reports received.
  • VOs: CMS, LHCb, ALICE
  • list of actions
    Minutes
      • 16:00 16:05
        Feedback on last meeting's minutes 5m
      • 16:01 16:30
        EGEE Items 29m
        • Grid-Operator-on-Duty handover
          From: ROC CERN / ROC UKI
          To: ROC Russia / ROC SE Europe


          NB: Please can the grid ops-on-duty teams submit their reports no later than 12:00 UTC (14:00 Swiss local time).

          Issues:
          1. GGUS#24591 (TAU-LCG2): again no update from the site.
        • Use of the 'DAG' repository
          • A number of our external dependencies can be satisfied from the 'DAG' repository, a third-party repository containing many extras for SL4/RHEL4. It is even enabled by default on SLC4. The question is: is it acceptable to require administrators to point their installations to it?
        • Future plans for LCG CE, gLite CE and CREAM CE
            The EGEE PMB recently made the decision that the gLite CE is not of production quality and will be deprecated (in the near future). Therefore sites should put no effort into this service. Development effort will go into the CREAM CE and the LCG CE to make them production ready on SL4 with VDT 1.6.
            AMENDMENT: There has been no decision by the EGEE PMB regarding the future of the CEs in the EGEE grid. The PMB had requested that the CREAM and gLite CEs be tested until the end of July so that a decision could be made as to which CE will be developed further for production. This testing is now complete and the decision is pending. In the meantime, it should be noted that the version of the gLite CE in the repositories is not suitable for deployment in production (as stated in the release notes).
        • Reminder: Support for gLite3.0 services is limited to critical and high priority patches
          • This includes the SL4-compat WN.
        • gLite Node Status for SL4 Release 5m
          SA3 is now maintaining a simple report of node type against readiness for SL4. This has been requested in the past.
          https://twiki.cern.ch/twiki/bin/view/EGEE/Glite31NodeTracker
        • Update on YAIM config tool
          • Data schema changes on the CIC portal side are under way; the estimated time to completion is 1-2 weeks.
          • In parallel, the implied changes in the portal code (VO registration and VO ID card update forms) will be implemented within the same deadline.
          • These changes have been discussed and agreed; they also answer requirements from the VO managers group. See my mail on this topic for more details.
          • As the issues above are the most important, I have not yet started working with Dimitar on integrating the web part of the tool. This should start next week.
          • All in all, the estimated deadline for the whole tool to be operational and in production is 2-3 weeks.
        • Migration to SL4 WNs
          • REMINDER THAT THE TIER-1 SITES ARE REQUESTED TO MIGRATE BY THE END OF AUGUST.
        • PPS Report & Issues
          PPS reports were not received from these ROCs:
          AP, IT, SEE, SWE

          Issues from EGEE ROCs:
          1. Submission of SAM tests from CYFRONET: There is a problem with pps-wms-cern.ch, which affected the SAM UI at Cyfronet. A ticket has been submitted: #25400. Currently the Cyfronet SAM UI is configured to use glite-rb-01.cnaf.infn.it. Both PPS SAM UIs are using the same WMS for now. This is temporary and will be changed for better redundancy. (CE ROC)
        • EGEE issues coming from ROC reports
          1. Central Europe ROC: gLite for ia64 (Itanium) machines. Two sites in CE are using LCG-2.7.0 because it is the last version of the EGEE middleware that runs on ia64. What should they do when LCG-2.7.0 is phased out?


          2. Central Europe ROC: GGUS ticket processing problem: https://gus.fzk.de/pages/ticket_details.php?ticket=24553 The ticket was assigned to the wrong support unit, then marked solved by that support unit, reopened by the submitter 2 weeks ago, and has been stuck in this state since. Could we include some kind of feedback from submitters in GGUS ticket processing? This feedback could be used to assess the quality of the support process.
      • 16:30 17:00
        WLCG Items 30m
        • Tier 1 reports
        • WLCG issues coming from ROC reports
          1. Italian ROC: FTS2 MIGRATION: We have to schedule the migration to FTS2. August is not the best month for this kind of operation; the CNAF computing center will also be down on 28th of August for urgent maintenance work on the cooling system. We would like to know the LHC VOs' activity plans for August/September and whether they absolutely need FTS2 before the second week of September.


        • WLCG Service Interventions (with dates / times where known)
          Link to CIC Portal (broadcasts/news), scheduled downtimes (GOCDB) and CERN IT Status Board

          See also this weekly summary of past / upcoming interventions at WLCG Tier0 and Tier1 sites (extracted manually from EGEE broadcasts and other sources).

          Time at WLCG T0 and T1 sites.

          1. SCHEDULED DOWNTIME AT CNAF COMPUTING CENTER: The CNAF computing center will be down on 28th of August for urgent maintenance work on the cooling system. A detailed plan of the downtime will be sent in the next few days.
        • FTS service review
            Please read the report linked to the agenda.
          Speaker: Gavin McCance (CERN)
        • ATLAS service
          See also https://twiki.cern.ch/twiki/bin/view/Atlas/TierZero20071 and https://twiki.cern.ch/twiki/bin/view/Atlas/ComputingOperations for more information.

          • Last week ATLAS reported that a number of US T2 sites were not present in the FTS, which was attributed to them not being present in the WLCG BDII. Contact with the OSG GOC resulted in the following statement from the GOC on Friday: "ATLAS is not required to advertise to the WLCG BDII. He asked that the MWT2 sites not report. I agree the file that Steve references below is stale; I will bring this up at the ITB meeting this afternoon." (OSG GOC) http://vors.grid.iu.edu/cgi-bin/show_ldap_info.cgi
          • ATLAS production requires uuencode/uudecode and bc from the sharutils rpm on the WNs. On SLC3 they were installed by default; on SLC4 they no longer are. This requirement is already registered in the CIC portal under ATLAS OtherRequirements.
          Speaker: Kors Bos (CERN / NIKHEF)
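
          For sites wanting to verify the ATLAS tool requirement on a worker node, a minimal sketch follows. It only checks that the three commands named in the requirement are found on the PATH. The function name `missing_tools` and its `which` parameter are illustrative assumptions, not part of any ATLAS or gLite tooling, and the script assumes a Python 3 interpreter is available on the node.

```python
import shutil

# Commands named in the ATLAS requirement (uuencode/uudecode and bc).
REQUIRED_TOOLS = ("uuencode", "uudecode", "bc")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the subset of `tools` that cannot be located on the PATH.

    `which` is injectable so the check can be exercised without
    depending on what is installed on the current machine.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing on this node: " + ", ".join(missing))
    else:
        print("All required tools found.")
```

          Run on a worker node, the script lists any of the three commands that are absent; installing the corresponding packages is left to the site's usual configuration procedure.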
        • CMS service
          • No report.
          Speaker: Mr Daniele Bonacorsi (CNAF-INFN BOLOGNA, ITALY)
        • LHCb service
          • No report.
          Speaker: Dr Roberto Santinelli (CERN/IT/GD)
        • ALICE service
          • No report.
          Speaker: Dr Patricia Mendez Lorenzo (CERN IT/GD)
        • Service Challenge Coordination
          Speaker: Harry Renshall / Jamie Shiers
      • 16:55 17:00
        OSG Items 5m
        1. Item 1
      • 17:00 17:05
        Review of action items 5m
      • 17:10 17:15
        AOB 5m