28-R-15 (CERN conferencing service (joining details below))
firstname.lastname@example.org Weekly OSG, EGEE, WLCG infrastructure coordination meeting.
We discuss the weekly running of the production grid infrastructure based on weekly reports from the attendees. The reported issues are discussed, assigned to the relevant teams, followed up and escalated when needed. The meeting is also the forum for the sites to get a summary of the weekly WLCG activities and plans.
OSG operations team
EGEE operations team
EGEE ROC managers
WLCG coordination representatives
WLCG Tier-1 representatives
other site representatives (optional)
To dial in to the conference:
a. Dial +41227676000
b. Enter access code 0148141
OR click HERE (Please specify your name & affiliation in the web-interface)
Report from Italy: List of unresponsive sites (First Ops meeting):
SITE NAME: ru-Moscow-GCRAS-LCG2
ROC NAME: ROC_Russia
GGUS TICKET NUMBER: #45457, #45039, #44739
Reason for escalation: according to https://twiki.cern.ch/twiki/bin/view/EGEE/EGEEROperationalProcedures#7_6_Suspending_a_site, a ROC must suspend a site that has been in downtime for more than one month. The site has been in SD (at risk) almost continuously since 2008-12-24 (https://goc.gridops.org/downtime/list?id=15555346), and the latest SD ends on 2009-04-16 (https://goc.gridops.org/downtime/list?id=20105380).
The GGUS tickets are also not being updated in a timely manner.
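The downtime rule cited above can be illustrated with a minimal sketch. It assumes that "more than one month" can be approximated as more than 30 days and that the period between the two quoted dates is treated as one continuous downtime; the function and variable names are hypothetical and not part of any GOCDB or operations tooling.

```python
from datetime import date, timedelta

# Assumption: "more than one month" is approximated as more than 30 days.
ONE_MONTH = timedelta(days=30)


def exceeds_downtime_limit(start: date, end: date, limit: timedelta = ONE_MONTH) -> bool:
    """Return True if the downtime from start to end is longer than the limit."""
    return (end - start) > limit


# Dates taken from the report above; treated here as one continuous period,
# which is a simplification for illustration only.
first_sd_start = date(2008, 12, 24)
last_sd_end = date(2009, 4, 16)

days_in_downtime = (last_sd_end - first_sd_start).days
print(days_in_downtime, exceeds_downtime_limit(first_sd_start, last_sd_end))
```

Under these assumptions the period is well beyond the one-month limit, which is the basis for the escalation.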
Problems encountered during shift:
Some temporary problems with the COD dashboard on 24 March due to the GOCDB outage.
Because of the GGUS update on 25 March, the COD dashboard was unavailable for a couple of hours.
gLite 3.1 Update 42 was released to production:
BDII: The starting cache size has been reduced from 1 GB to 50 MB.
VDT 1.6.1 Release 9: This version fixes a Globus bug that was causing problems for 32-bit programs using Globus and running on 64-bit machines.
gLite 3.1 lcg-vomscerts-5.4.0: adds the next certificate for lcg-voms.cern.ch.
Now in PPS
Nothing to report.
Soon in Production
Release of gLite 3.1 Update 43 to production in preparation (approx. 6 April):
YAIM clients: to enable configuration of Service Discovery
VOMS: fixes for FQAN order, short FQANs....64-bit version. A dependency on mysql-server has also been added.
SGE: New info dynamic plugin + YAIM utils
Release of gLite 3.1 Update 44 to production in preparation (approx. 14 April):
CREAM CE: Updates to CE + YAIM
WMS: Update to ICE + YAIM
This set of patches includes the versions tested over several weeks in a PPS pilot and is known to fix a number of performance issues previously affecting the ICE --> CREAM submission chain.
<big> EGEE issues coming from ROC reports </big>
None this week.
<big>Grid Service Interventions </big>
SARA: OUTAGE: From 02:00 4 April to 02:00 5 April. Service: dCache SE.
SARA: OUTAGE: From 09:30 30 March to 21:00 30 March. Service: srm.grid.sara.nl.
SARA: OUTAGE: From 15:13 27 March to 02:00 31 March. Service: celisa.grid.sara.nl. Fileserver malfunction.
CERN: At Risk: From 11:00 31 March to 12:00 31 March. Service: VOMS (lcg-voms.cern.ch).
FZK: OUTAGE: From 14:21 30 March to 20:00 30 March. Service: fts-fzk.gridka.de.
INFN-CNAF: OUTAGE: From 02:00 28 March to 19:00 3 April. Service: ENTIRE SITE.
INFN-T1: OUTAGE: From 16:00 27 March to 17:00 3 April. Service: ENTIRE SITE.
NDGF-T1: At risk: From 12:31 27 March to 16:31 30 March. Service: srm.ndgf.org (ATLAS).
NDGF-T1: At risk: From 12:31 27 March to 13:27 31 March. Service: ce01.titan.uio.no.
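Each intervention entry above follows the same pattern (site, type, start, end, service), so durations can be worked out mechanically. The following is a rough sketch only: the regular expression and the function names are hypothetical, the year 2009 is an assumption (the entries give no year), and month names are parsed assuming an English locale.

```python
import re
from datetime import datetime

# Hypothetical pattern for one entry, e.g.
# "SARA: OUTAGE: From 02:00 4 April to 02:00 5 April. Service: dCache SE."
ENTRY_RE = re.compile(
    r"(?P<site>[\w-]+): (?P<kind>OUTAGE|At [Rr]isk): "
    r"From (?P<start>\d{2}:\d{2} \d{1,2} \w+) to (?P<end>\d{2}:\d{2} \d{1,2} \w+)\. "
    r"Service: (?P<service>.+?)\.?$"
)


def parse_intervention(line, year=2009):
    """Parse one entry into a dict; return None if the line does not match.

    The year is an assumption, since the entries themselves omit it.
    """
    match = ENTRY_RE.match(line.strip())
    if match is None:
        return None
    fmt = "%H:%M %d %B %Y"
    start = datetime.strptime(f"{match['start']} {year}", fmt)
    end = datetime.strptime(f"{match['end']} {year}", fmt)
    return {
        "site": match["site"],
        "kind": match["kind"],
        "service": match["service"],
        "start": start,
        "end": end,
        # Duration in hours, computed from naive timestamps (no time zone handling).
        "hours": (end - start).total_seconds() / 3600,
    }


sara = parse_intervention(
    "SARA: OUTAGE: From 02:00 4 April to 02:00 5 April. Service: dCache SE."
)
cnaf = parse_intervention(
    "INFN-CNAF: OUTAGE: From 02:00 28 March to 19:00 3 April. Service: ENTIRE SITE."
)
print(sara["site"], sara["hours"])
print(cnaf["site"], cnaf["hours"])
```

Entries with a trailing remark (such as the fileserver note) would end up with that remark inside the service field, so a real tool would need a more careful pattern.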
As previously announced, it is planned that all remaining gLite 3.0 services will be retired by the end of April. At this point, all support for these services will cease.
All sites should ensure that they are running up-to-date versions of their services. If any site sees a need to keep a gLite 3.0 service in the middleware stack, please submit a GGUS ticket as soon as possible.
<big> Removal of the WLCG-specific section of the meeting</big>
From now on, at the request of WLCG, there will be no WLCG-specific section at this meeting. Note that the WLCG experiments will still take part in the general meeting.
(OSG - Indiana University)
Discussion of open tickets for OSG
This same issue has been discussed for the last two weeks, and Rob had an action to check.
#46647: The ticket is now assigned to Rob. The action required is in the 2009-03-24
Comment today in stalled urgent ATLAS ticket since 2009-03-09 GGUS
Tim and other OSG colleagues,
my understanding from https://savannah.cern.ch/support/index.php?107511#comment3 is that
had you chosen status 'customer' in OIM,
the ggus ticket would have gone to status 'waiting for reply' and the submitter would have been prompted
to react. Please do so now.
#47032: should have been in status 'solved'. Assigned to GGUS dev. for investigation.
#47061: Same as above. It should have been marked 'solved'.
Review of action items (5m)
This week: LHC experiment VOs to perform an ALARM ticket test (full round from opening to ticket closing) to Tier1s.
[savannah ticket #107452] and
[testing rules]. Summary reports must be sent to email@example.com by April 3rd at the latest! (MariaDZ)