28-R-15 (CERN conferencing service (joining details below))
firstname.lastname@example.org Weekly OSG, EGEE, WLCG infrastructure coordination meeting.
We discuss the weekly running of the production grid infrastructure based on weekly reports from the attendees. The reported issues are discussed, assigned to the relevant teams, followed up and escalated when needed. The meeting is also the forum for the sites to get a summary of the weekly WLCG activities and plans.
OSG operations team
EGEE operations team
EGEE ROC managers
WLCG coordination representatives
WLCG Tier-1 representatives
other site representatives (optional)
To dial in to the conference:
a. Dial +41227676000
b. Enter access code 0140768
OR click HERE (Please specify your name & affiliation in the web-interface)
UKI: No data available in ROC (or site) report(s) for the failures from SAM framework section.
<big> gLite 3.1 update 33, BDII</big> 10m
Details on the changes of gLite 3.1 update 33 for the BDII
The status of gLite 3.1 Update 33 is as follows:
The glite-BDII (top-level BDII) meta-rpm for Update 33 was removed
on Friday. At the same time the previous meta-rpm was changed to
require exactly the previous version (3.9.1-5) of the bdii rpm.
Sites that already upgraded their top-level BDIIs before these
changes may want to downgrade (but see below).
Resource and site BDIIs were not seen to display the instabilities
described in Savannah bug #42727, therefore the meta-rpms for other
node types have not been changed.
The top-level BDII instability is being looked into with high priority.
The "chown" problem reported by Michel Jouvin does not affect sites
that use YAIM for their configurations. A fix for this problem has
been coded and a new bdii version is being certified. It is expected
to be released to the production system this week.
<big>gLite 3.0 services to be obsoleted</big> 5m
An announcement for this retirement is already on the gLite 3.0 page:
This corresponds to the procedure (until we have a new one) that was discussed in the ops meeting in Feb 08: https://twiki.cern.ch/twiki/bin/view/EGEE/WlcgOsgEgeeOpsMinutes2008x02x25#Support_for_gLite_3_0_services
PLEASE LET US KNOW OF ANY OBJECTIONS BY NEXT WEEK!
<big> Proposed process for removing SA1 support for old gLite services </big>
Attached is a proposed process for removing support for obsolete gLite services and out-of-date versions of services. Please read and comment as soon as possible.
<big> WLCG issues coming from ROC reports </big>
France: TEAM/ALARM tickets for T1s: how do LHC experiments choose between these two types of tickets?
-- ALARM tickets are for problems concerning T0 (mainly problems at a T1 blocking data acceptance from T0).
-- TEAM tickets are for all other problems of importance (mainly T1<->T2 transfers for the moment). Currently in discussion: if the problem is not acknowledged by the site before 2PM the following day, then an ALARM ticket is sent.
Could CMS, ALICE and LHCb clarify when they use each type of ticket?
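The escalation rule under discussion (a TEAM ticket not acknowledged before 2PM the following day leads to an ALARM ticket) can be sketched as follows. This is only an illustration of the rule as worded above, not an agreed procedure or any GGUS interface: the function names, the use of naive local timestamps and the treatment of an acknowledgement at exactly 2PM as too late are all assumptions.

```python
from datetime import datetime, time, timedelta
from typing import Optional

# Assumption: the cut-off is 14:00 local time on the day after the TEAM ticket was opened.
ACK_CUTOFF = time(14, 0)


def alarm_deadline(team_ticket_opened: datetime) -> datetime:
    """Return the moment at which an unacknowledged TEAM ticket would be escalated."""
    next_day = team_ticket_opened.date() + timedelta(days=1)
    return datetime.combine(next_day, ACK_CUTOFF)


def should_send_alarm(
    team_ticket_opened: datetime,
    acknowledged_at: Optional[datetime],
    now: datetime,
) -> bool:
    """Return True if, under the discussed rule, an ALARM ticket is due at time `now`."""
    deadline = alarm_deadline(team_ticket_opened)
    # An acknowledgement strictly before the deadline stops the escalation.
    if acknowledged_at is not None and acknowledged_at < deadline:
        return False
    # Otherwise the ALARM ticket is due once the deadline has been reached.
    return now >= deadline
```

For a TEAM ticket opened at 16:30 on a Monday, `alarm_deadline` gives 14:00 on Tuesday; the site has until then to acknowledge the ticket.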
<big>Status of the WMS for Alice</big> 15m
Alice wants to fully replace the RBs and only use the WMS in production at all sites.
In Alice's computing model it is recommended (not mandatory) that sites provide a local WMS, though they understand that for some T2 sites this can be very difficult.
Alice would like to request that T1 sites, and in general all sites providing RBs to Alice, migrate to the WMS.
In particular, the first target sites are NIKHEF and CCIN2P3.
NIKHEF: is providing 2 RBs but no WMS yet.
IN2P3: no WMS there supporting Alice. In France there are only 2, at T2 sites: datagrid.cea.fr and lal.in2p3.fr. They would like to request IN2P3 to also provide one.
<big> CREAM CE for Alice (& PPS pilot service) </big>
Alice would like to start using the CREAM CE in production. To do this, Alice has the following requirements on sites:
Keep current LCG CE and install CREAM CE on another box.
Install a 2nd VOBox pointing to the CREAM CE. The VOBox can be in a virtual machine if the site is short of boxes.
Point the CREAM CE to the standard Alice production queue.
Need a GridFTP server somewhere on the site.
This request also presents another opportunity: any sites that wish to support Alice with the CREAM CE could also support the testing of the new ICE-enabled WMS, simply by installing the latest version of the CREAM CE (available in the PPS repositories) rather than the version currently in the production repositories. Sites wishing to do this would also need to configure CMS as a VO on their site; no other action is needed on the part of the site.
Any sites that are interested should contact email@example.com. Installation instructions for the CREAM CE will be provided.
Alice would like to ask that all LCG tier-1s (which support the Alice VO) contribute to this task. Alice would also like to invite as many tier-2 sites as possible to join in.
<big>WLCG Service Interventions (with dates / times where known) </big>
The site is LPNHE (part of GRIF):
it is in downtime,
but no RSS feed has been sent about it.
This could help the CIC people tune the RSS feed, which is how the experiments retrieve information about downtimes.
<big> CMS report </big>
<big> LHCb report </big>
Any comments from sites concerning last week's request about the gridmap file for LHCb? If not, I will proceed by formulating an EGEE broadcast asking all sites to implement this "safe" mapping in case of VOMS mapping failure.
EGEE downtime announcement procedure:
1. Announcement of scheduled downtime: mail "Announcement" at least 24h in advance, as in the MoU.
2. Start of downtime (scheduled and unscheduled): mail "Start" at the time the downtime starts (with correct time!)
3. End of downtime: mail "End" (with correct time)
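As a rough illustration of the three steps above, the sketch below checks the 24h notice rule and lists which mails are expected for a given downtime. It is not an existing EGEE or CIC tool; the function names are hypothetical, and timestamps are assumed to be naive datetimes in a single time zone.

```python
from datetime import datetime, timedelta
from typing import List

# From step 1: scheduled downtimes are announced at least 24h in advance.
MIN_NOTICE = timedelta(hours=24)


def is_announced_in_time(announcement_sent: datetime, downtime_start: datetime) -> bool:
    """Return True if the "Announcement" mail was sent at least 24h before the start."""
    return downtime_start - announcement_sent >= MIN_NOTICE


def required_mails(scheduled: bool) -> List[str]:
    """Return the mails expected for a downtime, in the order they are sent."""
    mails = ["Start", "End"]  # sent for scheduled and unscheduled downtimes
    if scheduled:
        mails.insert(0, "Announcement")  # only scheduled downtimes are announced ahead
    return mails
```

An unscheduled downtime therefore yields only the "Start" and "End" mails, since by definition it cannot be announced 24h ahead.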
(From Philippe) In the last couple of days we have been receiving update notifications from GGUS for tickets that, according to the web page, were not updated at all (e.g. #41707: the last update was on October 3rd, but we have also received mails recently). Why does this happen?
<big> Storage services: Recommended base versions </big>