28-R-15 (CERN conferencing service (joining details below))
Description
grid-operations-meeting@cern.ch Weekly OSG, EGEE, WLCG infrastructure coordination meeting.
We discuss the weekly running of the production grid infrastructure based on weekly reports from the attendees. The reported issues are discussed, assigned to the relevant teams, followed up and escalated when needed. The meeting is also the forum for the sites to get a summary of the weekly WLCG activities and plans.
Attendees:
OSG operations team
EGEE operations team
EGEE ROC managers
WLCG coordination representatives
WLCG Tier-1 representatives
other site representatives (optional)
GGUS representatives
VO representatives
To dial in to the conference:
a. Dial +41227676000
b. Enter access code 0140768
NB: Reports were not received in advance of the meeting from:
ROCs: AP, Italy
VOs: Alice, Atlas, CMS, LHCb
Recording of the meeting
16:00
→
16:00
Feedback on last meeting's minutes
Minutes
16:01
→
16:30
EGEE Items 29m
<big> Grid-Operator-on-Duty handover </big>
From: DECH/ UKI
To: SWE/ Russia
Issues:
- 7th-8th GOCDB outage due to a power cut at RAL. No other problems.
<big> PPS Report & Issues </big>
PPS reports were not received from these ROCs:
AP, CE, IT, SEE, SWE
Issues from EGEE ROCs:
None reported
Release News:
gLite 3.0.2 PPS Update45 was released to pre-production last Tuesday.
It is currently in the pre-deployment testing phase.
The update contains:
YAIM module for the 3.0 WMS, fixing the bug in the uid limit for the gridftp server
All details in:
https://twiki.cern.ch/twiki/bin/view/EGEE/PPSReleaseNotes_302_PPS_Update45
gLite 3.1.0 PPS Update17 was released to pre-production last Thursday.
It is currently being installed at PPS sites after pre-deployment testing.
The update contains:
glite-MPI_utils metapackage for gLite 3.1
Improved globus-gridftp startup script
Various improvements for glite-info-provider-ldap
lcg_util v1.6.8 (SLC4)
All details in:
https://twiki.cern.ch/twiki/bin/view/EGEE/PPSReleaseNotes_310_PPS_Update17
<big> EGEE issues coming from ROC reports </big>
(ROC France): This site had to change its domain name from "mrs.grid.cnrs.fr" to "in2p3.fr" as an emergency. A scheduled downtime is ongoing, but all old node names (and IPs) have already been replaced by the new ones in the GOC DB.
During those operations, the site wondered whether it is possible to set an alias on a CE node. Is it possible? Has any other site tried this?
(ROC Russia): I would like to draw your attention to the long and unsuccessful history of lcg_util updates. A new one was issued for PPS. However, the update did not include the patch for lcg-rep and the Classic SE (see bug #32999 in Savannah), although this bug was fixed two weeks ago.
Maarten Litmaath said that "release to production would be 1 or 2 weeks later" (see "Re: [LCG-ROLLOUT] RM SAM test on CE and Classic SE", Fri, 8 Feb 2008). So the sites which applied the "update" as the recommended operational procedure and still use a Classic SE cannot work properly for a month or so. Is it really so complicated to roll back to the situation before the "updates"? Who can at least send site administrators a recommendation on how to roll back manually?
Btw, I think that a story like this may occur again in future. I propose that we think about an emergency rollback procedure, manual or automatic.
(ROC SWE): In the last few days several sites in the SWE federation have been experiencing problems with the Information System. It is not clear whether this can be correlated with upgrading to the latest version of the m/w or its yaim configuration. We want to raise this in the GridOps meeting to see if sites in other federations are seeing something similar.
<big> gLite Release News</big>
gLite 3.1 Update13 released to production today.
The update contains:
A major upgrade to dcache (patch #1395)
An update from VDT to fix a gridftp issue
voms-admin client for UI and VOBOX
dcacheVoms2Gplasma required for proxies created with grid-proxy-init
All details in:
http://glite.web.cern.ch/glite/packages/R3.1/updates.asp
<big>Phase out of classic SE</big> 5m
Sites/VOs are requested to migrate in the next 3 months, before the end of May. A broadcast will be sent with the details.
A migration to DPM is the suggested solution.
https://twiki.cern.ch/twiki/bin/view/LCG/ClassicSeToDpm
16:30
→
17:00
WLCG Items 30m
CCRC'08 Operational Review 30m
Weekly review of on-going CCRC'08 activities based on 3 agreed metrics:
Experiments' scaling factors for functional blocks exercised in the challenge