28-R-15 (CERN conferencing service (joining details below))
firstname.lastname@example.org
Weekly OSG, EGEE, WLCG infrastructure coordination meeting.
We discuss the weekly running of the production grid infrastructure, based on weekly reports from the attendees. Reported issues are discussed, assigned to the relevant teams, followed up, and escalated when needed. The meeting is also the forum where sites get a summary of the weekly WLCG activities and plans.
OSG operations team
EGEE operations team
EGEE ROC managers
WLCG coordination representatives
WLCG Tier-1 representatives
other site representatives (optional)
To dial in to the conference:
a. Dial +41227676000
b. Enter access code 0140768
From: France / Central Europe
To: DECH / CERN
Report from France COD:
For information: a problem with Gstat caused many false alarms related to BDII tests at every ROC. The failed BDII tests were caused by a transient ASGC network outage lasting 20 minutes, from 06:05 to 06:25 on 11-Aug-2008.
Report from Central Europe COD:
No issues this week.
<big> PPS Report & Issues </big>
None this week.
<big> gLite Release News</big>
Now in Production
gLite 3.1 Update 28. The release contains:
glite-CONDOR_utils for lcg-CE (PATCH:1856)
New version of the gsoap plugin with a vulnerability fix, affecting LB, WMS, UI, WN, VOBOX, CE (PATCH:1846)
Several bug fixes on WMS and clients (PATCH:1780)
New Short Lived Credential Service (SLCS), which lets users obtain a short-lived personal certificate based on a Shibboleth AAI identity (PATCH:1693)
MyProxy version 1.6.1-7, which fixes a build issue related to the globus flavour and is already deployed in production (PATCH:1978)
Various improvements on lcg-extra-jobmanagers (CE) (PATCH:1942)
GFAL and lcg_util update with the new function gfal_removedir and several bug fixes
FTS SL4 release (32 and 64 bit). This version has a critical bug and should not be installed. The RPMs have been removed from the repository.
Now in PPS
No new updates since last week.
Soon in Production
gLite3.1 Update 29 in preparation. The release contains:
CNAF [OUTAGE]: CASTOR upgrade. From Tuesday, 19 August, 09:00 UTC+2 to Wednesday, 20 August, 20:00 UTC+2. Affected nodes:
DESY [at risk]: One poolnode will move its location. Some files in dq2 and user directories will not be available. From: Tuesday, 19 August, 10:00 UTC+2 to Thursday 21 August 21:00 UTC+2. Affected nodes:
CSCS [OUTAGE]: Replacement of a faulty DIMM on storage pool node. From: Tuesday 19 August, 11:30 UTC+2; To: Tuesday 19 August, 13:30 UTC+2. Affected nodes:
GRIF [OUTAGE]: electrical maintenance.
From: Thursday 21 August 23:11 UTC+2;
To: Wednesday 27 August 21:117:30 UTC+2.
CRUZET-4: CRUZET-4 is off to a slow start (today is day 1). HCAL and DT are in; Tracker may join in the afternoon. DAQ is currently addressing some issues that have been seen. On the computing side, regular data operations shifts are in place and operational, focusing mostly on T0 workflows. We are also using the CRUZET-4 exercise to implement the recently defined general computing shift design, which is meant to complement and integrate the DataOps approach and extend it to monitoring the overall infrastructure, interfacing with Grid Ops and the distributed facilities.
Summer08 production: More details will follow from the DataOps team. The information most urgently needed by the T1 sites was already provided to them at the end of last week (they need it to prepare tape families on their MSS systems). Current storage needs are estimated as follows:
ASGC: 27.0 TB (RAW) + 13.5 TB (RECO) = 40.5 TB
CNAF: 26.5 TB (RAW) + 13.25 TB (RECO) = 39.75 TB
FNAL: 64.6 TB (RAW) + 32.3 TB (RECO) = 96.9 TB
FZK: 58.8 TB (RAW) + 29.4 TB (RECO) = 88.2 TB
IN2P3: 22.0 TB (RAW) + 11.0 TB (RECO) = 33.0 TB
PIC: 8.4 TB (RAW) + 4.2 TB (RECO) = 12.6 TB
RAL: 23.9 TB (RAW) + 11.95 TB (RECO) = 35.85 TB
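For sites that want to cross-check the figures above before sizing tape families, the arithmetic can be reproduced with a short script. This is only a minimal sketch: the values are copied from the list above, while the names `SUMMER08_TB` and `check_totals` are made up for illustration and are not part of any DataOps tool. `Decimal` is used so that values such as 13.25 add exactly.

```python
from decimal import Decimal

# (RAW, RECO, stated total) in TB per T1 site, copied from the list above.
# The dictionary name is hypothetical, chosen for this sketch only.
SUMMER08_TB = {
    "ASGC": ("27.0", "13.5", "40.5"),
    "CNAF": ("26.5", "13.25", "39.75"),
    "FNAL": ("64.6", "32.3", "96.9"),
    "FZK": ("58.8", "29.4", "88.2"),
    "IN2P3": ("22.0", "11.0", "33.0"),
    "PIC": ("8.4", "4.2", "12.6"),
    "RAL": ("23.9", "11.95", "35.85"),
}


def check_totals(table):
    """Return RAW + RECO per site, raising if it disagrees with the stated total."""
    totals = {}
    for site, (raw, reco, stated) in table.items():
        total = Decimal(raw) + Decimal(reco)
        if total != Decimal(stated):
            raise ValueError(f"{site}: {raw} + {reco} != {stated}")
        totals[site] = total
    return totals


site_totals = check_totals(SUMMER08_TB)
raw_sum = sum(Decimal(raw) for raw, _, _ in SUMMER08_TB.values())
reco_sum = sum(Decimal(reco) for _, reco, _ in SUMMER08_TB.values())
grand_total = sum(site_totals.values())

for site, total in site_totals.items():
    print(f"{site}: {total} TB")
print(f"All T1 sites: {raw_sum} TB (RAW) + {reco_sum} TB (RECO) = {grand_total} TB")
```

All seven stated totals agree with RAW + RECO, and in every row the RECO estimate is half the RAW estimate. Summed over the listed sites, the estimate comes to 231.2 TB (RAW) + 115.6 TB (RECO) = 346.8 TB.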
<big> LHCb report </big>
<big> Storage services: Recommended base versions </big>