28-R-15 (CERN conferencing service (joining details below))
email@example.com Weekly OSG, EGEE, WLCG infrastructure coordination meeting.
We discuss the weekly running of the production grid infrastructure based on weekly reports from the attendees. The reported issues are discussed, assigned to the relevant teams, followed up and escalated when needed. The meeting is also the forum for the sites to get a summary of the weekly WLCG activities and plans.
OSG operations team
EGEE operations team
EGEE ROC managers
WLCG coordination representatives
WLCG Tier-1 representatives
other site representatives (optional)
To dial in to the conference:
a. Dial +41227676000
b. Enter access code 0148141
OR click HERE (Please specify your name & affiliation in the web-interface)
From: ROC Italy and ROC France
To: ROC Russia and ROC UK/I
Report from ROC Italy:
Case transferred to Operation Meeting: GGUS #40700 on BEIJING-CNIC-LCG2-IA64. APEL problem not solved yet, case opened on Sept. 10th.
Case transferred to Operation Meeting: GGUS #42770 IN-DAE-VECC-02
Transferred to political instance, but returned to 2nd mail after feedback from the site.
Report from ROC France:
We have observed that the follow-up of the last escalation step by the OCC and the ROC was not done correctly.
More details here: https://twiki.cern.ch/twiki/bin/view/EGEE/OperationalUseCasesAndStatus#9_Last_escalation_step_Site_susp
We have 2 cases where the last step has lasted several weeks, in one case more than a month:
GGUS #40521: RU-Phys-SPbSU (1 month and a half)
25/09/2008: last escalation step
06/10/2008: raised at WLCG Ops meeting
06/11/2008: still in last step and not suspended
06/11/2008: Cyril L'Orphelin (COD-FR) sent mail to Maite, Steve and Nick
06/11/2008: Maite sent mail to Russian ROC
06/11/2008: site suspended by Russian ROC
GGUS #42015: ITPA-LCG2 (3 weeks)
24/10/2008: last escalation step
27/10/2008: raised at WLCG Ops meeting
03/11/2008: raised again at WLCG Ops meeting
07/11/2008: still in last step and not suspended
<big> PPS Report & Issues </big>
Please find Issues from EGEE ROCs and general info in:
Problem with WMS: GGUS #42999.
The WMS is not usable in production, which blocks the setup of the ALICE WMS.
ROC SEE: We would like to point out that we identified a serious deployment problem for 64-bit WNs (missing x86_64 RPMs and mix-up of executables and libraries for 32-bit and 64-bit architectures):
Due to this problem we had to take manual steps to resolve the issue and were failing SAM tests for several days, which will affect our availability. The error message was very misleading.
ROC SWE: The site BDII of the SWE site IFIC-LCG2 no longer appears in the SAM tests (GGUS #43353; Savannah #33616). Any update on this problem?
<big> RAL-LCG2 batch farm occupancy </big>
The RAL-LCG2 batch farm has been running at 50% occupancy or less since June. For October, the LHC VOs' nominal total fairshare of the farm was ~73%, but we only saw ~14% utilisation by the LHC VOs. Total occupancy for October was about 34%, with non-LHC VOs (mainly BaBar, biomed, phenogrid) contributing the rest. (Occupancy is measured as utilised KSI2K divided by total KSI2K capacity.)
We would like to find out whether or not the experience of other T1s has been similar over the last few months, or if the lack of LHC work is specific to RAL-LCG2 and we should investigate further.
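For sites wishing to compare their own figures, the occupancy definition above can be sketched in a few lines of Python. This is only an illustration: the total capacity of 1000 KSI2K is a hypothetical value chosen to make the percentages easy to read, the function and variable names are ours, and the sketch assumes the ~14% LHC utilisation is expressed as a fraction of the whole farm (as the ~34% total is).

```python
def occupancy(utilised_ksi2k: float, total_ksi2k: float) -> float:
    """Occupancy as a fraction: utilised KSI2K divided by total KSI2K capacity."""
    if total_ksi2k <= 0:
        raise ValueError("total capacity must be positive")
    return utilised_ksi2k / total_ksi2k


# Hypothetical capacity, NOT the real RAL-LCG2 figure.
TOTAL_KSI2K = 1000.0

# Utilisation values scaled to match the approximate October percentages.
lhc_used = 140.0      # ~14% of the farm used by the LHC VOs
total_used = 340.0    # ~34% total occupancy

NOMINAL_LHC_FAIRSHARE = 0.73  # ~73% nominal fairshare of the LHC VOs

lhc_share = occupancy(lhc_used, TOTAL_KSI2K)
total_share = occupancy(total_used, TOTAL_KSI2K)

# Whatever is not LHC usage comes from the non-LHC VOs.
non_lhc_share = total_share - lhc_share

# Fraction of the nominal fairshare that the LHC VOs actually consumed.
lhc_vs_fairshare = lhc_share / NOMINAL_LHC_FAIRSHARE

print(f"non-LHC occupancy: {non_lhc_share:.0%}")
print(f"LHC usage relative to nominal fairshare: {lhc_vs_fairshare:.0%}")
```

Under these assumptions the non-LHC VOs account for roughly 20% of the farm, and the LHC VOs used about a fifth of their nominal fairshare.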
<big> WLCG issues coming from ROC reports </big>
No items this week.
<big>WLCG Service Interventions (with dates / times where known) </big>