WLCG-OSG-EGEE Operations meeting
Europe/Zurich
28-R-15 (CERN conferencing service (joining details below))
Description
grid-operations-meeting@cern.ch
Weekly OSG, EGEE, WLCG infrastructure coordination meeting.
We discuss the weekly running of the production grid infrastructure based on weekly reports from the attendees. The reported issues are discussed, assigned to the relevant teams, followed up and escalated when needed. The meeting is also the forum for the sites to get a summary of the weekly WLCG activities and plans.
Attendees:
OSG operations team
EGEE operations team
EGEE ROC managers
WLCG coordination representatives
WLCG Tier-1 representatives
other site representatives (optional)
GGUS representatives
VO representatives
ROCs: Asia Pacific, Russia, UK/I
VOs:
To dial in to the conference:
a. Dial +41227676000
b. Enter access code 0140768
NB: Reports were not received in advance of the meeting from:
-
- 16:00 → 16:01
-
16:01
→
16:30
EGEE Items 29m
-
<big> Grid-Operator-on-Duty handover </big>
From: DECH / Russia
To: South East Europe / Asia Pacific
Report from DECH COD: Please note the following about these sites:
- YerPhI (ticket 26634): There is an open item in the operations meeting. The ticket was closed during the week but had to be reopened because the SAM tests still fail frequently. The reason is always that no information about the SE is found in the SE. It is not obvious (at least to me) whether this is caused by a bad network connection or a badly performing info provider. I propose to re-discuss the item at the operations meeting.
- Australia-UNIMELB-LCG2 (ticket 34393): The site's SE seems to have been (almost) full for weeks. At the beginning of the week it looked better and the site was put into quarantine. This had to be reverted since the situation has not changed. If there is again no reply to the ticket this week, I propose to escalate to the operations meeting next week.
- Nothing to report.
-
<big> PPS Report & Issues </big>
PPS reports were not received from these ROCs:
AP, IT, RU, SEE, UKI
Issues from EGEE ROCs:
- none reported
-
<big> gLite Release News</big>
Release News:
Now in production
gLite 3.1.0 Update42 was released to production with HIGH priority.
The update contains:
- FTS
- new version of FTA changing the gridFTP session handling (CCRC08)
- Many services
- lcg-vomscerts-4.9.0 adds next cert for lcg-voms
Now in pre-production
PPS sites are now upgrading to gLite 3.1.0 PPS Updates 23 and 24:
- WMS LB (SL4): first release to PPS
- Patch for Bugs 31894, 32200, 29600 (security hole), 32573 (WMS alias)
- UI/WN/VOBOX
- edg-gridftp-client-1.2.8 fixes bugs 33205, 27274
- DPM/LFC v1.6.10
- R3.1/SLC4/x86_64: DPM/LFC v1.6.10 (64bit)
- R3.1/i386/SLC4: GFAL & lcg_util update with 5 bugfixes for CCRC08
- DPM/LFC v1.6.10
- DICOM back-end service for DPM
- re-buildable source RPMs
- support for MacOSX
- group writable directories when SRM started with umask 0
- bug fixes
- CE
- patch to Globus job manager to improve performance
- FTS
- new version of FTA changing the gridFTP session handling (CCRC)
- Many services
- lcg-vomscerts-4.9.0 adds next cert for lcg-voms
Soon in production
gLite 3.1.0 PPS Update 20 is in preparation.
The update, to be released tomorrow, will contain:
- WMS LB (SL4): first release to PPS
- Patch for Bugs 31894, 32200, 29600 (security hole), 32573 (WMS alias)
- UI/WN/VOBOX
- DPM/LFC v1.6.10
- R3.1/i386/SLC4: GFAL & lcg_util update with several bugfixes (some of them are requested for CCRC08)
- DPM/LFC v1.6.10
- DICOM back-end service for DPM
- re-buildable source RPMs
- support for MacOSX
- group writable directories when SRM started with umask 0
- bug fixes
- CE
- patch to Globus job manager to improve performance
- FTS
-
<b>Next CIC portal release - IMPORTANT CHANGES</b>
The next release of the CIC portal is scheduled for Tuesday 22/04. The portal will be offline between 09:00 UTC and 09:30 UTC to allow a safe transition. *This release includes many changes in the global design of the portal* Menus have been reorganized in order to reduce their size, group functionalities and improve user-friendliness. We are aware that such drastic changes can be disturbing. We'll be happy to help and answer any questions you may have on the new interface. Please address any comments to cic-information@in2p3.fr, or use the "contact us" section of the portal.
-
<big> EGEE issues coming from ROC reports </big>
- None in this week's ROC reports.
-
-
16:30
→
17:00
WLCG Items 30m
-
<big> WLCG issues coming from ROC reports </big>
- ROC action follow-up on behalf of the VO. Some time ago ATLAS asked ROCs to follow up with sites on the proper version of WNs at sites and on 100GB of disk space in the SW area. Could we ask VOs to assign a ticket to the ROC in such cases? This was done in the past, e.g.: https://gus.fzk.de/ws/ticket_info.php?ticket=28806 It reduces the number of hops needed to reach the site, the VO can observe progress, and sites/ROCs can ask the VO questions. In CE we would like to discuss some issues with the VO contact for this action. Points to discuss are: 1) Could the VO make critical the tests that it wants the ROCs to follow up? 2) Could the VO provide documentation for the tests, as exists for the OPS VO? For some sites the relevant SAM tests are failing and we don't know why. It is also not clear whether 100GB of free space is required or whether 100GB of total space for the ATLAS VO is enough.
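To make the open question about the 100GB requirement concrete, a site could evaluate both readings (total capacity versus currently free space) side by side. The sketch below is only an illustration: the function names, the decimal definition of a gigabyte and the idea of checking the mount point with Python are assumptions, not the logic of the actual ATLAS SAM test, which is not documented here.

```python
import shutil

# Assumption: 1 GB = 10**9 bytes. The VO may mean GiB (2**30 bytes) instead.
GB = 10**9


def space_check(total_bytes, free_bytes, required_gb=100):
    """Evaluate both possible readings of the disk space requirement."""
    required = required_gb * GB
    return {
        "total_ok": total_bytes >= required,  # reading 1: area is at least 100GB in size
        "free_ok": free_bytes >= required,    # reading 2: at least 100GB is currently free
    }


def check_sw_area(path, required_gb=100):
    """Apply space_check to the file system that holds `path` (hypothetical SW area)."""
    usage = shutil.disk_usage(path)
    return space_check(usage.total, usage.free, required_gb)
```

For example, a 150GB area with 40GB free passes under the first reading and fails under the second, which is exactly the case a site cannot judge until the VO clarifies the requirement.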
-
<big>WLCG Service Interventions (with dates / times where known) </big>
Link to CIC Portal (broadcasts/news), scheduled downtimes (GOCDB) and CERN IT Status Board
- Alias for IN2P3-CC local LFC will change on Thursday April 24th 09:00 UTC
from: lfc-atlas.in2p3.fr
to: lfc-prod.in2p3.fr
Old alias lfc-atlas.in2p3.fr will:
- disappear from the information system next Thursday
- still be available until the end of May
This change will be transparent.
- The Classic SEs at IN2P3-LPC are planned to be removed from production on 15 May:
- clrauvergridse01.in2p3.fr
- clrlcgse02.in2p3.fr
Please back up your data before that date.
- ASGC's circuit provider will perform maintenance on the following two links.
* TW(Taipei) - US(Chicago) - NL(Amsterdam) 2.5Gbps
* TW(Taipei) - NL(Amsterdam) 10Gbps
Start time: 2008-04-23 00:00 UTC
End time: 2008-04-23 02:00 UTC
Impact: ASGC will use an alternative route (via our peers) to T0/T1.
- The old Edinburgh site, ce.epcc.ed.ac.uk, will be retired from use in two weeks' time (1 May 2008). Storage services, via srm.epcc.ed.ac.uk, will be accessible via the new Edinburgh site, ce.glite.ecdf.ed.ac.uk, for some time after this, although the intention is to slowly migrate to newer storage.
This means that support for several VOs will be dropped by Edinburgh, as they are not part of UKI-SCOTGRID-ECDF's supported VO list. In particular, these VOs are:
alice, babar, biomed, cdf, cms, dzero, esr, fusion, geant4, hone, magic, minos, na48, planck, sixt, t2k and zeus
- At the start of May, the site egee.man.poznan.pl will be removed from production and shut down. Please back up your data stored on storage elements belonging to this site.
Time at WLCG T0 and T1 sites.
-
<big> CCRC'08 Operational Review </big>
Speaker: Harry Renshall / Jamie Shiers
-
<big> Alice report </big>
-
<big> Atlas report </big>
The sites in the list below still haven't upgraded to the ATLAS requested version of lcg-utils (1.6.7 (SL4)):
- France ROC:
- AUVERGRID
- IN2P3-LPC
- SEE ROC:
- GR-03-HEPNTUA
- HG-04-CTI-CEID
- WEIZMANN-LCG2
- Italian ROC:
- INFN-FIRENZE
- INFN-LNS
- INFN-NAPOLI
- INFN-NAPOLI-PAMELA
- INFN-ROMA3
- NE ROC:
- PDC
- CERN ROC:
- TORONTO-LCG2
- UK/I ROC:
- UKI-LT2-IC-LeSC
-
<big> CMS report </big>
- News on Development:
- Data certification, Processing at the T0:
- Re-processing:
- MC production:
- Data Transfers and Integrity, DDT-2/LT status:
- LINKs:
Speaker: Daniele Bonacorsi
-
<big> LHCb report </big>
-
-
17:00
→
17:30
OSG Items 30m
Speaker: Rob Quick (OSG - Indiana University)
-
Discussion of open tickets for OSG
-
- 17:30 → 17:35
-
17:35
→
17:36
AOB 1m
- Item 1