WLCG-OSG-EGEE Operations meeting
28-R-15
CERN conferencing service (joining details below)
Weekly OSG, EGEE, WLCG infrastructure coordination meeting.
We discuss the weekly running of the production grid infrastructure based on weekly reports from the attendees. The reported issues are discussed, assigned to the relevant teams, followed up and escalated when needed. The meeting is also the forum for the sites to get a summary of the weekly WLCG activities and plans.
Attendees:
To dial in to the conference:
a. Dial +41227676000
b. Enter access code 0148141
OR click HERE
(Please specify your name & affiliation in the web-interface)
-
-
16:01
→
16:30
EGEE Items 29m
-
<big> Grid-Operator-on-Duty handover </big>
From: Italy and France
To: Russia and UK/I
Report from Italy:
- List of unresponsive sites (escalated to political instances)
- First Ops meeting (OCC involved): no sites to report
- Second Ops meeting (assigned to OCC):
- SITE NAME: SDU-LCG2 (ROC CERN); GGUS: 45181; Reason: no response from site admin in the last week.
- Problems Encountered during shift:
Nothing to report.
- Information for the new COD team:
Nothing to report.
- No report.
-
<big> PPS Report & Issues </big>
Please find issues from EGEE ROCs and general info in:
https://twiki.cern.ch/twiki/bin/view/LCG/OpsMeetingPps
SUMMARY:
- Definition of "early adopters" of gLite releases (Staged roll-out). Still need several services covered and looking for volunteers.
More info about the release testing (early adoption) process and the relevant interfaces can be read at: https://twiki.cern.ch/twiki/bin/view/LCG/PPS_Release_Testing
List of services not covered
The list of services for which volunteers are needed is:
- glite-WN (plain and re-locatable)
- glite-UI (plain and re-locatable)
- glite-TORQUE_client
- glite-TORQUE_server
- glite-TORQUE_utils
- glite-CONDOR_utils
- glite-LSF_utils
- glite-SGE_utils
- glite-MON
- glite-SE_dpm_disk
- glite-MON (registry)
- glite-MPI_utils
- glite-FTA_oracle
- glite-FTM
- glite-FTS_oracle
- glite-SE_dcache_admin_gdbm
- glite-SE_dcache_admin_postgres
- glite-SE_dcache_info
- glite-SE_dcache_pool
- glite-LFC_mysql
- glite-LFC_oracle
- glite-WMS
- glite-LB
- glite-CREAM_ce
- glite-SE_dpm_mysql
- glite-PX
- Pilot service of glexec/SCAS started
- kick-off meeting with sites and experiments concerned held on the 5th
- Minutes in http://indico.cern.ch/conferenceDisplay.py?confId=49840
- the controlled roll-out of the glexec/SCAS functionality over two T1 sites was decided
- FZK (Karlsruhe) will start the installation on 9th Feb
- After a first phase of testing by LHCb and ATLAS, IN2P3 (Lyon) will step in
- Details about the pilot (planning, layout, technical info) can be found in the page https://twiki.cern.ch/twiki/bin/view/LCG/PpsPilotSCAS
- Details about the single tasks can be found in the tracker http://www.cern.ch/pps/index.php?dir=./ActivityManagementSA1DeploymentTaskTracking specifically listing the subtasks of TASK:8986
-
<big> gLite Release News</big>
Please find gLite release news in:
https://twiki.cern.ch/twiki/bin/view/LCG/OpsMeetingGliteReleases
Now in Production
4th Feb: gLite 3.1 Update 40 and gLite 3.0 Update 45 were released to production. The updates contain an upgrade to lcg-vomscerts-5.3.0. They add 3 new host certificates:
- cclcgvomsli01.in2p3.fr (biomed + egeode);
- next cert for vo.racf.bnl.gov (atlas);
- cert for voms.fnal.gov (cms).
Now in PPS
3rd Feb: gLite 3.1 PPS Update 43 went through the PPS deployment test and is now being installed by the remaining PPS sites. The update contains:
- WMS 3.1.102 fixing WMS 3.1.100 already in PPS (PATCH:2562)
- Upgrade to lcg-vomscerts-5.3.0 (already deployed in production) (PATCH:2745 and PATCH:2746)
- Bug fixes for WMS UI 3.1 (PATCH:2622)
- WN: grid-cm-* packages provide worker node configuration monitoring published on the Active MQ messaging system (PATCH:2660 PATCH:2661)
- Upgrade of BDII. The starting cache size used for the Berkeley Database in the BDII has been reduced from 1 GB to 50 MB. This should significantly reduce the memory footprint and still provide the necessary performance. (PATCH:2671)
- Dependency on mysql-server added to VOMS_mysql (PATCH:2700)
- New Information Dynamic Plugin and SGE yaim utils fix a vulnerability (http://www.gridpp.ac.uk/gsvg/advisories/advisory-43233.txt)
Soon in Production
Nothing to report.
-
<big> EGEE issues coming from ROC reports </big>
- Italy: [FOR INFORMATION] INFN-T1 FTM endpoint (FTM endpoint at CNAF): http://tier1.cnaf.infn.it/ftmmonitor/transfer-monitor-report/
We also use the FTS monitor tool developed by IN2P3, available at: http://tier1.cnaf.infn.it/ftsmonitor/
-
<big>Grid Service Interventions </big>
Link to CIC Portal (broadcasts/news), scheduled downtimes (GOCDB) and CERN IT Status Board
Many interventions scheduled this week. Please consult the URLs above for details.
-
-
16:30
→
17:00
WLCG Items 30m
-
<big> WLCG issues coming from ROC reports </big>
- None this week.
-
<big> Plan for rolling out the recommendations of the WLCG Installed Capacity document </big>
-
<big> Wiki page containing FTM Endpoints </big>
Can all tier-1 sites please keep the list of FTM endpoints up to date? The list is here: https://twiki.cern.ch/twiki/bin/view/LCG/LCGFTMEndpoints
Note: This requirement will be replaced by information providers publishing the end-points into the information system.
-
<big> WLCG Operational Review </big>
https://twiki.cern.ch/twiki/bin/view/LCG/WLCGDailyMeetingsWeek090202
https://twiki.cern.ch/twiki/bin/view/LCG/WLCGDailyMeetingsWeek090209
Speaker: Harry Renshall / Jamie Shiers
-
<big> Alice items </big>
-
<big> Atlas items </big>
-
<big> CMS items </big>
- Please have a look at the daily reports given at WLCG daily calls here.
Speaker: Daniele Bonacorsi
-
<big> LHCb items </big>
-
<big> WLCG service recommended baseline versions </big>
The recommended baseline versions can be found here: https://twiki.cern.ch/twiki/bin/view/LCG/WLCGBaselineVersions
-
-
17:00
→
17:30
OSG Items 30m
Speaker: Rob Quick (OSG - Indiana University)
-
Discussion of open tickets for OSG
Information taken from the weekly escalation reports.
A reminder was sent to the GGUS developer for progress on GGUS-OSG ticket update flow testing, based on GGUS #45488.
Rob, Kyle or another OSG supporter to re-prompt Felipe Silva to answer on GGUS #45094. The submitter does not accept the suggestion to close the ticket.
-
-
17:30
→
17:35
Review of action items 5m
-
17:35
→
17:36
AOB 1m
-