Ops Meeting 12th June 2018

Chair: Jeremy Coles
Minutes: Gordon Stewart
Attending: Alessandra Forti, Brian Davies, Daniel Traynor, David Crooks, Gareth Roy, Gordon Stewart, Jeremy Coles, John Hill, Kashif Mohammad, Matt Doidge, Pete Gronbech, Raul Lopez, Robert Frank, Sam Skipsey, Steve Jones, Winnie Lacesso

Experiment Reports
==================

LHCb
- No report.

CMS
- No report.

ATLAS
- Elena has uploaded a report to Indico.
- Alessandra will follow up with Mark regarding Birmingham EOS migration.
- Peter has notified Alessandra that Singularity is enabled at Lancaster; Alessandra will check that everything is working.
- Gareth mentioned a recent Glasgow ticket regarding deletion failures (https://ggus.eu/?mode=ticket_info&ticket_id=135621). The shifter does not appear to understand what is going on, so Gareth was hoping UK ATLAS could follow up. Alessandra will take a look.

Other VOs
- No updates to incubator pages.

GridPP DIRAC
- All status pages look fine.

Meetings and Updates
====================

General Updates
---------------
- Pre-GDB today on benchmarking.
- GDB tomorrow examining outcomes of workshops on dCache and archival storage, plus DOME update and security items.
- HEPSYSMAN is next Monday and Tuesday. Pete reminded attendees that a site report is expected. Robin will be looking after the hackathon.
- No update on LSST DESC data challenge.
- DPM 1.10.2 update provides broken DOME configuration file in some circumstances on CentOS 6 (packaging is at fault, providing file named .conf rather than .example). Recommendation is not to update until packaging has been fixed, and indeed not to auto-update DPM installations. CentOS 7 is unaffected.
- Delay in VM accounting records being processed by APEL. Repository is being updated. Delays likely until the end of the week.
- GOCDB has been exhibiting slowness due to issues on back-end.
- Sites noted in May AR report should respond soon, please.
- WLCG Weekly Ops meeting:
  - Stable for ATLAS.
  - Test Spanish cloud moving to Harvester WL management. Alessandra noted ATLAS has several ways to submit to its grid resources, and is trying to unite these as plug-ins under a new component called Harvester.

WLCG Operations Co-ordination
-----------------------------
Discussion regarding WLCG-EGI interaction taking place on Thursday.

Tier-1 Status
-------------
Continuing problems with small VOs and CASTOR.

Storage and Data Management
---------------------------
No report.

Tier-2 Evolution
----------------
No report.

Accounting
----------
No report.

Documentation
-------------
No report.

Interoperation
--------------
Kashif has taken on interoperation role from David Crooks.
Annual review of information on-going; UK is still in progress. Jeremy will check responses from UK sites.
Final call for WMS decommissioning (all UK WMSs have already gone).
Next meeting is 9th July.

Monitoring
----------
No report.

On-duty
-------
Dashboard has been struggling and exhibiting transient sluggishness, but everything seems to be back to normal now.

Security
--------
David noted there was nothing particular to flag up.
SOC workshop in a couple of weeks: UK presence is strong, and it looks like there will be attendees from JISC, too. Preparations are on-going.

Services
--------
No report.

Tickets
-------
IPv6 tickets dominate UK list.
Some tickets could be closed:
- QMUL LHCb
- Bristol CMS
- RHUL regarding weekend issues with pilots

Tools
-----
No report.

Discussion
==========

Matt asked for input regarding HEPSYSMAN hackathon to be sent to Robin / Pete. Matt suggested trying to recreate new user experience.
Main GridPP 41 meeting starts at 13.00 on 29th August.

Actions & AOB
=============

Actions
-------
*** Remember to update count on actions page when you take minutes. ***

O-171031-01 No update.
O-171031-03 No update.
O-170711-04 No update.
O-170711-07 No update.
O-170131-01 No update.
O-160524-02 No update.
O-161108-00 No update.

AOB
---
Steve noted that SKA has added new VOMS servers at Oxford and Imperial.
No meeting next week due to HEPSYSMAN. The next ops meeting will be Tuesday 26th June.

Chat Window
===========

Jeremy Coles: (12/06/2018 11:04:04) Gordon is taking minutes today - thanks Gordon.
Gareth Douglas Roy: (11:13 AM) https://ggus.eu/?mode=ticket_info&ticket_id=135621
Jeremy Coles: (11:14 AM) https://cms-site-readiness.web.cern.ch/cms-site-readiness/SiteReadiness/HTML/SiteReadinessReport.html#T2_UK_London_Brunel
Samuel Cadellin Skipsey: (11:25 AM) [to be clear re DPM 1.10.x updates, the "broken: package is the SL6 release - ironically, everyone on Centos7 should be able to upgrade to 1.10 safely...]
raul: (11:28 AM) So... two weeks ago DPM had problems on CentOS7, but it was fine on CentOS6. Now...
Samuel Cadellin Skipsey: (11:32 AM) Strictly: SRM had issues under Centos7 and, strictly, if you had already enabled DOME, there were no issues updating to 1.10 on SL6 ;)
raul: (11:34 AM) ARe UK sites upgrading to DOME?
Samuel Cadellin Skipsey: (11:34 AM) The DPM developers would like you to. (but they've wanted people to since 1.9.x, so...)
Daniel Traynor: (11:38 AM) I've closed it now
Alessandra Forti: (11:38 AM) Transistion to DOME needs to be organized IMO. infact moving all sites to 1.10 needs to be organised.
raul: (11:39 AM) Any gains in DOME for sites like the ones we have?
Samuel Cadellin Skipsey: (11:39 AM) All transfers which don't use the SRM become more efficient. Alessandra: transition to DOME, or turning off SRM?
Alessandra Forti: (11:44 AM) Both. At the DPM workshop it didn't seem anyone was using it in production