Ops 5/3/19

Attendees: Brian, Dan T, Daniela, Darren M, David C, Duncan, Elena, Emanuele, Gareth R, Gordon, Ian L, John H, Linda, Mark S, Winnie, Rob C, Robert F, Sam, Ste, Teng, Raul, Kashif, Vip
Apologies: Raja, Alessandra
Chair, Minutes: Matt

LHCb - no problems.

CMS - Raul still struggling with DOME. Otherwise things going smoothly.

ATLAS - site jamboree ongoing. No cloud support meeting this week. WLCG meeting yesterday - frontier squid unavailable. Fixed - logs didn't rotate.
Lightweight sites discussed at a previous cloud support meeting, minutes linked to agenda. Will be discussed at jamboree and things will be fed back to GridPP.

Tickets:
ECDF analysis queue ticket, SL6 queue disabled and C7 queue set online. Issue likely resolved, Elena will double check.
Lancaster ticket
RALPP - IPv6 transfer ticket. Looking resolved.
TIER 1 - Singularity ticket, discussed regularly. Needs an update in the ticket. Permission of scratchdisk ticket, long discussion on ticket.
Will fix QM SL7 queues shortly.
All SL6 queues cleaned up for Oxford, Ox C7 only.

Mark - are we waiting before we can install EOS?
Elena - wait until after site jamboree.
Mark - at CERN at the moment, will see if there are any sessions that will be useful to attend.

Upcoming Meetings:
ATLAS Jamboree this week: https://indico.cern.ch/event/770307/
GDB Next Week: https://indico.cern.ch/event/739876/
WLCG Ops Coordination this Thursday: https://indico.cern.ch/event/803145/

Tier 1
Darren: all quiet in Tier 1 land.
From Bulletin; 5th March 2019
The Experiments Liaison Report (4/03/2019) is here: https://www.gridpp.ac.uk/wiki/Tier1_Operations_Report_2019-03-04
ARC CEs are all upgraded and running. Loading issue appears to be solved.
gdss783 (ALICE disk) crashed and was removed from production over the weekend (25th Feb). Back in production on Thursday 28th Feb.
gdss776 (LHCb disk) is having ongoing problems.
It crashed the previous week and was put back in; it crashed again last week and was put into read-only mode on 1st March. There is a copy of all LHCb data on Echo.
On Tuesday 26th February there was an intervention to physically move LHCb disk servers to make room for the new deliveries.
CPU efficiencies still problematic. Ongoing investigations/work.

Storage
Raul - reporting to storage list with DOME experiences. DOME 1.12 "alpha" - 6 patches applied. Much more stable. Much faster than 1.10. No xroot crashes. Still problems though. DOME registers a file in the namespace as successfully copied but zero sized. CMS having trouble with this. Brute force solution would be a cron job removing zero-sized files, but would rather not do that - being debugged. Raul working with DPM devs. Raul recommends waiting a bit.
Sam - 1.12 is coming soon.
Matt - xroot 4.9 installed at Lancaster to allow xroot checksums.
Raul - found a problem with DOMA tests at Lancs/Brunel - test script has a bug in it - communicated with Paul Miller, will be fixed.

On Duty
Quiet - just 1 ticket.

Security
As discussed at last meeting, working on notes to trace jobs. Wiki link to work so far: https://www.gridpp.ac.uk/wiki/Tracing_Jobs
Request to take a look, and try it out if it's relevant to you - if there is anything you'd like to change or add, or any scripts or tips, please let us know. And if you're not using something in that set please share that too.
SSC is not that far away. Good time to prepare!
HEPSYSMAN - have a day of security @ HEPSYSMAN. Hopefully an afternoon of forensics training and a morning of other stuff.
Next week's GDB will focus on SOC workshop.
Linda - updated an advisory. https://wiki.egi.eu/wiki/SVG:Advisory-SVG-CVE-2019-5736 (runc, docker container escape. Minor update so not circulated).

Services
Only 3 issues at Bristol, RALPP and Tier 1, all described in tickets. RHEL6 no longer supported - only affects Manchester.

Tickets
Gone over. Some tickets that needed to be closed were closed.

Chat window:
It's the Matt Doidge show!
:)
Could someone please grab the attendees list, I've hit my multi-tasking limit.
Grabbed a screenshot
Ta!
https://www.gridpp.ac.uk/wiki/Operations_Bulletin_Latest
https://indico.cern.ch/event/739876/ GDB next week
https://indico.cern.ch/event/803145/
Quick correction to bulletin: The link to the ATLAS site jamboree is for last year. The one for this year is: https://indico.cern.ch/event/770307/
Why does ATLAS have two agendas for the same meeting: https://indico.cern.ch/event/770307/
Ah, now I see, that was last year's!
https://www.gridpp.ac.uk/wiki/Tracing_Jobs
https://wiki.egi.eu/wiki/SVG:Advisory-SVG-CVE-2019-5736
I'll give them a poke /bash
only that there is an issue. not discovered why.
yes other issue was regarding downloading mesh config, but after speaking to shawnM we are looking again.
sorry, need to go
i think we can move on that soon
CMS