Operations team & Sites
EVO - GridPP Operations team meeting
Attending:
Andrew Lahiff
Andrew McNab
Daniela Bauer
Dan Traynor
Elena K
Gareth Roy
David Crooks
Gordon Stewart
Govind S
Ian Loader
John Bland
Jeremy Coles
John Hill
Kashif M
Luke Kreczko
Marcus Ebert
Matt Williams
Oliver Smith
Winnie Lacesso
Robert Frank
Sam Skipsey
Steve Jones
Terry Froy
Tom Whyntie
Experiment problems/issues 19m
Review of weekly issues by experiment/VO
LHCb: Fairly quiet (John). T2D issues (Andrew).
CMS: (Daniela) Problems with the Brunel SRM; being worked on.
https://cms-site-readiness.web.cern.ch/cms-site-readiness/SiteReadiness/HTML/SiteReadinessReport.html#T2_UK_London_Brunel
ATLAS: Birmingham, WebDAV support: being worked on. Alessandra gave a report on SL7 at the ATLAS meeting; the conclusion was: don't use it (yet). Alessandra's slides are attached to the meeting agenda.
Other: Updates should be recorded in https://www.gridpp.ac.uk/wiki/GridPP_VO_Incubator.
GridPP DIRAC status [Andrew McNab]
-- https://www.gridpp.ac.uk/gridpp-dirac-sam
Looks fine. Oxford VAC still down?
Meetings & updates 20m (Jeremy):
The final A/R figures for WLCG Tier-2 sites in July have been made available.
Very few activity reports at Monday's ops meeting. The CERN-Wigner link was down for a while, with little impact reported.
Another request - Retrieving information for your NGI: test the new VAPOR release. See GGUS 123214.
https://indico.cern.ch/event/562628/
Storage (Sam): Look out for his presentation at GridPP37.
Monitoring: Andrew: Dashboard somewhat erratic last week.
Jeremy: Please fill in the Doodle poll for ROD duty.
Tickets:
VOMS servers
https://ggus.eu/?mode=ticket_info&ticket_id=123333 (9/8)
After the blip with the VOMS servers last week Daniela opened this ticket - it looks like the problem is fixed now, and this ticket can be closed. Assigned (17/8)
[Can be closed]
BRISTOL
https://ggus.eu/?mode=ticket_info&ticket_id=123419 (16/8)
A low availability ticket for Bristol, which has Winnie a little confused as the "created_at" date for the issue is back in July - Winnie asks for confirmation that this is actually an old, stale issue - the last 3 weeks of tests look good for Bristol, but the nagios link cuts off mid-July. Waiting for reply (16/8)
[The dashboard not updating is a bug, dashboard people promised to fix it -- Daniela]
QMUL
https://ggus.eu/?mode=ticket_info&ticket_id=123400 (15/8)
Another low availability ticket - following the nagios link here I see a lot of "-1.00" entries, which I think are caused by tests returning unknown statuses - Daniela is rightfully suspicious of it. There may be clues in this talk [1], but it is probably worth doing as suggested and just putting the ticket On Hold. Assigned (15/8)
[CE problems - understood]
[1] https://indico.egi.eu/indico/event/2808/contribution/1/material/slides/0.pdf
A few tickets that could do with an update:
https://ggus.eu/?mode=ticket_info&ticket_id=117683 - Glue 2 for Castor
https://ggus.eu/?mode=ticket_info&ticket_id=122364 - cvmfs support for solidexperiment.org (looks like it's nearly done).
[This is waiting for solid.]
And finally, so long JET:
https://ggus.eu/?mode=ticket_info&ticket_id=122198 (17/6)
EFDA-JET's decommissioning date is the 25th.
Kashif to update the GridPP Nagios documentation with ARGO documentation.
https://www.gridpp.ac.uk/wiki/GridPP_VO_Incubator
Discussing VOI-GEN-001: Probably a non-issue; can be closed.
VOI-GEN-002: Can be closed (Tom W)
vo.DiRAC.ac.uk: Wait for end of summer holidays.
In fact for most VOs, that's the case.
GalDyn: No more updates expected (Tom). Archive.
LIGO: No recent updates from LIGO; Catalin is in contact. Use US glideinWMS to submit to UK resources?
LOFAR (done?)
LSST:
Marcus Ebert: (11:37 AM)
Sorry, my mic was not working. For LSST, there is only one person running jobs and there have been no new jobs recently. Also, he preferred the Ganga interface that was developed over the direct CLI. Will see how this continues once more test jobs for other workflows are needed.
LZ: (Elena) Running production, Lancaster enabled. Alex Richards at Imperial is working on the job submission interface.
DUNE: (Elena) If someone else could enable DUNE that would be helpful, so that DUNE gets a 'grid' presence.
Pravda: Matt Williams: (11:41 AM)
Microphone not working but no update for Pravda. I'm not sure that they're actively doing analysis at the moment.
Snoplus: (Daniela): We did some test runs with condor and dirac. It works in principle, but it is an awful hack and we (Daniela & Simon) are reluctant to deploy this on our production server just yet.
Lukasz Kreczko: (11:45 AM)
All you need is APEL accounting for HTCondor-CE to make it work here.
https://www.gridpp.ac.uk/wiki/Operations_Team_Action_items
[I think I updated them all.]
From the sidebar:
Terry Froy: (11:47 AM)
QM will happily host a RIPE Anchor.
Jeremy Coles: (11:50 AM)
https://indico.cern.ch/event/556609/timetable/