Operations team & Sites

Europe/London
EVO - GridPP Operations team meeting


Description
- This is the weekly GridPP ops & sites meeting.
- The intention is to run the meeting in Vidyo: https://vidyoportal.cern.ch/flex.html?roomdirect.html&key=zXhsqAxVnaT6
-- The PIN is 1234.
- To join via phone see http://information-technology.web.cern.ch/services/fe/howto/users-join-vidyo-meeting-phone for dial-in numbers.
-- The London (UK) service is on +44 (0)161 306 6802. Phone bridge ID 1001002.
-- The meeting extension is 109308582. PIN 1234.
Chair: Jeremy Coles
Minutes:
Apologies:

Attending:
Andrew Lahiff
Andrew McNab
Daniela Bauer
Dan Traynor
Elena K
Gareth Roy
David Crooks
Gordon Stewart
Govind S
Ian Loader
John Bland
Jeremy Coles
John Hill
Kashif M
Luke Kreczko
Marcus Ebert
Matt Williams
Oliver Smith
Winnie Lacesso
Robert Frank
Sam Skipsey
Steve Jones
Terry Froy
Tom Whyntie

 


Experiment problems/issues 19m

Review of weekly issues by experiment/VO

    LHCb: Fairly quiet (John). T2D issues (Andrew).

    CMS: (Daniela) Problems with the Brunel SRM; being worked on.
    https://cms-site-readiness.web.cern.ch/cms-site-readiness/SiteReadiness/HTML/SiteReadinessReport.html#T2_UK_London_Brunel

    ATLAS: Birmingham WebDAV support is being worked on. Alessandra gave a report on SL7 at the ATLAS meeting; the conclusion was: don't use it (yet). Alessandra's slides are attached to the meeting agenda.

    Other: Updates should be recorded in https://www.gridpp.ac.uk/wiki/GridPP_VO_Incubator.

    GridPP DIRAC status [Andrew McNab]
    -- https://www.gridpp.ac.uk/gridpp-dirac-sam
     Looks fine. Oxford VAC still down?

Meetings & updates 20m (Jeremy):
The final A/R figures for WLCG Tier-2 sites in July have been made available.
Very little activity was reported at Monday's ops meeting. The CERN-Wigner link was down for a while, with little impact reported.
Another request - Retrieving information for your NGI: test the new VAPOR release. See GGUS 123214.
https://indico.cern.ch/event/562628/

Storage (Sam): Look out for his presentation at GridPP37.
Monitoring (Andrew): Dashboard somewhat erratic last week.
Jeremy: Please fill in the Doodle poll for ROD duty.

Tickets:

VOMS servers
https://ggus.eu/?mode=ticket_info&ticket_id=123333 (9/8)
After the blip with the VOMS servers last week Daniela opened this ticket - it looks like the problem is fixed now, and this ticket can be closed. Assigned (17/8)
[Can be closed]

BRISTOL
https://ggus.eu/?mode=ticket_info&ticket_id=123419 (16/8)
A low availability ticket for Bristol, which has Winnie a little confused as the "created_at" date for the issue is back in July - Winnie asks for confirmation that this is actually an old, stale issue - the last 3 weeks of tests look good for Bristol, but the nagios link cuts off mid-July. Waiting for reply (16/8)
[The dashboard not updating is a bug, dashboard people promised to fix it -- Daniela]

QMUL
https://ggus.eu/?mode=ticket_info&ticket_id=123400 (15/8)
Another low availability ticket - following the nagios link here I see a lot of "-1.00" entries, which I think are caused by tests returning unknown statuses - Daniela is rightfully suspicious of it. There may be clues in this talk [1], but it is probably worth doing as suggested and just putting the ticket On Hold. Assigned (15/8)
[CE problems - understood]
[1] https://indico.egi.eu/indico/event/2808/contribution/1/material/slides/0.pdf

A few tickets that could do with an update:
https://ggus.eu/?mode=ticket_info&ticket_id=117683 - Glue 2 for Castor
https://ggus.eu/?mode=ticket_info&ticket_id=122364 - cvmfs support for solidexperiment.org (looks like it's nearly done).
[This is waiting for solid.]

And finally, so long JET:
https://ggus.eu/?mode=ticket_info&ticket_id=122198 (17/6)
EFDA-JET's decommissioning date is the 25th.

Kashif to update the GridPP Nagios documentation with ARGO documentation.

https://www.gridpp.ac.uk/wiki/GridPP_VO_Incubator
Discussing VOI-GEN-001: Is probably a non-issue; can be closed.
VOI-GEN-002: Can be closed (Tom W)

vo.DiRAC.ac.uk: Wait for end of summer holidays.
In fact for most VOs, that's the case.
GalDyn: No more updates expected (Tom). Archive.
LIGO: No recent updates from LIGO; Catalin is in contact. Use US glideinWMS to submit to UK resources?
LOFAR (done?)
LSST:
Marcus Ebert: (11:37 AM)
Sorry, my mic was not working. For LSST, there is only one person running jobs and there have been no new jobs recently. Also, he did not prefer the direct CLI to the Ganga interface that was developed. Will see how this continues once more test jobs for other workflows are needed.
LZ: (Elena) Running production, Lancaster enabled. Alex Richards at Imperial is working on the job submission interface.
DUNE: (Elena) If someone else could enable DUNE that would be helpful, so that DUNE gets a 'grid' presence.
Pravda: Matt Williams: (11:41 AM)
Microphone not working, but no update for Pravda. I'm not sure that they're actively doing analysis at the moment.

Snoplus: (Daniela): We did some test runs with condor and dirac. It works in principle, but it is an awful hack and we (Daniela & Simon) are reluctant to deploy this on our production server just yet.
Lukasz Kreczko: (11:45 AM)
All you need is APEL accounting for HTCondor-CE to make it work here.

https://www.gridpp.ac.uk/wiki/Operations_Team_Action_items
[I think I updated them all.]


From the sidebar:
Terry Froy: (11:47 AM)
QM will happily host a RIPE Anchor.

Jeremy Coles: (11:50 AM)
https://indico.cern.ch/event/556609/timetable/

 
