Operations team & Sites

Europe/London
EVO - GridPP Operations team meeting

Description

- This is the weekly GridPP ops & sites meeting

- The intention is to run the meeting in Vidyo: https://vidyoportal.cern.ch/flex.html?roomdirect.html&key=zXhsqAxVnaT6

-- The PIN is 1234. To join via phone see http://information-technology.web.cern.ch/services/fe/howto/users-join-vidyo-meeting-phone for dial in numbers.

-- The London (UK) service is on +44 (0)161 306 6802. Phone bridge ID 1001002

-- The meeting extension is 109308582. PIN 1234

Chair:  Jeremy C

Minutes: Brian D

Apologies:

In attendance:

Alessandra Forti
Brian Davies
David Crooks
Elena Korolkova
Linda Cornwall
Matt Doidge
Andrew McNab
Darren Moore

Jeremy Coles
John Hill
Rob Currie
Teng Li
Robert Frank
Daniela Bauer
Winnie Page
Antonio Perez
Kashif
Sam Skipsey
Gordon Stewart
Ian Loader
Steve Jones
Duncan Rand

 

 

###################################################################
Experiment problems/issues
Review of weekly issues by experiment/VO
    LHCb
AM: No UK issues.
    CMS
    T1: https://cms-site-readiness.web.cern.ch/cms-site-readiness/SiteReadiness/HTML/SiteReadinessReport.html#T1_UK_RAL
    T2: https://cms-site-readiness.web.cern.ch/cms-site-readiness/SiteReadiness/HTML/SiteReadinessReport.html#T2_UK_London_Brunel
DB:
Bristol has a SAM availability issue. It had a network issue, which is now solved.
xrootd at Brunel and Imperial has issues, thought to be an IPv6 issue at CERN.
https://ggus.eu/index.php?mode=ticket_info&ticket_id=136806
https://ggus.eu/?mode=ticket_info&ticket_id=137035

    ATLAS
EK:
Tickets:
Lancaster ticket: suggest to close.
Space reporting is broken at Manchester.
QMUL: deletion errors. Site is in downtime.
Birmingham: deletion errors.
Some sites have been switched to Harvester, with one queue per site.
QMUL is in downtime.
Brunel and IC have a low number of jobs due to being CMS sites.

No small VO update.

GridPP DIRAC status:
    -- https://www.gridpp.ac.uk/gridpp-dirac-sam
There was a problem with the monitor, so the tests have not run recently.

###################################################################
Meetings & updates
With reference to: http://www.gridpp.ac.uk/wiki/Operations_Bulletin_Latest

    General updates
There is a call for ISGC2019 abstracts.
perfSONAR: the recommendation is to upgrade to 4.1.2 on CentOS 7.

    WLCG ops coordination
No recent update.

    Tier-1 status
LFC issues last week. DB issue thought to be resolved. Restoring misplaced records.
Working on other GGUS tickets.
    Storage and data management
Forward look document is progressing; should be finished by end.
What to do at Birmingham has come out of the PMB.

    Tier-2 Evolution
AM: WLCG infosys TF last week. ARC JSON file presented. Work on VMs.
AF: RAL installed a version; AD is not happy with this version.
AF: Manchester and Glasgow are willing to look at producing CE JSON. AF sent the format around.
AM: Need to join up UK and TF work.
Need to test that the format works with CRIC, AGIS and DIRAC.
https://indico.cern.ch/event/757405/
JSON schema proposal:
https://docs.google.com/document/d/1pg_5Kibc_-Z4JF4_HJyW5xL6GVYKwXxOU7DXf2QP9Ag/edit
Still needs work. Assumes "CEs" only have one queue.
CB: Copying problem; double counting queue/resource again. Interested in testing.

    Accounting
JC: HEPSPEC06 page:
https://www.gridpp.ac.uk/wiki/HEPSPEC06
Remember to update the date as well.

    Documentation
NTR from last week.
SJ: VOMS: regarding the /etc/vomses file, the LSST entries have changed. Please can sites update?
AF: LSST is moving from FNAL to SLAC. AF will send an email with the new endpoints when settled.
Need to use an rpm rather than using the VO ID card to update.
How to get the LSST VO to update its VO ID card? Need to follow up.

    Security
New IGTF v1.93 now available

    Services
perfSONAR 4.2 needs CentOS 7, as there is no support for version 6. Mesh URL has changed.
http://opensciencegrid.org/networking/
Version 4.1 has problems.
GDPR and the VOMS web interface: JC has an action.

    Tickets
50 Open UK tickets this week.

VOMS 137342 (23/9)
T2K noticed a VOMS outage last night, which Robert fixed first thing. Just checking if things are back working for them now (and that little question of why they couldn't get proxies from the other two UK servers). Waiting for reply (24/9)
OXFORD/LHCB
136687 (13/8)
This ticket was originally created to track a side issue involving 3rd party transfers, but I've seen no chatter on it since the 17th of August (and that was me). Is it still relevant? In progress (17/8)
TIER 1 LFC issues fixed.
Just to point out some good progress: it looks like the LFC has been fixed. Darren has kept on top of prodding the tickets too.
TIER 1 FTS ticket
136199 (18/7)
This LHCB FTS ticket hasn't had any input in it since August, is the issue still an issue? In progress (7/8)
124876 (17/11/16)
Any news on this old ROD/ECHO/gridftp test ticket, the most ancient of tickets? It actually looks like the error message has changed, which is something. In progress (23/7)
Good progress on the Cambridge IPv6 ticket. QMUL have a lot due to StoRM.

   

Ops meeting Monday
Minutes here: https://twiki.cern.ch/twiki/bin/view/LCG/WLCGOpsMeetingWeek180924
Of note:
globus-gssapi-gsi issue
    On Sep 21 globus-gssapi-gsi in EPEL was updated to version 13.10
https://twiki.cern.ch/twiki/bin/view/LCG/BestPracticesForSchedDT

Which versions of DPM are affected?

 

###################################################################
Discussion topics
    IPv6 - status across all sites to be updated.
https://www.gridpp.ac.uk/wiki/IPv6_site_status
Sites that have not updated recently:
T1: DM will update.
Brunel: Seems OK.
IC: No perfSONAR host.
QMUL:...
UCL: Not an issue.
Durham: Please update the last reviewed date.
RALPP: CD has a meeting on Thursday to review.

    The GridPP6 forward look document - final draft targeted for the end of September.
JC to go through another iteration.
###################################################################

Actions & AOB
No issue with VidyoConnect

###################################################################

Chat window

 

Daniela
https://ggus.eu/?mode=ticket_info&ticket_id=137352
This is Brunel: https://ggus.eu/index.php?mode=ticket_info&ticket_id=136806
DB
Matt
I will do thanks Elena.
MD
Daniela
And this is the Imperial CMS ticket: https://ggus.eu/?mode=ticket_info&ticket_id=137035
Ignore the first ticket, that was a dead raid card, but we managed to recover the files, so nothing to see here.
DB
Andrew
no uk issues from the LHCb ops meeting
AM
Chris
Could someone stick the URL in the chat?
CB
Daniela
Can we please have this as something we can link to somewhere and not google docs etc. Much obliged...
DB
Alessandra
https://docs.google.com/document/d/1pg_5Kibc_-Z4JF4_HJyW5xL6GVYKwXxOU7DXf2QP9Ag/edit
AF
Andrew
This is the agenda for the WLCG task force meeting. The "minutes" attached to the first item are the JSON discussed

https://indico.cern.ch/event/757405/
AM
Jeremy
https://www.gridpp.ac.uk/wiki/HEPSPEC06
JC
Daniela
Can you not sell it to the US as "then you have to do less work"?
DB
Robert
It's not t2k it's snoplus
RF
Matt
Sorry, get my neutrino experiments confused.
MD
Jeremy
https://twiki.cern.ch/twiki/bin/view/LCG/WLCGOpsMeetingWeek180924
https://twiki.cern.ch/twiki/bin/view/LCG/BestPracticesForSchedDT
https://www.gridpp.ac.uk/wiki/IPv6_site_status
JC
Today at 11:52 AM

 

    • 11:00 - 11:01
      Ops meeting minutes 1m
      • This is a reminder that this is an important task. The minute taker gives access to the discussions for those not present and provides a reference for others to refer back to afterwards.

      • The team composition has been changing. If everybody contributes then the task comes around less often.

      • Please extract actions from the meeting and add them to our table here: https://www.gridpp.ac.uk/wiki/Operations_Team_Action_items#Action_list.

      • Recent allocations: See above link. The page should be updated each week by the minute taker (if they don't the task will keep coming to them!).

      • Upcoming allocations:

    • 11:01 - 11:20
      Experiment problems/issues 19m

      Review of weekly issues by experiment/VO

      • LHCb

      • CMS
        T1: https://cms-site-readiness.web.cern.ch/cms-site-readiness/SiteReadiness/HTML/SiteReadinessReport.html#T1_UK_RAL
        T2: https://cms-site-readiness.web.cern.ch/cms-site-readiness/SiteReadiness/HTML/SiteReadinessReport.html#T2_UK_London_Brunel

      Please see attached notes.

      • ATLAS

      • Other: Updates should be recorded in https://www.gridpp.ac.uk/wiki/GridPP_VO_Incubator.

      • GridPP DIRAC status [Andrew McNab]
        -- https://www.gridpp.ac.uk/gridpp-dirac-sam

    • 11:20 - 11:40
      Meetings & updates 20m

      With reference to: http://www.gridpp.ac.uk/wiki/Operations_Bulletin_Latest

      • General updates
      • WLCG ops coordination
      • Tier-1 status
      • Storage and data management
      • Tier-2 Evolution
      • Accounting
      • Documentation
      • Interoperation
      • Monitoring
      • On-duty
      • Security
      • Services
      • Tickets
      • Tools
      • VOs
      • Site updates
    • 11:40 - 12:20
      Discussion topics 40m
      • IPv6 - status across all sites to be updated.

      • The GridPP6 forward look document - final draft targeted for the end of September.

    • 12:20 - 12:25
      Actions & AOB 5m
      • Move to VidyoConnect: https://home.cern/cern-people/announcements/2018/07/video-conference-vidyoconnect-replace-current-clients