Operations team & Sites

Europe/London
EVO - GridPP Operations team meeting

Description

- This is the weekly GridPP ops & sites meeting

- The intention is to run the meeting in VidyoConnect: https://vidyoportal.cern.ch/flex.html?roomdirect.html&key=zXhsqAxVnaT6

-- The PIN is 1234. To join via phone see http://information-technology.web.cern.ch/services/fe/howto/users-join-vidyo-meeting-phone.

-- The London (UK) service is on +442030510622.

-- The meeting extension is 109308582. PIN 1234

Chair:  Matt

Minutes:

Apologies:

GridPP Operations Team Meeting – 26th March 2019


 

Chair: Matt Doidge

Minutes: Vipul Davda

Present: Andrew McNab, Brian Davies, Dan T, Darren M, David C, Alastair D, Elena, Emanuele, Gareth R, Gordon S, Ian L, Kashif, Me, Winnie, Pete Clarke, Raja, Rob C, Robert F, Sam S, Steve Jones, Teng and Vip Davda.

 

Apologies: Daniela, Alessandra

 

Experiment Problems/Issues


 

LHCB - (Raja)

 

  • Still fixing a few small bits from the jumbo DIRAC update of 10 days ago

  • Transfer problem between QMUL and CNAF (Italian Tier-1) believed solved now (GGUS:140190)

  • ARC CEs losing track of pilots at ECDF believed solved now (GGUS:140396)

  • Ongoing issue: pilots having thread issues at Glasgow (GGUS:140151)

  • Ongoing issue: LHCb migration to ECHO. Finding and fixing minor bugs while testing with the latest DIRAC.

 

CMS (Daniela Bauer) – Via Email

  • CMS: Brunel still has problems, Raul is working on it; other sites are fine.

  • The Imperial PhEDEx had a slight hiccup last night because the disk was full; that is now fixed. Daniela submitted two tickets about file transfer issues at RAL, which are being worked on. They affect only a tiny amount of data, so the impact on the average user should be zero.

  • All other CMS sites are ok.

 

ATLAS (Elena Korolkova):

 

  • Sussex has requested to disable its analysis queue

  • RAL has requested to disable its SL6 queue

  • There was discussion of diskless sites

 

Other VOs (Daniela Bauer) – Via email

 

  • T2K (LFC to DFC): There is a storage issue at QMUL, which is a major T2K site; their storage needs to be fixed. Details in:

https://ggus.eu/?mode=ticket_info&ticket_id=138364

 

  • The three small sites (LIV, OX, SHEF) are still missing and will be worked on

 

  • MICE (LFC to DFC): This is going much better (fewer sites, less data).

 

  • LZ changed one of their VOMS servers. The Operations Portal has now been updated. If you support LZ, please check that the following file is up to date:

[root@gfe02 ~]# cat /etc/grid-security/vomsdir/lz/voms.hep.wisc.edu.lsc

/DC=org/DC=incommon/C=US/ST=WI/L=Madison/O=University of Wisconsin-Madison/OU=OCIS/CN=voms.hep.wisc.edu

/C=US/O=Internet2/OU=InCommon/CN=InCommon IGTF Server CA
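
For sites that would rather script the LZ .lsc check than inspect the file by eye, a minimal sketch in Python follows. It assumes the file should contain exactly the two DN lines quoted in these minutes and nothing else; the helper names parse_lsc and lsc_is_current are illustrative only and are not part of any grid tool.

```python
from pathlib import Path

# Expected contents, copied from the minutes: the subject DN of the VOMS
# server certificate, followed by the DN of the CA that issued it.
EXPECTED_LSC = [
    "/DC=org/DC=incommon/C=US/ST=WI/L=Madison/O=University of Wisconsin-Madison/OU=OCIS/CN=voms.hep.wisc.edu",
    "/C=US/O=Internet2/OU=InCommon/CN=InCommon IGTF Server CA",
]

# Path as shown in the minutes.
LSC_PATH = "/etc/grid-security/vomsdir/lz/voms.hep.wisc.edu.lsc"


def parse_lsc(text):
    """Return the non-empty lines of an .lsc file, stripped of whitespace."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def lsc_is_current(text, expected=EXPECTED_LSC):
    """Return True if the file content matches the expected DN chain exactly."""
    return parse_lsc(text) == list(expected)


if __name__ == "__main__":
    path = Path(LSC_PATH)
    if not path.exists():
        print(f"{path}: not found")
    elif lsc_is_current(path.read_text()):
        print(f"{path}: up to date")
    else:
        print(f"{path}: needs updating")
```

Run on a node that supports LZ, this reports whether the file already holds the InCommon DNs or still needs replacing.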

Meetings and Updates

Please refer to the bulletin at http://www.gridpp.ac.uk/wiki/Operations_Bulletin_Latest for more details

 

General Updates:

 

 

  • There was a discussion on security and how to improve communication.

 

 

  • SLATE:

 

 

Next Tech meeting – Steve Jones will present HTCondor CE

 

  • WLCG ops Coordination

  • Tier1 (Darren Moore): Patching the batch farm. Adding more CPUs and Storage for next year’s pledge

  • Storage and Data Management (Sam) – There was a discussion of XCache at Birmingham

  • Tier2 Evolution – no update

  • Accounting – Please update the benchmarking page

  • Documentation – Changes to VOs and major update to HTCondor CE by Steve Jones

  • Interoperation – no updates

  • Monitoring – no updates

  • On-duty – Kashif is on duty - nothing to report.

  • Roll Out: Batch system

  • Services – no update.

 

  • Security – David Crooks – There was a long discussion of the latest security challenge. The challenge started on Tuesday 12th March, but the email to the sites was not sent until the afternoon of Friday 15th March; this was not well received. David said that the delay was not intentional, but that sites had been expected to detect the challenge well before then.

    • All sites to complete the report by Friday.

 

  • Tickets – Matt Doidge: There are few open UK tickets; see Latest tickets for more details.

  • Tools – no update

  • VOs – no updates

  • AOB

GridPP42 meeting will be at RAL, please register - https://indico.cern.ch/event/780766/timetable/?view=standard

 

Group Chat

Matt

Vip is taking minutes - thanks Vip!

https://indico.cern.ch/event/803629/

(also David is recording - thanks David!)

Elena

https://ggus.eu/index.php?mode=ticket_info&ticket_id=140103

https://ggus.eu/index.php?mode=ticket_info&ticket_id=140350

https://ggus.eu/index.php?mode=ticket_info&ticket_id=139723

https://ggus.eu/index.php?mode=ticket_info&ticket_id=140134

Daniel

i keep on finding other things todo but do need to start the move

Matt

https://www.gridpp.ac.uk/wiki/Operations_Bulletin_Latest

Elena

/etc/grid-security/vomsdir/lz/voms.hep.wisc.edu.lsc

(currently:
/DC=org/DC=opensciencegrid/O=Open Science Grid/OU=Service/CN=voms.hep.wisc.edu
/DC=org/DC=cilogon/C=US/O=CILogon/CN=CILogon OSG CA 1)

with
/DC=org/DC=incommon/C=US/ST=WI/L=Madison/O=University of Wisconsin-Madison/OU=OCIS/CN=voms.hep.wisc.edu
/C=US/O=Internet2/OU=InCommon/CN=InCommon IGTF Server CA

Matt

https://indico.cern.ch/event/759388/

Dewhurst

The joint High Energy Physics Software Foundation, Open Science Grid and Worldwide Large Hadron Collider Computing Grid 2019 Workshop

Elena

etc/grid-security/vomses/lz
should now read: /etc/vomses/lz
"lz" "voms.hep.wisc.edu" "15001" "/DC=org/DC=incommon/C=US/ST=WI/L=Madison/O=University of Wisconsin-Madison/OU=OCIS/CN=voms.hep.wisc.edu" "lz" "24"
"lz" "lzvoms.grid.hep.ph.ic.ac.uk" "15001" "/C=UK/O=eScience/OU=Imperial/L=Physics/CN=lzvoms.grid.hep.ph.ic.ac.uk" "lz" "24"

David

The talk which Pete was referencing: https://indico.cern.ch/event/759388/sessions/295225/attachments/1813716/2963439/WLCGEvolutionJLAB.pdf

SLATE talk: https://indico.cern.ch/event/759388/contributions/3361774/attachments/1815564/2967154/Central_Ops_with_SLATE_and_PRP_3.pdf

https://indico.cern.ch/event/759388/sessions/295063/#20190321

(all that day's sessions)

Matt

https://indico.cern.ch/event/780766/ <-- GridPP42

Elena

https://indico.cern.ch/event/770307/contributions/3301647/attachments/1807906/2951426/Deploying_Services_with_SLATE_1.pdf


 
