Operations team & Sites

Europe/London
EVO - GridPP Operations team meeting

Description

- This is the weekly GridPP ops & sites meeting

- The intention is to run the meeting in VidyoConnect: https://vidyoportal.cern.ch/flex.html?roomdirect.html&key=zXhsqAxVnaT6

-- The PIN is 1234. To join via phone see http://information-technology.web.cern.ch/services/fe/howto/users-join-vidyo-meeting-phone.

-- The London (UK) service is on +442030510622.

-- The meeting extension is 109308582. PIN 1234

Chair:  Matt

Minutes:

Apologies:

Videoconference Rooms
Name: GridPP-Operations
Description:
- This is the weekly GridPP ops & sites meeting
- The intention is to run the meeting in EVO: http://evo.caltech.edu/evoGate/. Join the meeting in the Janet(UK) Community area. Direct link http://evo.caltech.edu/evoNext/koala.jnlp?meeting=MDMaM82v2nD2Du999sD99D
- The phone bridge number is +44 (0)161 306 6802. The phone bridge ID is 1001002 with code: 4880.
Extension: 109308582
Owner: Alessandra Forti

Minutes 30/4/2019
=================

Present:
========
* Alessandra Forti
* Andrew McNab
* Brian Davies
* Chris Brew
* Daniel Traynor
* Daniela Bauer
* Darren Moore
* David Crooks
* Elena Korolkova
* Emanuele Simili
* Gareth Roy
* Gordon Stewart
* Ian Loader
* John Hill
* Kashif Mohammad
* Linda Cornwall
* Matt Doidge
* Paige Lacesso
* Pete Clarke
* Pete Gronbech
* Raja Nandakumar
* Robert Frank
* Sam Skipsey
* Steve Jones
* Teng

LHCb Report
===========
* Problems with aborted pilots at ECDF.
* Another storage pool has moved from CASTOR to ECHO at RAL; this leaves only the USER space to move.

CMS
===
* Brunel has data transfer problems; nothing else to report for CMS.

ATLAS
=====
* Brunel is having problems with file deletions for ATLAS. BD says this is an issue with WebDAV being unstable.
* RHUL has an issue with Squid servers being shown as red even when one of the HA servers is available.


VO Incubator
============
* Discussion of DUNE queues at Lancaster: jobs are being submitted to the old SL6 queue. MD to email AMcN to have the queues moved over.
* LFC to DFC migration for T2K and MICE is ongoing; progress is being made by DB and SF.


Meetings and Updates
====================


General Updates
---------------
* Discussion of GridPP42 and associated presentations.
* Consultation on potential dates and locations for GridPP43 (nominally at Ambleside, 20th-22nd August).
    - Concerns were raised about the timing being close to the end of the summer holidays.
    - PC encouraged emails to DB to raise any questions/concerns.

* Technical meeting regarding DPM and future within the UK.
    - AF comments that many sites are planning to move away from DPM (whether by becoming smaller sites or by moving to different storage solutions).
    - KM comments that Oxford's plan is to update to DOME.

* HEPSYSMAN
    - Registration now appears to be open for HEPSYSMAN.
    - DC asks anyone with security topics they would like covered as part of the training to let him know.

* TB-SUPPORT discussion about SW areas and whether or not they are needed. It appears that all VOs are now using either CVMFS or containers, so it is unlikely that a SW area is needed.


Tier-1 Status
-------------
* High CMS failure rates were seen, but this appears to have improved.


Storage
-------
* no report


Tier-2 Evolution
----------------
* no report


Accounting
==========
* no report


Documentation
=============
* SJ will check Fermilab VO information


Interop
=======
* no report


Monitoring
==========
* no report


On Duty
=======
* DB on duty, nothing to report


Security
========
* Nothing ongoing that sites need to be concerned about.
* Docker Hub breach: relevant to sites that may have been using Docker Hub for any reason.
* Trust anchors have been updated; sites should install them as soon as they are able.


Services
========
* no report


Tickets
=======
* 131608 - needs an update, as it is being escalated to the VO Manager
* 139101 - no news
* 140679 - ongoing


Discussion & AOB
================
* Site Round Table

Manchester - going into downtime to upgrade DPM headnode
RALPP      - nothing to report
QMUL       - nothing to report
IC         - preparing for the move to Slough and IRIS cloud
Sheffield  - nothing to report
Glasgow    - CEPH and HTCondor-CE
Cambridge  - nothing to report
Oxford     - SL6 CE now in downtime to be retired (CentOS7 only)
Lancaster  - Robin leaving Lancaster
Bristol    - Condor and OS upgrades
Liverpool  - attending EGI to talk about CREAM-CE migration
Edinburgh  - working on a second CE as the current one is having problems


* HEPSYSMAN - registration is open; please register if you'd like to attend.

* GridPP6 - PC updates on current status, panel to take place next week.
* PC congratulates GridPP on its ability to aid LSST in getting its payloads running via GridPP's infrastructure.
* Indigo IAM vs EGI Check-In as an SSO solution: both AF and DC felt that IAM was a better solution than the EGI Check-In system.

 

Chat
====

Matt Doidge: (30/04/2019 11:03)

Gareth is kindly taking minutes

Elena Korolkova: (11:06 AM)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=140848

https://ggus.eu/index.php?mode=ticket_info&ticket_id=140890

Matt Doidge: (11:12 AM)

https://www.gridpp.ac.uk/wiki/Operations_Bulletin_Latest

David Crooks: (11:31 AM)

https://success.docker.com/article/docker-hub-user-notification

Raja: (11:33 AM)

Apologies - got to go now

Paige Winslowe Lacesso: (11:33 AM)

apologies back soon

John Hill: (11:43 AM)

https://wiki.egi.eu/wiki/PROC12

David Crooks: (11:43 AM)

I was just about to post that as well :)

Alessandra: (11:51 AM)
>14k

 

    • 11:00 11:01
      Ops meeting minutes 1m
      • This is a reminder that this is an important task. The minute taker gives access to the discussions for those not present and provides a reference for others to refer back to afterwards.

      • The team composition has been changing. If everybody contributes then the task comes around less often.

      • Please extract actions from the meeting and add them to our table here: https://www.gridpp.ac.uk/wiki/Operations_Team_Action_items#Action_list.

      • Recent allocations: See above link. The page should be updated each week by the minute taker (if they don't the task will keep coming to them!).

      • Upcoming allocations:

    • 11:01 11:20
      Experiment problems/issues 19m

      Review of weekly issues by experiment/VO

      • LHCb

      • CMS
        T1: https://cms-site-readiness.web.cern.ch/cms-site-readiness/SiteReadiness/HTML/SiteReadinessReport.html#T1_UK_RAL
        T2: https://cms-site-readiness.web.cern.ch/cms-site-readiness/SiteReadiness/HTML/SiteReadinessReport.html#T2_UK_London_Brunel

      • ATLAS

      • Other: Updates should be recorded in https://www.gridpp.ac.uk/wiki/GridPP_VO_Incubator.

      Also from Daniela:
      *T2K (LFC to DFC): We really really need QMUL which is a major T2K site to deal with their storage. Details in:
      https://ggus.eu/?mode=ticket_info&ticket_id=138364
      We haven't quite got round to the three small sites (LIV, OX, SHEF) still missing (because I spend all my time setting up an IRIS cloud), but we haven't forgotten.

      *MICE (LFC to DFC): This is going much better (fewer sites, less data).

      *LZ changed one of their VOMS servers. The Operations Portal has now been updated. If you support LZ, please check if:
      [root@gfe02 ~]# cat /etc/grid-security/vomsdir/lz/voms.hep.wisc.edu.lsc
      /DC=org/DC=incommon/C=US/ST=WI/L=Madison/O=University of Wisconsin-Madison/OU=OCIS/CN=voms.hep.wisc.edu
      /C=US/O=Internet2/OU=InCommon/CN=InCommon IGTF Server CA
      is up to date.
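
The LZ check above can also be scripted. The following is a minimal sketch, assuming the two DN lines quoted above (subject DN first, then issuer DN) are the complete expected contents of the file. The function names `lsc_lines` and `lsc_is_up_to_date` are hypothetical helpers, not part of any grid middleware.

```python
from pathlib import Path

# Expected contents, copied from the note above (assumed to be complete).
EXPECTED_LSC = [
    "/DC=org/DC=incommon/C=US/ST=WI/L=Madison/O=University of Wisconsin-Madison/OU=OCIS/CN=voms.hep.wisc.edu",
    "/C=US/O=Internet2/OU=InCommon/CN=InCommon IGTF Server CA",
]

# Path as shown in the note above.
LSC_PATH = Path("/etc/grid-security/vomsdir/lz/voms.hep.wisc.edu.lsc")


def lsc_lines(text):
    """Return the non-empty lines of an .lsc file, stripped of whitespace."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def lsc_is_up_to_date(text, expected=EXPECTED_LSC):
    """True if the text holds exactly the expected DNs, in the same order."""
    return lsc_lines(text) == list(expected)


if __name__ == "__main__":
    if not LSC_PATH.exists():
        print(f"{LSC_PATH} not found")
    elif lsc_is_up_to_date(LSC_PATH.read_text()):
        print("LZ .lsc file is up to date")
    else:
        print("LZ .lsc file differs from the expected contents")
```

Run on a node that supports LZ, this only reports whether the file matches; it does not modify anything under /etc/grid-security.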

      • GridPP DIRAC status [Andrew McNab]
        -- https://www.gridpp.ac.uk/gridpp-dirac-sam
    • 11:20 11:40
      Meetings & updates 20m

      With reference to: http://www.gridpp.ac.uk/wiki/Operations_Bulletin_Latest

      • General updates
      • WLCG ops coordination
      • Tier-1 status
      • Storage and data management
      • Tier-2 Evolution
      • Accounting
      • Documentation
      • Interoperation
      • Monitoring
      • On-duty
      • Security
      • Services
      • Tickets
      • Tools
      • VOs
      • Site updates
    • 11:40 12:20
      Discussion topics 40m
      • February GDB: https://indico.cern.ch/event/739875/
      • Site roundtable.
    • 12:20 12:25
      Actions & AOB 5m