ATLAS UK Cloud Support

Europe/London
Vidyo

Tim Adye (Science and Technology Facilities Council STFC (GB)), James William Walder (Science and Technology Facilities Council STFC (GB))

● Outstanding tickets

  • 148968 UKI-NORTHGRID-LANCS-HEP less urgent in progress 2020-10-14 19:49:00 UKI-NORTHGRID-LANCS-HEP: deletion and transfer failures
  • 148342 UKI-SCOTGRID-GLASGOW less urgent in progress 2020-10-09 11:53:00 UKI-SCOTGRID-GLASGOW with transfer efficiency degraded and many failures
    • "No route to host" transfer errors for DPM storage. To be investigated.
  • 146651 RAL-LCG2 urgent on hold 2020-08-10 10:59:00 singularity and user NS setup at RAL
    • On hold
  • 146374 UKI-NORTHGRID-SHEF-HEP urgent in progress 2020-09-11 13:35:00 ATLAS pilot jobs idle on UKI-NORTHGRID-SHEF-HEP CE
    • On hold
  • 144759 UKI-SCOTGRID-GLASGOW less urgent on hold 2020-08-10 09:54:00 High traffic from UKI-SCOTGRID-GLASGOW on RAL CVMFS Stratum1
    • On hold
  • 142329 UKI-SOUTHGRID-SUSX top priority on hold 2020-06-04 14:05:00 CentOS7 migration UKI-SOUTHGRID-SUSX
    • On hold

● CPU

  • RAL

    • Additional slots due to CMS issues
  • Northgrid

    • LANCS: Set offline due to disk issues
  • London

    • QMUL; transient issue.
  • SouthGrid

    • Some RALPP fluctuations
  • Scotgrid

    • GLA onlining more CPUs. New Dell nodes. Last two had cvmfs cache issues requiring a manual fix.
      • missing sub-dirs
      • OX noted that on some of their nodes cvmfs is getting full, which can result in blacklisting
    • Durham; DPM disk server poorly; DPM fills up logs. Should be resolved by now.

● Ongoing issues

  • CentOS7 - Sussex

    • NTR

  • Grand Unified queues

    • NTR


● News round-table

  • Vip
    • No downtime next week; however, a few WNs will be taken offline for work.
  • Dan
    • NTR
  • Matt
    • Will give “T2 operations in Covid” at GridPP45.
  • Peter
    • Noted generally poor audio; not observed by others.
    • If it continues next week, we will consider moving to Zoom (again).
  • Sam
    • cephc05 running fine as production machine. c02, for dev work, to be updated with forked xrootd-ceph shortly.
    • Next week's Storage meeting will be cancelled due to overlap with GridPP.
  • Tim
    • NTR
  • JW
    • Work on TPC-http with Ceph continues; new problem with stripe alignment.

 
