ATLAS UK Cloud Support
Zoom
https://cern.zoom.us/j/98434450232
Password protected (same password as the new Ops meeting)
● Outstanding tickets
-
155460 USER atlas UKI-SOUTHGRID-CAM-HEP less urgent NGI_UK assigned 2022-01-05 11:26:00 Failovers from Cambridge to CERN backup proxy EGI
- ‘Rogue’ user jobs; admins are contacting users
-
155430 TEAM atlas UKI-SCOTGRID-ECDF less urgent NGI_UK in progress 2022-01-05 22:18:00 UKI-SCOTGRID-ECDF transfer and deletion errors EGI
- Major issues with one of the storage nodes, awaiting site prognosis
-
155410 TEAM atlas RAL-LCG2 less urgent NGI_UK in progress 2022-01-04 07:07:00 RAL-LCG2 jobs failed due to transfer timeout EGI
- Large number of FTS transfers queued; transfers are timing out because of the time spent in the queue, not the transfer time itself, leading to compute job failures.
- Several mitigating steps taken: reduced the number of running jobs; new WebDAV endpoint, dedicated to ATLAS and running on two machines.
- Backlog largely cleared; work ongoing on davs improvements
- 4-5 GB/s davs reads are sustainable
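A minimal sketch of the distinction noted above between time in queue and transfer time, assuming a single overall deadline measured from submission. The Transfer record, its field names and the classify function are hypothetical illustrations and are not part of FTS.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Transfer:
    """Hypothetical record of one transfer; times are seconds from a common reference."""
    submitted: float
    started: Optional[float]   # None if the transfer never left the queue
    finished: Optional[float]  # None if the transfer never completed


def classify(t: Transfer, deadline_s: float) -> str:
    """Say whether a transfer met its deadline and, if not, where the time went."""
    # Deadline used up while still waiting in the queue.
    if t.started is None or t.started - t.submitted >= deadline_s:
        return "queue_timeout"
    # Left the queue in time, but the copy itself overran the deadline.
    if t.finished is None or t.finished - t.submitted > deadline_s:
        return "transfer_timeout"
    return "ok"


# Assumed example values: a 1 hour deadline and three transfers.
examples = [
    Transfer(submitted=0, started=4000, finished=4100),  # queued too long
    Transfer(submitted=0, started=100, finished=5000),   # slow copy
    Transfer(submitted=0, started=100, finished=400),    # fine
]
results = [classify(t, deadline_s=3600) for t in examples]
```

In the first example the copy takes only 100 s, yet the job would still fail because the deadline expired in the queue, which is the failure mode described for this ticket.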
-
155141 TEAM atlas UKI-LT2-Brunel less urgent NGI_UK in progress 2021-12-24 08:39:00 Transfers from UKI-LT2-Brunel fail with “Internal Server Error” EGI
- Aim to resolve tickets shortly. HC test files should have been returned to the site (to be checked); also need to check for any outstanding issues
-
154806 TEAM atlas UKI-LT2-QMUL less urgent NGI_UK in progress 2021-12-25 04:28:00 UKI-LT2-QMUL SOURCE transfer failures: [13] Result (Neon): SSL handshake failed EGI
- Needs resolving (somehow)
-
154543 TEAM atlas UKI-SCOTGRID-ECDF urgent NGI_UK in progress 2021-12-08 12:35:00 DPM storage ACL configuration EGI
- To prod the site on how to resolve this
-
154436 TEAM atlas RAL-LCG2 very urgent NGI_UK on hold 2021-12-08 13:25:00 RAL Echo Davs developments EGI
- Work under ticket 155410 should help this development work
-
153367 TEAM atlas RAL-LCG2 urgent NGI_UK on hold 2021-12-01 15:37:00 HTTPS on RAL CTA EGI
- To be continued …
-
● CPU
-
RAL
- Poor performance, due to the issues in ticket 155410
-
Northgrid
- Lancs: power issues on Christmas Eve; resolved on or by Christmas Day
-
London
- Largely ok; some QMUL blips
-
SouthGrid
- OK
-
Scotgrid
- Some issues for Durham; Glasgow CPU efficiency is low
-
● Ongoing Items
-
TPC with http
- Active work on davs deployment now restarting
-
Storageless Site test (Oxford)
- Some new Xcache hardware may be available for loan
- Investigating why there is so little Xcache traffic
- Site will upgrade to 5.4.0
-
LANCS Storage migration
- Aliases exist; JW to configure CRIC side
● News round-table
-
Gerard
- NTR
-
Matt
- NTR
-
Patrick
- NTR
-
Peter
- NTR
-
Stephen
- NTR
-
Vip
- NTR