ATLAS UK Cloud Support
Zoom
https://cern.zoom.us/j/98434450232
Password protected (same as (new) OPs Mtg)
● Outstanding tickets
-
155473 TEAM atlas RAL-LCG2 less urgent NGI_UK in progress 2022-01-11 09:20:00 BU_ATLAS_Tier2 transfer and deletion errors EGI
- IPv4 connectivity issues on the new WebDAV-aliased hosts
-
155460 USER atlas UKI-SOUTHGRID-CAM-HEP less urgent NGI_UK in progress 2022-01-12 15:51:00 Failovers from Cambridge to CERN backup proxy EGI
- Active discussions from site admins
-
155430 TEAM atlas UKI-SCOTGRID-ECDF less urgent NGI_UK in progress 2022-01-12 16:37:00 UKI-SCOTGRID-ECDF transfer and deletion errors EGI
- Data at risk due to problems over Christmas
-
155141 TEAM atlas UKI-LT2-Brunel less urgent NGI_UK in progress 2021-12-24 08:39:00 Transfers from UKI-LT2-Brunel fail with “Internal Server Error” EGI
- JW to progress to a solution
-
154806 TEAM atlas UKI-LT2-QMUL less urgent NGI_UK in progress 2021-12-25 04:28:00 UKI-LT2-QMUL SOURCE transfer failures: [13] Result (Neon): SSL handshake failed EGI
- Server fell over on Christmas day
- Moving to add more ‘oomph’; it’s not the highest-priority item, however
-
154543 TEAM atlas UKI-SCOTGRID-ECDF urgent NGI_UK in progress 2021-12-08 12:35:00 DPM storage ACL configuration EGI
- other urgent issues are delaying this
-
154436 TEAM atlas RAL-LCG2 very urgent NGI_UK on hold 2021-12-08 13:25:00 RAL Echo Davs developments EGI
- New webdavs endpoint with new gateways created. Available for more aggressive optimisation tuning and improvements
-
153367 TEAM atlas RAL-LCG2 urgent NGI_UK on hold 2021-12-01 15:37:00 HTTPS on RAL CTA EGI
- Needs to be tested
-
● CPU
-
-
RAL
- Remains low; partly due to job scheduling when there is a large number of FTS file transfers in the queue.
- Also due to contention from other VOs
-
Northgrid
- Largely ok
-
London
- Some brief issues with QMUL
-
SouthGrid
- BHAM: outage of a few days (?), but back now.
- Sussex; running well, but could be running more slots at the site
-
Scotgrid
- Durham - cooling; power issue over Christmas. SRR not readable; leading to overfilling of the storage.
- Once SRR accessible, jobs started running and data reduced to below the total.
- Gla; disk controller appears to have died; expected to be back online shortly.
-
-
● Ongoing Items
-
TPC with http
- Davs optimisation at RAL to take priority, with a new WebDAV alias available
-
Storageless Site test (Oxford)
- Seeing TLS errors on the Xcache via xrootd; cache is passing through the data
-
LANCS Storage migration
- JW to ensure endpoint is configured in CRIC
- Site awaiting one last switch change to begin real testing
● News round-table
-
Alessandra
- NTR
-
Dan
- Storage for ATLAS by end of month
- Refurbishment remains some way off
-
Gerard
- NTR
-
Matt
- NTR
-
Patrick
- NTR; Attempting to work out how to get the full number of slots to run at the site.
-
Peter
- NTR
-
Sam
- GLA now restarted.
-
Stephen
- NTR
-
Vip
- To arrange a discussion to track down Xcache problems