● Outstanding tickets
- 150308 UKI-SCOTGRID-DURHAM less urgent in progress 2021-02-25 09:32:00 Jobs at UKI-SCOTGRID-DURHAM_SL7_UCORE fail with “Server error: no such file or directory”
- Additional files missing and declared lost. Ticket now closed
- 149842 UKI-SCOTGRID-ECDF very urgent waiting for reply 2021-02-22 11:55:00 UKI-SCOTGRID-ECDF: Low transfer efficiency due to TRANSFER ERROR: Copy failed with mode 3rd pull, wi…
- Macaroons issue: a patch to the Apache config had failed to apply.
- Since this was resolved, the situation looks much improved. Monitor for a while; the site can then hopefully close the ticket.
- 149362 UKI-SOUTHGRID-RALPP urgent in progress 2021-02-18 20:00:00 ATLAS CE failures on UKI-SOUTHGRID-RALPP-heplnx207
- Remains stalled. Some discussion in the round-table below; needs detailed site diagnosis to understand more.
- 146651 RAL-LCG2 urgent on hold 2021-02-16 17:37:00 singularity and user NS setup at RAL
- Remains on hold while pre-steps are completed.
- 142329 UKI-SOUTHGRID-SUSX top priority on hold 2021-01-20 20:29:00 CentOS7 migration UKI-SOUTHGRID-SUSX
- 2 network switches faulty; to be replaced shortly. Aim to get remaining nodes updated after that.
● CPU
- RAL
- High numbers of jobs for RAL, as LHCb running low
- Northgrid
- Sheffield offline for Downtime, but slowly coming back.
- Lancs: best two weeks for some time.
- London
- QMUL: residual data transfer issues for a few sites.
- SouthGrid
- BHAM and CAM remain offline - VAC problems?
- Scotgrid
- Durham: Argus problems resulted in an outage.
● Ongoing Items
- CentOS7 - Sussex
- TPC with http
- Alessandra to aim to move latest set of UK T2 to http today.
- Storageless Site test / storage decommissioning (Oxford)
- Aim to complete / test Xcache today. If successful move towards ATLAS configuration and testing.
- ECDF volatile storage
- Jira updated; requires the site to reconfigure the new DPM to have an atlasvolatiledisk, rather than the atlasdatadisk as currently envisaged.
- Glasgow DPM Decommissioning
- Awaiting feedback from DDM ops
- ATLAS: Site Availability/Reliability reports: Glasgow
- Alessandra hopes to update ticket if it can be progressed.
● News round-table
- Vip
- Noted that GocDB search brings up the pre-prod instance (which is out-of-date, and has no warning that it is pre-production).
- Dan
- Matt
- Last 2 weeks very good.
- Some updates to storage, e.g. setting old servers read-only.
- DPM has settled down, with no specific servers overloaded.
- Running largely Full simulation at the moment.
- Worries with new servers; e.g. overloading. How to preload?
- Peter
- Discussion on how to proceed with RALPP issue (above):
- Possibly RTE / Puppet interactions;
- Gareth suggests that submission must have made it to the batch farm;
- Alessandra
- to update next tranche of UK sites to http
- Sam
- To present Xcache activity to PMB at an upcoming meeting.
- Gareth
- JW
- Duncan
- Patrick
- Rob
- transparent Xcache working at ECDF and reducing numbers of connections