● Outstanding tickets
- 150820 UKI-LT2-RHUL less urgent in progress 2021-03-03 15:18:00 UKI-LT2-RHUL: 0% Transfer and deletion efficiencies
- Power incident at the centre; possible problems bringing some hardware back up.
- 150775 UKI-SCOTGRID-GLASGOW less urgent in progress 2021-03-04 04:45:00 UKI-SCOTGRID-GLASGOW transfer and deletion errors
- External link going down to 1 Gb
- Many routes of campus; might be cause of issue
- Associated with large numbers of small files being transferred in via FTS
- Restart of gridFTP services ‘cures’ problem
- 149362 UKI-SOUTHGRID-RALPP urgent in progress 2021-02-18 20:00:00 ATLAS CE failures on UKI-SOUTHGRID-RALPP-heplnx207
- 146651 RAL-LCG2 urgent on hold 2021-02-16 17:37:00 singularity and user NS setup at RAL
- 142329 UKI-SOUTHGRID-SUSX top priority on hold 2021-01-20 20:29:00 CentOS7 migration UKI-SOUTHGRID-SUSX
- Issue with arex, due to change of hypervisor
- Networking should be ok now.
- Potential issues for
- A few (3/4) different hardware sets; Monitor each set as provisioned.
- Patrick to update ticket once the network is confirmed
● CPU
- General problem with non-aCT sites, due to Harvester issue on Sunday evening, causing worldwide job reductions
- Weds. another dip in central production; presumed due to Condor update (for ARC / GDPR fixes)
- RAL
- Job reductions due to:
- Capped to 100% to enable more jobs for LHCb
- ATLAS demand for more SCORE user jobs
- Northgrid
- Lancs: Fairshare (stable to O(3k)), may want to try and bounce other users.
- London
- SouthGrid
- Scotgrid
- BHAM also not running for LHCb; no update.
- CAM to go into long downtime; long-term future to be discussed.
● Ongoing Items
- CentOS7 - Sussex
- Making good progress now (see above)
- TPC with http
- Moving to WebDav for WAN / LAN / TPC.
- LHCb enabled everywhere, except for RAL, GLA, QMUL(?)
- CMS: Similar RAL, GLA(~ no local storage), QMUL (use IC)
- DPM / xrootd; different libCurl versions
- Push for xrootD sites now
- Discussion on tokens occurred
- Users still tend to complain that they do not like the grid.
- GLA DirectIO possibility if WAN xrootd enabled. Needs a cache in place.
- LAN - just to test that LAN works
- WAN to be main focus
- QMUL - prefer to wait for new hardware (should already work, but server not quite powerful enough).
- gridFTP, one main server that redirects for the actual transfers
- WebDav, might need some configuration to do the same as gridFTP. One powerful node should however work fine.
- Storageless Site test / storage decommissioning (Oxford)
- Sam to get updated config for Rules refinement to Vip today.
- ECDF ES xrootd cache monitoring up, but not seeing Xcache transfers.
- ECDF volatile storage
- Rob reconfigured the site
- JW to make the necessary ATLAS updates
- Glasgow DPM Decommissioning
- LOCALGROUPDISK done; DATADISK residual data remains
- ATLAS: Site Availability/Reliability reports: Glasgow
● News round-table
- Vip
- Dan
- Matt
- Disk server rebooted but has not come back up
- A couple of TB of data
- 11k files might be declared lost
- Peter
- (Needed to leave before end)
- Alessandra
- Sam
- Gareth
- Will be leaving GridPP in April
- JW
- Patrick
- Rob