ATLAS UK Cloud Support
Europe/London
Zoom
Description
Meeting to be held via Zoom (https://ukri.zoom.us/j/97404730356)
Password protected (same as OPs Mtg)
Outstanding tickets
- 149362 UKI-SOUTHGRID-RALPP urgent in progress 2020-11-11 16:15:00 ATLAS CE failures on UKI-SOUTHGRID-RALPP-heplnx207
- Change to ACT has not helped; JW to look into switching back
- Various replies on ticket from arc devs.
- Problem persists
- 149349 UKI-SOUTHGRID-OX-HEP less urgent waiting for reply 2020-11-09 10:15:00 UKI-SOUTHGRID-OX-HEP Frontier Squid Status
- JW to close
- 148968 UKI-NORTHGRID-LANCS-HEP less urgent in progress 2020-11-11 14:48:00 UKI-NORTHGRID-LANCS-HEP: deletion and transfer failures
- Few failures on usual server. New disks now in.
- To update jira with namespace files
- JW to declare the latest files as lost
- 148342 UKI-SCOTGRID-GLASGOW less urgent in progress 2020-11-10 09:59:00 UKI-SCOTGRID-GLASGOW with transfer efficiency degraded and many failures
- JW to verify that the file can be transferred and close
- 146651 RAL-LCG2 urgent on hold 2020-10-16 11:56:00 singularity and user NS setup at RAL
- No update
- 144759 UKI-SCOTGRID-GLASGOW less urgent on hold 2020-08-10 09:54:00 High traffic from UKI-SCOTGRID-GLASGOW on RAL CVMFS Stratum1
- Ticket closed; nat5 moved
- 142329 UKI-SOUTHGRID-SUSX top priority on hold 2020-11-05 10:52:00 CentOS7 migration UKI-SOUTHGRID-SUSX
- Nodes in; awaiting networking.
- Pilots still failing
- arc can fill up logs quickly
- Input from Gareth on checking the gridFTP logs
- Matt to have a look at arc conf file
CPU
New site dashboard panel of job efficiency available.
- RAL
- Northgrid
- LANCS; other VOs are getting lots of jobs; should have 4:1 weighting
- DPM DB falling over; killed for excess memory consumption; may not be directly DPM, but in httpd processes
- Increase of TPC-http activity could relate to this?
- London
- SouthGrid
- Scotgrid
- GLASGOW; 2 new CEs being added. Half of the available capacity is currently running.
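As a rough check on the LANCS weighting noted under Northgrid, the sketch below turns a fair-share ratio into expected job fractions. It assumes that "4:1" means ATLAS is weighted 4 against 1 for all other VOs combined; the minutes do not spell this out. The slot count is hypothetical, and the code is plain arithmetic rather than any batch system's actual configuration.

```python
from fractions import Fraction


def expected_shares(weights):
    """Return each group's expected fraction of running jobs for the given fair-share weights."""
    total = sum(weights.values())
    return {name: Fraction(w, total) for name, w in weights.items()}


# Assumed reading of the note: ATLAS weighted 4, all other VOs together weighted 1.
weights = {"atlas": 4, "other_vos": 1}
shares = expected_shares(weights)

# Hypothetical slot count, for illustration only.
total_slots = 1000
for name, share in shares.items():
    print(f"{name}: {float(share):.0%} of slots, about {int(share * total_slots)} of {total_slots}")
```

Under that assumption ATLAS would expect roughly four fifths of the running slots when all VOs have queued work, so a persistently lower fraction would support the observation that other VOs are getting more than their share.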
Other new issues
- Migration from AGIS to CRIC started with Switcher migration.
- Some initial problems in sites (Glasgow, Brunel) trying to get out of downtime.
- Should be resolved now
- Not all AGIS information may stay up-to-date, as CRIC becomes primary source
- Peter to check for apfmon wrt. AGIS migration
- Glasgow LOCALGROUPDISK:
- Set up new pool (JW to recheck that the ATLAS config will be OK)
- Should be fine for internal Glasgow users.
Ongoing issues
- CentOS7 - Sussex
- See ticket description above
- TPC
- No update
- Oxford storageless tests
- Discussed in Jira and Storage mtg.
- Running HC test queue, using RAL as endpoint
- Next: set up a new (arc) queue at OX
- ECDF
- No update here
News round-table
- Vip
- Needed to leave before the end.
- Asked about testing Squid
- JW to provide some examples from ATLAS
- Dan
- Needed to leave before end; NTR
- Matt
- NTR
- Peter
- Asked about AGIS migration; will follow up for CRIC
- Sam
- NTR
- Gareth
- NTR
- JW
- NTR
- Patrick
- NTR
AOB
- NTR