https://tinyurl.com/T1-GGUS-Open
https://tinyurl.com/T1-GGUS-Closed
https://lcgwww.gridpp.rl.ac.uk/utils/availchart/
https://cms-site-readiness.web.cern.ch/cms-site-readiness/SiteReadiness/HTML/SiteReadinessReport.html#T1_UK_RAL
http://hammercloud.cern.ch/hc/app/atlas/siteoverview/?site=RAL-LCG2&startTime=2020-01-29&endTime=2020-02-06&templateType=isGolden
Images show new tables I have made in Kibana/OpenSearch displaying the number of failures per worker node over recent days. I added one image for a 15-day period and another for 16 days, because there seems to have been a huge number of failures on one WN 16 days ago. Looking at just the last 15 days, no single WN stands out as a problem. A few WNs show more than a 20% error rate, and a few show more than 50%, but the latter are running relatively few jobs; these are possibly all SAM test jobs running on ML cores rather than multicore.
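A table like this can, in principle, be driven by a terms aggregation over a time-range filter. The sketch below builds such a query; the field names (`job_status`, `worker_node`), the index mapping, and the time window are assumptions for illustration, not the actual dashboard configuration.

```python
import json

def failures_per_wn_query(days=15, status_field="job_status",
                          wn_field="worker_node.keyword"):
    """Build an OpenSearch aggregation query counting failed jobs per
    worker node over the last `days` days. Field names are hypothetical;
    adjust them to the real job-accounting index mapping."""
    return {
        "size": 0,  # only aggregation buckets are needed, not raw hits
        "query": {
            "bool": {
                "filter": [
                    {"term": {status_field: "failed"}},
                    {"range": {"@timestamp": {"gte": f"now-{days}d/d"}}},
                ]
            }
        },
        "aggs": {
            # one bucket per worker node, sorted by failure count
            "failures_per_wn": {
                "terms": {"field": wn_field, "size": 500}
            }
        },
    }

# Comparing a 15-day and a 16-day window isolates the burst of
# failures on a single WN 16 days ago.
print(json.dumps(failures_per_wn_query(16), indent=2))
```

Running the same query with `days=15` and `days=16` and comparing the buckets is what separates the one-off burst from any ongoing per-WN problem.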
Another occurrence of the DNS issue this morning (the third apparent appearance in two weeks). However, today's instance could be attributed to work being done by DI, e.g. https://helpdesk.gridpp.rl.ac.uk/Ticket/Display.html?id=477872
SAM tests have been failing due to the above but have been going green over the last few hours; likewise, production transfer efficiency is recovering.
Farm is low on capacity due to WN firewall updates.
Job failure rate and efficiency have been good over the last week.