The SAM tests have turned our site availability red on a few days this week. I believe this is down to a combination of two factors:

1. Sometimes the test results for the arc-ces are not reported, leaving a blank for a couple of time slots, and sometimes several arc-ces are affected at the same time. If another arc-ce happens to fail during such a slot, the overall SAM status goes red. If all the tests had run as normal and some of them were green, the overall status would have been green, since it only takes one green arc-ce in any one time slot for a green overall status. A possibly related ticket describes the missing tests: https://ggus.eu/index.php?mode=ticket_info&ticket_id=150482

2. There may be a problem with the RAL-based redirector xrootd-cms-uk.gridpp.. The current log file had grown to more than 20 GB in less than 24 hours of running, and the machine was running out of disk space. I cleaned it up, but at this level of logging I will need to do so regularly or sharply reduce how long logs are kept on the machine. As for the volume of logged output, the current theory is that an incoming request is generally satisfied right away, but the log then continues with repeats of the 'do_have' query, seemingly after a successful transfer. I chose one file at random, and in the log I was checking (16th Feb) there were 10k references to that file. The proxy log contains ~8 requests (almost all from different sites) during the same period, all of which appear successful.
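
The aggregation rule described in point 1 can be sketched as follows. This is only an illustration of the behaviour as I described it, not the actual SAM implementation: the function name and the use of None for a missing result are my own assumptions, and the real algorithm may treat unreported tests differently.

```python
from typing import Optional, Sequence


def overall_status(ce_results: Sequence[Optional[bool]]) -> bool:
    """Return True (green) if at least one arc-ce passed in this time slot.

    Each entry is True (test passed), False (test failed) or
    None (no result reported). Assumption: a missing result never
    counts as green.
    """
    return any(result is True for result in ce_results)


# Hypothetical slot with three arc-ces.
# Two results missing and one failure: nothing is green, so the slot is red.
print(overall_status([None, None, False]))  # False

# Same slot, but one test was reported and passed: the slot is green.
print(overall_status([True, None, False]))  # True
```

Under this rule a single failing arc-ce is only enough to turn the slot red when every other arc-ce is either failing or unreported, which matches what we saw this week.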

Last week I completed the clean-up following the tape consistency check. I deleted 111k files that may have failed to delete last summer (July 2020?).