Red days for SAM on Tuesday/Wednesday due to the continuation of network issues/interventions reported last week.
Observed a large discrepancy between CMS (monit) and RAL (Vande) monitoring of running cores. This turned out to be a scheduling inefficiency on the CMS side, with slow ramp-up of scheduling agents at FNAL.
Tape downtime caused CMS to put the site into drain several times. The Rucio status (for Echo) was overridden by Data Management, and Katy overrode the status to keep jobs running. CMS needs to handle this better and not send sites into drain just because tape is down.
A few spikes in job failures and periods of low efficiency, which may be related to network blips (long read times).
RAL FTS removed entirely from CMS-Rucio operations.