Taking advantage of the fact that all MSU WN were powered off for BPS building circuit work, we started the Meltdown/Spectre rpm updates there around January 5. All WN at both sites have now been updated with this fix, along with all gatekeepers, desktops, interactive machines and dCache pool servers. For the dCache pool servers, we will be examining some kernel parameter changes that may recover some of the performance lost to the kernel and other rpm updates.
See: https://community.centminmod.com/threads/linux-kernel-security-updates-for-spectre-meltdown-vulnerabilities.13648/
That thread says to add the following to the kernel boot line:
noibrs noibpb nopti
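As a rough aid for checking which nodes have actually picked up these parameters, the following Python sketch reports which of the three flags are absent from a kernel command line. It assumes the running kernel's command line can be read from /proc/cmdline; the function names are illustrative only and not part of any site tooling. Note that, as we understand it, these flags turn the corresponding mitigations off, so they should only be considered on machines where that trade-off is acceptable.

```python
from pathlib import Path

# The three parameters quoted above.
WANTED_FLAGS = ("noibrs", "noibpb", "nopti")


def missing_flags(cmdline, wanted=WANTED_FLAGS):
    """Return the wanted flags that do not appear in a kernel command line string."""
    present = set(cmdline.split())  # parameters are whitespace-separated
    return [flag for flag in wanted if flag not in present]


def read_cmdline(path="/proc/cmdline"):
    """Read the running kernel's command line (assumes a Linux /proc filesystem)."""
    return Path(path).read_text().strip()
```

On a node, `missing_flags(read_cmdline())` returns an empty list when all three parameters are in effect for the running kernel.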
HS06 runs indicate less than a 1% decrease in performance on modern processors from the kernel updates. This decrease seems to be more than offset by an increase in performance when the same machine is updated to SL7. Older processors seem to either be unchanged, or perform slightly better on HS06 with the kernel updates.
As John Hover points out, though, IO will be the real sticking point. We are trying to obtain some data on this issue from muon calibration runs, but it is not yet available. The jobs consist of running Athena to convert a calibstream fragment to a calib ntuple. Results will be posted to the usatlas-t2-l list when they are ready (later today, or perhaps tomorrow).
We now have a small SL7 gatekeeper/cluster running at AGLT2 (~100 cores), and have created a SCORE production queue (AGLT2_SL7) for testing. As of this writing, we have seen only a few software jobs (nagrun.sh -v..., about 130 jobs like this in 3 days), and do not otherwise appear to be getting many pilots. We will follow up on this.
Otherwise, operations have been smooth.