AGLT2 had its storage blacklisted for 3 days, even though the original problem was just a brief glitch introduced by our VMware migration/upgrade. HammerCloud could not put our site back online until the blacklisting was removed.

On the positive side, we finally managed to upgrade our VMware infrastructure from v5.5 running on old R630 nodes to v6.7 running on new R740 hardware. There is still a lot of tuning to do, but services are running much better now.

Lots of cabling work is ongoing as well, including correcting and updating labels, port descriptions on switches, socket descriptions on PDUs, and the corresponding Visio diagrams.

New hardware (9 C6420 servers at UM) is cabled and ready to be built soon.

We keep seeing HTCondor worker nodes with high load: 2-3 nodes are being killed every day due to high load (>100 per core). This might be caused by specific jobs, usually OSG/CMS jobs.
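
To make the "load per core" criterion concrete, here is a minimal sketch of how such a check could be written. It is not the monitoring we actually run: the function names and the use of the 1-minute load average are assumptions for illustration, and only the threshold of 100 per core comes from the note above.

```python
import os

# Threshold quoted in the report: a node counts as overloaded above 100 per core.
LOAD_PER_CORE_LIMIT = 100.0


def load_per_core(load_avg, cores):
    """Return the load average divided by the number of cores."""
    if cores <= 0:
        raise ValueError("cores must be a positive integer")
    return load_avg / cores


def is_overloaded(load_avg, cores, limit=LOAD_PER_CORE_LIMIT):
    """Return True when the per-core load is strictly above the limit."""
    return load_per_core(load_avg, cores) > limit


if __name__ == "__main__":
    # Assumption: use the 1-minute load average of the local (Unix) host.
    one_minute, _, _ = os.getloadavg()
    cores = os.cpu_count() or 1
    state = "overloaded" if is_overloaded(one_minute, cores) else "ok"
    print(f"load/core = {load_per_core(one_minute, cores):.2f} ({state})")
```

Run on a worker node, this prints the current per-core load and whether it exceeds the limit; correlating flagged nodes with the jobs running on them would be one way to test the suspicion about specific OSG/CMS jobs.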

The HTCondor head node (a virtual machine) was unreachable for a few hours during the VMware update, but this did not affect the running jobs.

The dCache head node was upgraded from 4.2.21 to 4.2.23 to fix the gPlazma authentication bugs (authentication would fail every couple of days). The other pool/door nodes still run 4.2.21.