2023 equipment purchase:
Added 11x R6525 with AMD 7443 (96 HT/node)
measured per HT: HEPscore = 17.48, HS06 = 17.75
total added: 1056 cores, 18.5k HEPscore / 18.7k HS06
note: new PERC H355 not supported by OMSA on CentOS7
Retired 32x 24HT R410s (768 cores / 6.6k HS06), 9.84 HS06/HT
Retired 24x 32HT R620s (736 cores / 8.1k HS06), 11.03 HS06/HT
Retired 5x 40HT R620s (200 cores / 2.1k HS06), 10.68 HS06/HT
(retirements above may not be exact, will correct after meeting)
Total change in compute capacity expected to be about neutral
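The capacity figures above follow from nodes x HT per node x per-HT score. A minimal sketch of that arithmetic, using only numbers quoted above; the retired totals are copied as quoted (they are flagged as possibly inexact), and all variable names are illustrative:

```python
# Added: 11x R6525, 96 HT per node, per-HT scores as measured above.
added_nodes = 11
ht_per_node = 96
hepscore_per_ht = 17.48
hs06_per_ht = 17.75

added_cores = added_nodes * ht_per_node
added_hepscore = added_cores * hepscore_per_ht
added_hs06 = added_cores * hs06_per_ht

# Retired totals in kHS06, copied as quoted above (may not be exact).
retired_khs06 = {"R410 (24HT)": 6.6, "R620 (32HT)": 8.1, "R620 (40HT)": 2.1}
retired_hs06 = sum(retired_khs06.values()) * 1000

net_hs06 = added_hs06 - retired_hs06

print(f"added:   {added_cores} cores, "
      f"{added_hepscore / 1000:.1f}k HEPscore, {added_hs06 / 1000:.1f}k HS06")
print(f"retired: {retired_hs06 / 1000:.1f}k HS06")
print(f"net:     {net_hs06 / 1000:+.1f}k HS06")
```

With the quoted totals the net comes out at roughly +1.9k HS06, small compared with the 18.7k added, which is the sense in which the change is about neutral.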
Also added 7x of 12x R740xd2 at UM
(each with 24x 20TB disks= 370 TB usable in dcache)
4x R740xd2 at MSU waiting on network config (promised for tonight)
The other 5x R740xd2 at UM to compare RAID6 to JBOD/raidz3
Retired 4x MD3xxx shelves of 8 TB disks (~1.4 PB)
Waiting on MSU servers to be deployed before sitewide space re-balance
Did NOT YET increase dcache advertised space
Ultimately the grand total change will be about +4.5 PB
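A rough cross-check of the +4.5 PB figure, as a sketch only. It assumes the purchase is 12 servers at UM (7 deployed plus 5 under test) and 4 at MSU, and that every server ends up at the quoted 370 TB usable regardless of the RAID6 vs JBOD/raidz3 outcome; both are assumptions, not statements from the notes above.

```python
# Per-server capacity: 24 disks x 20 TB raw, 370 TB usable as quoted.
raw_tb_per_server = 24 * 20
usable_tb_per_server = 370
efficiency = usable_tb_per_server / raw_tb_per_server

# Assumed server count: 7 + 5 at UM, 4 at MSU (see assumptions above).
servers = 7 + 5 + 4
added_pb = servers * usable_tb_per_server / 1000

# Retired MD3xxx shelves, as quoted (~1.4 PB).
retired_pb = 1.4
net_pb = added_pb - retired_pb

print(f"usable fraction per server: {efficiency:.0%}")
print(f"added {added_pb:.2f} PB, retired {retired_pb:.1f} PB, net {net_pb:+.2f} PB")
```

Under these assumptions the net is about +4.5 PB, consistent with the grand total quoted above.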
Events:
10-May: ZFS problem on NVMe mirror holding dcache database (Ticket 161890).
After file system recovery one postgres file remained flagged as possibly corrupted.
Recovered from backup/mirroring node.
18-May: MSU Data Center heated up during the regular yearly fire alarm testing.
All newer/hotter Worker Nodes (C6420s and R6525s) shut themselves down.
Unexpected. Could be operator error but no official report yet.
For fun/curiosity: coarse comparison HS06 vs HEPscore (as of May 2023)
|          | 6132    | 6240R   | 7302    | 7413    | 7443    |
|          | 2.60GHz | 2.40GHz | 3.00GHz | 2.65GHz | 2.85GHz |
| Tot HT   | 56      | 96      | 64      | 96      | 96      |
|----------+---------+---------+---------+---------+---------|
| HS06/HT  | 13.64   | 10.94   | 16.42   | 17.28   | 17.75   |
| HEPS/HT  | 13.16   | 12.22   | 16.17   | 16.97   | 17.48   |
|----------+---------+---------+---------+---------+---------|
note: HEPscore measured as average of 2 runs on only 1 node each (except 7443 with 2 runs on 5 nodes).
HS06 taken from US facility spreadsheet for AGLT2.
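The comparison is easier to read as a ratio per CPU model. A small sketch using the per-HT values from the table above (variable names are illustrative):

```python
# Per-HT scores copied from the comparison table (as of May 2023).
cpus = ["6132", "6240R", "7302", "7413", "7443"]
hs06_per_ht = [13.64, 10.94, 16.42, 17.28, 17.75]
heps_per_ht = [13.16, 12.22, 16.17, 16.97, 17.48]

# Ratio > 1 means the CPU scores higher in HEPscore than in HS06 per HT.
ratios = {
    cpu: heps / hs06
    for cpu, hs06, heps in zip(cpus, hs06_per_ht, heps_per_ht)
}

for cpu, ratio in ratios.items():
    print(f"{cpu:>6}: HEPscore/HS06 = {ratio:.3f}")
```

In this coarse comparison only the 6240R scores higher per HT in HEPscore than in HS06; the other models come out within about 4% below their HS06 value.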