Description
We present a pragmatic study of energy-management strategies in a WLCG Tier-2 environment. Building on prior node-level benchmarking (HS23/Watt) and IPMI-based telemetry, we deployed coordinated CPU frequency modulation across the few hundred physical servers at ScotGrid Glasgow and measured cluster-level effects under controlled operating conditions.
Scaling CPU frequency to a mid-range value is a proven strategy to improve HS23/Watt in benchmarks, but it has not been widely tested in production. Our tests compare baseline operation with underclocked regimes and quantify the energy saved in the IT layer alongside impacts on job latency and throughput.
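As an illustration of the comparison described above, the following minimal sketch expresses the trade-off between an underclocked regime and the baseline as relative changes in throughput, power, and HS23/Watt. The `Regime` class, the `compare` function, and all numeric values are hypothetical placeholders chosen for easy arithmetic; they are not measurements from the study.

```python
from dataclasses import dataclass


@dataclass
class Regime:
    """One operating regime of a node or cluster (hypothetical structure)."""
    name: str
    hs23: float          # benchmark score in HS23 units
    mean_power_w: float  # mean power draw in watts

    @property
    def hs23_per_watt(self) -> float:
        return self.hs23 / self.mean_power_w


def compare(baseline: Regime, candidate: Regime) -> dict:
    """Return fractional changes of the candidate relative to the baseline."""
    return {
        "throughput_change": candidate.hs23 / baseline.hs23 - 1.0,
        "power_change": candidate.mean_power_w / baseline.mean_power_w - 1.0,
        "efficiency_change": candidate.hs23_per_watt / baseline.hs23_per_watt - 1.0,
    }


# Placeholder values only, NOT results from the study.
baseline = Regime("baseline", hs23=1000.0, mean_power_w=500.0)
underclocked = Regime("underclocked", hs23=900.0, mean_power_w=400.0)

result = compare(baseline, underclocked)
for key, value in result.items():
    print(f"{key}: {value:+.1%}")
```

With these placeholder inputs, a regime that gives up some throughput for a larger reduction in power shows a positive change in HS23/Watt, which is the kind of trade-off the tests are designed to quantify.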
Using publicly available real-time CO2 intensity data for the electricity grid, we compute net CO2 savings at cluster scale for representative production workloads. We describe our methodology for data collection and validation, explain how cluster operation can be aligned with temporal variations in grid carbon intensity, and offer practical recommendations and a reusable measurement framework that other sites can adopt.
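The CO2 arithmetic can be sketched as follows, assuming power is sampled at a fixed interval and grid carbon intensity is available for the same intervals. The function name, the units (watts, gCO2/kWh, hours), and the sample values are assumptions made for illustration, and the sketch does not account for any extra runtime needed to recover lost throughput.

```python
def co2_saved_grams(baseline_w, reduced_w, intensity_g_per_kwh, interval_h):
    """Sum the CO2 avoided over equal-length intervals.

    baseline_w / reduced_w: cluster power in watts for each interval.
    intensity_g_per_kwh: grid carbon intensity in gCO2/kWh for each interval.
    interval_h: length of each interval in hours.
    """
    if not (len(baseline_w) == len(reduced_w) == len(intensity_g_per_kwh)):
        raise ValueError("all input series must have the same length")

    total_g = 0.0
    for p_base, p_red, ci in zip(baseline_w, reduced_w, intensity_g_per_kwh):
        saved_kwh = (p_base - p_red) * interval_h / 1000.0  # W * h -> kWh
        total_g += saved_kwh * ci
    return total_g


# Placeholder half-hourly samples, NOT data from the study.
baseline_w = [100_000.0, 100_000.0]
reduced_w = [80_000.0, 90_000.0]
intensity = [200.0, 100.0]

saved = co2_saved_grams(baseline_w, reduced_w, intensity, interval_h=0.5)
print(f"CO2 avoided: {saved / 1000.0:.2f} kg")
```

Because each interval is weighted by its own carbon intensity, the same energy saving avoids more CO2 when the grid is carbon-intensive, which is the motivation for aligning underclocked operation with those periods.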