Description
The rapid growth in data centre energy demand poses significant challenges for the sustainability of large-scale scientific computing. In alignment with CERN and WLCG strategies on environmentally responsible computing, this work investigates methods to reduce energy consumption, electricity costs, and CO₂ emissions at the PIC WLCG Tier-1 site through energy-aware compute resource modulation.
Three complementary studies are presented. First, simulated natural job drainages were applied to real HTCondor utilisation data from 2023–2024 to assess the impact of temporarily halting job acceptance during periods of high electricity prices or carbon intensity. While this approach achieved limited economic and environmental gains, it resulted in disproportionate computational losses, primarily due to non-energy-aware scheduling, hardware heterogeneity, hyperthreading effects, and long job runtimes. These results highlight the limitations of naïve drainage strategies.
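The drainage simulation can be illustrated with a minimal sketch. It assumes a job trace reduced to start time, end time, and core count; the `Job` type, the function names, and the toy trace are hypothetical and are not taken from the PIC analysis. The baseline is the recorded trace itself, and the refill of the farm after the drain window is not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    start_h: float  # job start, hours since the beginning of the trace
    end_h: float    # job end, hours since the beginning of the trace
    cores: int      # CPU cores held by the job


def busy_cores(jobs, t_h):
    """Cores occupied at time t_h by the given jobs."""
    return sum(j.cores for j in jobs if j.start_h <= t_h < j.end_h)


def drained_busy_cores(jobs, drain_start_h, t_h):
    """Cores still occupied at t_h if no job is accepted from drain_start_h on.

    Jobs already running at the drain decision are left to finish naturally.
    """
    already_running = [j for j in jobs if j.start_h < drain_start_h]
    return busy_cores(already_running, t_h)


def lost_core_hours(jobs, drain_start_h, drain_end_h, step_h=1.0):
    """Core-hours delivered in the recorded trace but not under the drain."""
    n_steps = int((drain_end_h - drain_start_h) / step_h)
    lost = 0.0
    for k in range(n_steps):
        t = drain_start_h + k * step_h
        lost += (busy_cores(jobs, t) - drained_busy_cores(jobs, drain_start_h, t)) * step_h
    return lost


# Toy trace (illustrative values only, not PIC data).
trace = [
    Job(start_h=0, end_h=10, cores=8),
    Job(start_h=2, end_h=30, cores=4),   # long-running job
    Job(start_h=5, end_h=7, cores=8),
    Job(start_h=6, end_h=20, cores=8),
]

loss = lost_core_hours(trace, drain_start_h=4, drain_end_h=8)
still_busy = drained_busy_cores(trace, drain_start_h=4, t_h=20)
```

In this toy trace the long-running job still holds cores many hours after the drain decision, which is the kind of effect that makes a natural drainage slow to reduce power draw while compute is already being lost.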
Second, dedicated experiments were conducted to evaluate the impact of dynamically adjusting CPU clock frequencies across PIC compute nodes. The study quantifies the relationship between CPU frequency, delivered compute performance, power consumption, and energy efficiency, demonstrating that frequency scaling can offer meaningful reductions in power draw and operational costs with controlled performance degradation. This enables finer-grained, node-level modulation of the compute farm compared to natural drainage strategies.
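The quantities compared in such a frequency study can be sketched as follows, assuming each frequency setting yields one benchmark score and one average power reading. The `Measurement` type, the `summarize` function, and the numbers are hypothetical placeholders chosen for easy arithmetic, not measurements from PIC nodes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    freq_ghz: float  # CPU clock frequency setting
    score: float     # benchmark throughput, arbitrary units
    power_w: float   # average node power draw in watts


def summarize(measurements):
    """Compare each frequency setting against the highest-frequency baseline."""
    base = max(measurements, key=lambda m: m.freq_ghz)
    rows = []
    for m in sorted(measurements, key=lambda m: m.freq_ghz, reverse=True):
        rows.append({
            "freq_ghz": m.freq_ghz,
            "perf_loss": 1 - m.score / base.score,        # fraction of throughput given up
            "power_saving": 1 - m.power_w / base.power_w,  # fraction of power saved
            "efficiency": m.score / m.power_w,             # throughput per watt
        })
    return rows


# Placeholder values for illustration only.
rows = summarize([
    Measurement(freq_ghz=3.0, score=1000.0, power_w=400.0),
    Measurement(freq_ghz=2.0, score=800.0, power_w=280.0),
])
```

With these placeholder values the lower setting gives up 20% of the throughput for a 30% power saving, so throughput per watt rises. Whether real nodes behave this way depends on the hardware and workload, which is what the dedicated experiments quantify.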
Finally, an XGBoost-based machine learning model was developed to predict, using only information available at decision time, the CPU core availability that follows a real-time drainage decision. Trained on two years of site-specific HTCondor data, the model accurately forecasts core reductions, particularly in the 8–40 hour window after a drainage event, enabling proactive and informed resource management.
Together, these results provide actionable insights and practical tools for implementing energy-aware scheduling and control at PIC, and offer a scalable framework applicable to other WLCG sites pursuing sustainable computing operations.