Description
Over the past several months, we have deployed a power accounting system across our heterogeneous WLCG Tier2 compute clusters at ScotGrid Glasgow, integrating real-time metrics from Prometheus with static power characteristics measured on our hardware. This framework dynamically allocates energy consumption to Virtual Organizations (VOs) based on actual core usage, while distinguishing the active power drawn by jobs from infrastructure overhead.
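As an illustration only, the allocation described above could be sketched as follows. The abstract does not specify the power model, so this sketch assumes that each node type has a measured idle and full-load power, and that active power scales linearly with the number of busy cores. The names `NodePower` and `allocate_energy`, and all numeric values, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodePower:
    """Static power characteristics of one node type (hypothetical values)."""
    idle_watts: float  # power drawn with no jobs running
    max_watts: float   # power drawn with all cores busy
    cores: int

    @property
    def active_watts_per_core(self) -> float:
        # Assumption: active power scales linearly with the number of busy cores.
        return (self.max_watts - self.idle_watts) / self.cores


def allocate_energy(node, busy_cores_by_vo, interval_s):
    """Split one node's energy over an interval into per-VO active energy
    and infrastructure overhead, both in joules."""
    per_vo = {
        vo: cores * node.active_watts_per_core * interval_s
        for vo, cores in busy_cores_by_vo.items()
    }
    # Idle power is treated as overhead and is not charged to any VO.
    overhead = node.idle_watts * interval_s
    return per_vo, overhead


if __name__ == "__main__":
    node = NodePower(idle_watts=100.0, max_watts=300.0, cores=64)
    per_vo, overhead = allocate_energy(node, {"atlas": 32, "lhcb": 16}, interval_s=60)
    print(per_vo, overhead)
```

In practice the busy-core counts would come from monitoring samples (for example Prometheus queries) rather than literals, and the per-interval results would be summed over time.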
We previously demonstrated the feasibility of detailed power accounting using our Prometheus/Grafana monitoring system. In this work, we present a first analysis of data collected over several months of operation. Our study examines VO usage patterns, reveals temporal variations in resource utilization, and quantifies site efficiency metrics, such as the fraction of idle power not attributed to VO jobs. In doing so, we identify opportunities to optimize resource allocation and reduce overall power consumption in HEP computing environments.
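One possible reading of the site efficiency metric mentioned above is the share of total energy that is not attributed to any VO job. The exact definition used in the study is not given here, so the function below is a sketch under that assumption; `unattributed_fraction` and its inputs are hypothetical names.

```python
def unattributed_fraction(vo_energy_j, overhead_energy_j):
    """Return the fraction of total energy not attributed to VO jobs.

    vo_energy_j: mapping of VO name to active energy in joules.
    overhead_energy_j: energy in joules not charged to any VO (e.g. idle draw).
    """
    total = sum(vo_energy_j.values()) + overhead_energy_j
    if total == 0:
        # No energy recorded in the interval: report zero rather than divide by zero.
        return 0.0
    return overhead_energy_j / total


if __name__ == "__main__":
    fraction = unattributed_fraction({"atlas": 6000.0, "lhcb": 3000.0}, 6000.0)
    print(f"{fraction:.2f}")
```

A lower value would indicate that more of the site's energy is going into VO work, under the assumptions stated above.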
The results give system administrators and researchers a data-driven basis for improving energy efficiency at WLCG Tier2 sites and moving toward more sustainable operations.
Significance
This work presents data-driven insights into energy usage and efficiency at a WLCG Tier2 site, based on several months of real operation. It provides a practical model for improving site efficiency, in line with the WLCG sustainability goals.
References
https://indico.cern.ch/event/1450885/contributions/6252611/
Experiment context, if any
WLCG