5–9 May 2025
IJCLab, Paris
Europe/Paris timezone

Enhancing Job Monitoring with Power Consumption Metrics

8 May 2025, 09:36
15m
Auditorium P. Lehmann, Building 200 (IJCLab, Paris)

Domaine Universitaire, Building 200, 91400 Orsay
Talk, Environmental sustainability, Plenary

Speakers

Domenico Giordano (CERN), Natalia Diana Szczepanek (CERN)

Description

To support the sustainability of WLCG compute infrastructure, we propose a strategy to extend the current job monitoring system to include energy consumption data. Currently, WLCG monitoring systems primarily focus on traditional job metrics such as CPU time, memory usage, runtime, and failure rates.
However, they do not capture job-level power consumption, because power data are typically collected by fabric-level monitoring systems and are not directly accessible from within jobs.

We propose to bridge this gap by including node-level power measurements in the job data reports.
We have demonstrated the feasibility of this idea with a prototype that uses standard WLCG grid jobs submitted to a set of pilot sites (DESY, AGLT, Glasgow).
Although the approach is valid for any kind of job payload, we used the HEPBenchmark Suite as the payload because it natively captures both the performance of the node, expressed in HS23, and a set of utilization metrics, including CPU load, memory usage, frequency, and power consumption.
This setup makes it possible to correlate performance with energy-efficiency metrics, giving a more comprehensive view of job efficiency and revealing opportunities for optimization.
This is only the start of an R&D process; implementing the strategy will require the involvement of the whole WLCG community.
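
As a rough illustration of the idea described above, the following Python sketch shows how node-level power samples could be folded into a job report and combined with a benchmark score. It is a minimal sketch under stated assumptions, not the prototype's actual implementation: the `PowerSample` type, the report keys (`start_s`, `end_s`, `hs23`) and the added fields (`node_energy_kwh`, `node_mean_power_w`, `hs23_per_watt`) are all hypothetical names chosen for this example. It also reports whole-node values and does not attempt to apportion power among jobs that share a node.

```python
from dataclasses import dataclass


@dataclass
class PowerSample:
    """One node-level power reading (hypothetical structure)."""
    timestamp_s: float  # seconds since an arbitrary epoch
    power_w: float      # node power draw in watts


def summarize_power(samples, start_s, end_s):
    """Integrate power over the samples that fall inside [start_s, end_s].

    Returns (energy in joules, mean power in watts) using the trapezoidal
    rule. The mean is taken over the time span actually covered by samples.
    Returns (0.0, 0.0) if fewer than two samples fall in the window.
    """
    window = sorted(
        (s for s in samples if start_s <= s.timestamp_s <= end_s),
        key=lambda s: s.timestamp_s,
    )
    if len(window) < 2:
        return 0.0, 0.0
    energy_j = 0.0
    for a, b in zip(window, window[1:]):
        energy_j += 0.5 * (a.power_w + b.power_w) * (b.timestamp_s - a.timestamp_s)
    covered_s = window[-1].timestamp_s - window[0].timestamp_s
    mean_w = energy_j / covered_s if covered_s > 0 else 0.0
    return energy_j, mean_w


def enrich_job_report(report, samples):
    """Return a copy of the job report with node-level energy fields added."""
    energy_j, mean_w = summarize_power(samples, report["start_s"], report["end_s"])
    enriched = dict(report)
    enriched["node_energy_kwh"] = energy_j / 3.6e6  # 1 kWh = 3.6e6 J
    enriched["node_mean_power_w"] = mean_w
    # Performance per watt is only defined when both values are available.
    if mean_w > 0 and "hs23" in report:
        enriched["hs23_per_watt"] = report["hs23"] / mean_w
    return enriched


# Illustrative values only, not measurements.
samples = [PowerSample(0, 100.0), PowerSample(10, 200.0), PowerSample(20, 100.0)]
report = {"job_id": "example", "start_s": 0, "end_s": 20, "hs23": 300.0}
enriched = enrich_job_report(report, samples)
```

Keeping the enrichment as a pure function over an existing report means the traditional job metrics stay untouched, and the energy fields can be omitted at sites where no power readings are available.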
