11–15 Mar 2024
Charles B. Wang Center, Stony Brook University
US/Eastern timezone

HEP Benchmark Suite: Enhancing Efficiency and Sustainability in Worldwide LHC Computing Infrastructures

11 Mar 2024, 15:10
20m
Theatre (Charles B. Wang Center, Stony Brook University)

100 Circle Rd, Stony Brook, NY 11794
Oral, Track 1: Computing Technology for Physics Research

Speaker

Natalia Diana Szczepanek (CERN)

Description

As the scientific community continues to push the boundaries of computing capabilities, there is a growing responsibility to address the associated energy consumption and carbon footprint. This responsibility extends to the Worldwide LHC Computing Grid (WLCG), which encompasses over 170 sites in 40 countries and provides the computing, disk, and tape storage resources that the LHC experiments rely on. Operating these diverse sites efficiently matters beyond raw performance metrics alone.

This paper introduces the HEP Benchmark Suite, an enhanced suite designed to measure computing resource performance uniformly across all WLCG sites, using HEPScore23 as the performance unit. The suite goes beyond measuring execution speed via HEPScore23: it also collects metrics such as machine load, memory usage, memory swap, and, notably, power consumption. Its adaptability and user-friendly interface enable comprehensive acquisition of system-related data alongside benchmarking.

Throughout 2023, this tool underwent rigorous testing across numerous WLCG sites. The focus was on studying the performance of compute job slots and correlating it with fabric metrics. Initial analysis showed that the tool is effective at establishing a standardized model for compute resource utilization and at pinpointing anomalies, which often stem from site misconfigurations.

This paper describes the tool's functionality and presents the results obtained from extensive testing. The objective is to raise awareness of this probing model within the community, fostering broader adoption and encouraging responsible computing practices that prioritize both performance and the mitigation of environmental impact.

Significance

Highly relevant for the validation and improvement of the compute performance of WLCG sites, via a centralized probing and analytics system.

Primary authors

Co-authors

Ladislav Ondris (Brno University of Technology (CZ)), Ewoud Ketele (CERN), Gonzalo Menendez Borge (CERN), Alessandro Di Girolamo (CERN), Ivan Glushkov (University of Texas at Arlington (US))
