19–25 Oct 2024
Europe/Zurich timezone

Exploiting GPU Resources at VEGA for CMS Software Validation

22 Oct 2024, 14:24
18m
Large Hall B

Talk Track 7 - Computing Infrastructure Parallel (Track 7)

Speaker

Daniele Spiga (Universita e INFN, Perugia (IT))

Description

In recent years, the CMS experiment has expanded its use of HPC systems for data processing and simulation. These resources significantly extend the conventional pledged Grid compute capacity. Within the EuroHPC program, CMS applied for a "Benchmark Access" grant at VEGA in Slovenia, an HPC centre that the ATLAS experiment already uses very successfully. For CMS, VEGA was integrated transparently as a sub-site extension of the Italian Tier-1 site at CNAF. In this first approach, only CPU resources were used, while all storage access was handled via CNAF over the network. Extending Grid sites with HPC resources was already an established concept for CMS; in this project, however, HPC resources located in a different country from the Grid site were integrated for the first time. CMS used the allocation primarily to validate the GPU readiness of a recent CMSSW release. Earlier developments in the CMS workload management system that allow GPU resources in the distributed infrastructure to be targeted proved instrumental, and jobs could be submitted like any other release validation workflow. The presentation will detail aspects of the actual integration, the tuning required to achieve reasonable GPU utilisation, and an assessment of operational parameters, such as error rates, compared to traditional Grid sites.

Primary authors

Adriano Di Florio (CC-IN2P3); Andrea Bocci (CERN); Antonio Perez-Calero Yzquierdo (Centro de Investigaciones Energéticas Medioambientales y Tecnológicas); Christoph Wissing (Deutsches Elektronen-Synchrotron (DE)); Daniele Spiga (Universita e INFN, Perugia (IT)); Jose Hernandez (CIEMAT)