
23–28 Oct 2022
Villa Romanazzi Carducci, Bari, Italy
Europe/Rome timezone

Transparent extension of INFN-T1 with heterogeneous computing architectures

24 Oct 2022, 11:00
30m
Area Poster (Floor -1) (Villa Romanazzi)

Poster
Track 1: Computing Technology for Physics Research
Poster session with coffee break

Speaker

Stefano Dal Pra (Universita e INFN, Bologna (IT))

Description

The INFN-CNAF Tier-1 has been engaged for years in a continuous effort to integrate its computing centre with additional types of computing resources. In particular, the challenge of providing opportunistic access to non-standard CPU architectures, such as PowerPC, and to hardware accelerators (GPUs) has been actively pursued. In this work, we describe a solution that transparently integrates access to ppc64 CPUs as well as GPUs. This solution has been tested by transparently extending the INFN-T1 Grid computing centre with Power9-based machines and V100 GPUs from the Marconi 100 HPC cluster managed by CINECA. We also discuss possible further improvements and how this approach will meet the requirements and future plans of the new Tecnopolo centre, where the CNAF Tier-1 will soon be hosted.

References

1) Boccali, T., Dal Pra, S., Spiga, D., Ciangottini, D., Zani, S., Bozzi, C., ... & Bonacorsi, D. (2020). Extension of the INFN Tier-1 on a HPC system. In EPJ Web of Conferences (Vol. 245, p. 09009). EDP Sciences.

2) Enabling CMS Experiment to the utilization of multiple hardware architectures -- a Power9 Testbed at CINECA (ACAT 2021)

Significance

End users can transparently access HPC resources and special resources (non-x86 CPUs, GPUs) through the usual, well-known methods for submitting payloads to the INFN-T1 batch system. INFN-T1 users do not need to adapt their submission workflow for particular targets, nor do they need to handle resource-specific problems directly, since the resources are managed by the INFN-T1 staff.

Experiment context, if any

The context of this research is provided by several WLCG experiments. Their shared use case is the need to access any available resource while minimizing both the operational effort and the development effort required to integrate new heterogeneous providers.

Primary authors

Daniele Spiga
Stefano Dal Pra (Universita e INFN, Bologna (IT))

Co-authors

Lorenzo Rinaldi (Universita e INFN, Bologna (IT))
Dr Tommaso Boccali (INFN Sezione di Pisa)
