19–25 Oct 2024
Europe/Zurich timezone

Supporting the development of Machine Learning for fundamental science in a federated Cloud with the AI_INFN platform

24 Oct 2024, 15:00
18m
Room 5

Talk Track 9 - Analysis facilities and interactive computing Parallel (Track 9)

Speaker

Matteo Barbetti (INFN CNAF)

Description

Machine Learning (ML) is driving a revolution in the way scientists design, develop, and deploy data-intensive software. However, the adoption of ML presents new challenges for the computing infrastructure, particularly in terms of provisioning and orchestrating access to hardware accelerators for development, testing, and production.
The INFN-funded project AI_INFN ("Artificial Intelligence at INFN") aims to foster the adoption of ML techniques within INFN use cases by providing support on multiple aspects, including the provision of AI-tailored computing resources. It leverages cloud-native solutions in the context of INFN Cloud to share hardware accelerators as effectively as possible, ensuring that the diversity of the Institute’s research activities is not compromised.
In this contribution, we provide an update on the commissioning of a Kubernetes platform designed to ease the development of GPU-powered data analysis workflows and their scalability on heterogeneous, distributed computing resources, possibly federated as Virtual Kubelets with the interLink provider.
Finally, we showcase the deployment of the training and validation infrastructure for the flash-simulation pipeline of the LHCb experiment, known as Lamarr, providing a practical example of how our infrastructure supports complex ML workflows in high-energy physics.

Primary authors

Lucio Anderlini (Universita e INFN, Firenze (IT))
Matteo Barbetti (INFN CNAF)
Giulio Bianchini
Diego Ciangottini (INFN, Perugia (IT))
Carmelo Pellegrino
Rosa Petrini (INFN Sezione di Pisa, Universita' e Scuola Normale Superiore, P)
Daniele Spiga (Universita e INFN, Perugia (IT))
