Description
The Italian National Institute for Nuclear Physics (INFN) has been operating the largest scientific distributed computing infrastructure for more than 20 years: the Tier-1 at Bologna-CNAF and the 9 Tier-2 centres provide computing and storage resources to support more than 100 scientific collaborations.
In recent years this computing infrastructure has been expanded and modernized, also benefiting from the ICSC (Italian National Center on HPC, Big Data and Quantum computing) and Terabit projects, funded in the context of the Italian National Recovery and Resilience Plan.
Work was also done to integrate the different “flavors” of resources: resources tailored for high-throughput computing, resources exposed through a cloud interface, and HPC resources (the CINECA HPC center with its pre-exascale system “Leonardo”, as well as the specialized hardware provided by the INFN “HPC bubbles”).
In this paper we present how we are leveraging RUCIO and its ancillary services to create a national data lake that can serve multiple communities, each with its own needs, providing seamless access to data across this distributed infrastructure.
We will present our operational experience with the federation of heterogeneous storage backends, in particular in support of some small user communities. We will show how we addressed a few specific requirements, such as authorization models, user registration, and monitoring. Finally, we will discuss future plans.