Pre-trained representations from large volumes of unlabelled data, known as foundation models, are rapidly emerging as a key AI technology in scientific research. I will review some of the recent applications of these models in astrophysics and highlight their advantages and some of the potential challenges. I will also describe our recent work looking at recovering calibrated uncertainties from...
EISCAT has operated high-power, large-aperture radars for upper atmosphere and near-Earth space studies since 1981, and has so far collected a dataset of about 100 TB.
At present, EISCAT is in the process of deploying EISCAT_3D, EISCAT's next generation imaging radar. EISCAT_3D data volumes and processing requirements will be more similar to those of high energy physics and radio astronomy, and...
Recently the UK government has published a variety of computing strategy documents to take advantage of artificial intelligence and how it is transforming the way we do research. New large-scale facilities are being built to house AI supercomputers, while other data centres will provide complementary services. This talk will cover the high-level UK strategy as well as the approach the...
In this presentation, we will explore the Rucio experience with the Rubin Observatory experiment. Our discussion will cover several key areas:
Scalability Tests: Insights into the performance and scalability evaluations of Rucio in the context of Rubin's data needs and what we have learned, especially with many small files.
Role in Rubin's Data Curation: Rubin's Data Butler: An overview of...
CERN IT is extending Rucio to support Small and Medium Experiments (SMEs) through a centrally managed, ready-to-use Rucio data management service. Leveraging best practices from large-scale deployments, the service offers reproducible, one-click setups with SME-specific enhancements. In this session, we will present early pilot projects, share lessons learned, and demonstrate how SMEs can...
We present how the XENONnT experiment handles data with the help of Rucio as its data management tool. We focus mainly on how we distribute data and the strategies adopted to fix transfer issues. Finally, we conclude with some remarks on the needs of next-generation experiments.
The KM3NeT collaboration is building a neutrino telescope in the Mediterranean Sea to study both the intrinsic properties of neutrinos and cosmic high-energy neutrino sources. Once the telescope is fully constructed, we expect an eventual data volume of ~500 TB of new data per year and computing needs of ~2000 cores on average. This will require a transition towards distributed...
DaFab AI leverages Rucio to bridge the gap between EO mission realities and advanced analytics. This session breaks down Rucio’s metadata evolution, from fixed metadata columns and “key:value” attributes, to a schema-governed catalog.
We are extending Rucio with native support for open data to better serve interdisciplinary research and the sharing and re-use of results. Current approaches to open data require costly data duplication to comply with FAIR principles. The work we have been doing since the last workshop is to integrate open data support natively into Rucio. In this session, we will present the development...
This contribution presents the current status of DIRAC and DiracX, with a focus on the following topics:
- how DIRAC interfaces to Rucio
- how DiracX will interface to Rucio
- integration testing
This presentation will introduce the Rucio WebUI with a focus on helping communities get up and running quickly. We will also showcase the key features currently available and the upcoming developments, offering insight into how the WebUI will continue to evolve to support the Rucio community.
CERN IT’s Rucio-as-a-Service for Small and Medium Experiments (SMEs) introduces a modern infrastructure based on ArgoCD and Kubernetes, Vault, automatic DNS manipulation, etc., moving beyond traditional Flux-based deployments seen in the community. This tutorial will demonstrate how new Rucio clusters can be created in minutes and explain why this approach should become the default for all...
This presentation gives an overview of the CTAO use case, covering the intended and potential usage of Rucio in the operational and data management lifecycle.
This talk will be about the deployment of WLCG IAM at CERN, focusing on aspects like architecture, high availability, monitoring, user synchronization, etc. It will also include a view from the developers of upcoming/important changes.
The transition from X.509 certificates to OAuth 2.0 tokens is an ongoing effort attracting universal interest. This talk aims to offer a status update since the previous Rucio workshop and outline the expected short- and medium-term developments.
This talk presents findings from recent test campaigns within SRCNet v0.1, focusing on how Rucio was exercised across realistic science workflows. These results highlight emerging challenges and opportunities, prompting key questions around future policy decisions—such as data lifecycle rules, access control, and science artefact mapping—that will shape the evolution of data management practices.
This talk explores the deployment of Rucio across SRCNet for SKA data, highlighting infrastructure choices, deployment, integration environments and developments with cloud providers.
The Belle II experiment at the SuperKEKB collider in Japan is a next-generation B-factory with a large international collaboration and demanding computing needs. Belle II has been using Rucio as its data management system since early 2021, supporting global distribution and access to physics data. The experiment is now transitioning to Rucio as its primary metadata service, tightly integrating...
In this talk we will give an update on the Rucio/SENSE Integration Project. We will talk about the production-ready site deployments, new developments that simplify enabling SENSE support from the site's point of view, and the next steps towards having the first end-to-end Rucio-SENSE workflow in production.
This contribution provides updates and news about the EuroHPC initiative.
This contribution focuses on the recent updates to the Jupyter extension, its use throughout the community, and future plans.
A summary of the data access and sharing problem raised by the HEP user community.
Rucio supports different RSE protocol implementations (essentially acting as scheme handlers). With GridFTP and SRM being phased out, and GFAL support expected to cease in the future, now is the time to gather the requirements of our communities and plan ahead.
The INFN (the Italian National Institute for Nuclear Physics) has operated, for more than two decades, one of Italy's largest distributed computing infrastructures, providing computing and storage resources for more than 100 scientific collaborations. A sizable fraction of the computing capacity integrates with the WLCG (Worldwide LHC Computing Grid) infrastructure, while others are...
With the ESCAPE project, Rucio demonstrated its flexibility in delivering efficient, production-ready solutions for communities beyond high-energy physics.
Within EOSC, the emerging Federation will consist of multiple interconnected Nodes designed to share and manage data, knowledge, and resources across thematic and geographical research domains.
Building on the achievements of ESCAPE,...
Large-scale experiments, such as those in gravitational-wave (GW) science, generate massive datasets stored in isolated Data Lakes, which hinders collaboration and efficient data analysis. The MADDEN (Multi-RI [Research Infrastructure] Access and Discovery of Data for Experiment Networking) project aims to overcome this by extending Rucio to enable read-only access to data for users of other...
An overview of how to create, use, and maintain a policy package in Rucio.
This report will describe the use of Rucio at IHEP over the last year, including its running status, upgrades, and plugin development for experiments at IHEP.
This talk will provide an overview of the status of Rucio at the Port d'Informació Científica (PIC). We'll detail our current and future plans for our different Rucio instances, which are used to manage data for experiments like MAGIC. The presentation will also highlight our latest developments within the Rucio ecosystem.
The Einstein Telescope is the third-generation ground-based observatory for gravitational waves, currently in its preparation phase in Europe. It is expected to observe a sky volume one thousand times larger than the (current) second-generation observatories, and this will be reflected in a higher observation rate. The physics information contained in the strain time series will increase, while on the machine...
RI-SCALE develops secure, large-scale data management and AI-driven analysis platforms for European Research Infrastructures. The project addresses the challenge of unlocking scientific value in massive, underutilized datasets by providing Data Exploitation Platforms with integrated AI/ML capabilities. We are deploying a Rucio test instance to serve as our core data orchestration layer,...
This tutorial will guide operators through deploying the Rucio WebUI in a Kubernetes cluster and understanding its requirements. It will also show developers how to set up the development environment and contribute to the WebUI’s pages, client, and API layers.