Fermilab will move to CTA this spring, with dCache as the frontend file system. We will discuss the modifications made to CTA to read files written by Enstore (Fermilab's legacy tape management software), as well as our solution for reading existing Enstore Small File Aggregation (SFA) files.
Operational issues arising during our push to production will be highlighted. Details on our...
We will share our experiences and challenges operating CTA over the past year. We optimized the CTA configuration and upgraded both EOS and CTA. Additionally, we expanded the scale of two experiment applications, the LHCb Tier-1 and HEPS, and enhanced monitoring of the CTA system.
Operating the CERN Tape Archive year-round does not come without surprises and challenges: massive recall campaigns, peak archival throughput during the data-taking period, and (not so) transparent upgrades to critical services we depend on all push the system to its limits, popping some nuts and bolts from time to time.
In this presentation, we will share insights gained from...
CTA was designed with two goals in mind: throughput to and from the tape system and minimising the stress on the tape infrastructure (minimising the number of tape mounts). These two constraints become particularly challenging in retrieval dataflows when elements external to the system start to misbehave.
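The tension between throughput and mount minimisation can be illustrated with a toy scheduler. The sketch below (names and data model are illustrative assumptions, not CTA's actual internals) groups pending retrieve requests by tape so each tape is mounted at most once, and serves the fullest queues first to maximise data moved per mount.

```python
from collections import defaultdict

def plan_mounts(requests):
    """Group retrieve requests by tape, one mount per tape.

    `requests` is a list of (file_id, tape_id, size_bytes) tuples --
    a simplified stand-in for a retrieve queue.
    """
    queues = defaultdict(list)
    for file_id, tape_id, size in requests:
        queues[tape_id].append((file_id, size))
    # Mount the tapes with the most queued data first:
    # more bytes transferred per mount, fewer total mounts in flight.
    order = sorted(queues, key=lambda t: sum(s for _, s in queues[t]),
                   reverse=True)
    return [(tape, queues[tape]) for tape in order]

plan = plan_mounts([("f1", "T1", 5), ("f2", "T2", 20), ("f3", "T1", 7)])
```

In this toy model, a misbehaving external element (e.g. a client re-requesting files spread thinly across many tapes) directly inflates the number of mounts, which is exactly the failure mode the real scheduler has to guard against.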
In this presentation, we will explore the internal logic behind CTA’s retrieval...
The CERN Tape Archive (CTA) scheduling system manages the workflow of archive, retrieve, and repack requests, relying on a Scheduler database (Scheduler DB) for transient metadata storage. We present the development of a new relational database (PostgreSQL) backend for the Scheduler DB, aimed at addressing the limitations of the current object-store-based implementation. This talk will...
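One appeal of a relational backend is that queue operations become single atomic SQL statements. The sketch below uses SQLite purely for illustration; the table and column names are hypothetical and do not reflect CTA's actual schema.

```python
import sqlite3

# Toy relational Scheduler DB: one table of transient requests.
db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE request (
        id     INTEGER PRIMARY KEY,
        kind   TEXT NOT NULL CHECK (kind IN ('archive','retrieve','repack')),
        vid    TEXT NOT NULL,                 -- tape volume identifier
        status TEXT NOT NULL DEFAULT 'queued'
    )
""")
db.executemany("INSERT INTO request (kind, vid) VALUES (?, ?)",
               [("retrieve", "T10001"),
                ("retrieve", "T10001"),
                ("archive",  "T10002")])

# A mount claims every queued request for one tape in a single
# transaction -- a pattern that is awkward to express atomically
# against an object store.
with db:
    db.execute("UPDATE request SET status = 'owned' "
               "WHERE vid = ? AND status = 'queued'", ("T10001",))

owned = db.execute(
    "SELECT COUNT(*) FROM request WHERE status = 'owned'").fetchone()[0]
```

A production backend would add indexing, ownership timeouts, and concurrency control, but the core idea of claiming work transactionally is the same.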
For six years, Scaleway has offered S3 Glacier-class storage operated entirely on mostly powered-off SMR disks.
Using a mix of commodity and custom hardware, this enabled us to offer our customers fast restore times while remaining cost effective.
As our S3 service grew, and since we can neither control nor predict public cloud workloads, the former Glacier stack was unable to keep up while becoming...
Tapeguy is TRIUMF's home-built tape system for the ATLAS Tier-1 data centre. It was designed as a stable, tiered HSM system that can reliably store and retrieve LHC-produced data. We also remain open to other solutions, and evaluated CERN CTA on site in 2024. This talk will present the current Tapeguy status, recent updates, and the on-site evaluation.
This talk is a follow-up from the 2024 BoF session on Offsite Tape Backup between sites. We will present the proof-of-concept architecture that we plan to develop in 2025. We propose to test it with one collaborating Tier-1 site (yet to be identified).
In 2024, the CTA Tape Daemon was updated to address issues in deployments with multiple drives per tape server. This was a first step towards a major refactoring of the daemon: in its current state, its multi-process architecture presents problems such as log messages unrelated to the current process and inter-process communication bugs. It also causes confusion in internal...