Michael Davis (CERN) – 27/03/2025, 09:00
Welcome to the fourth annual CERN Tape Archive Workshop (CTA 2025).
The LHC experiments have completed another successful year of data taking, with new records in terms of throughput and total data archived. This presentation introduces the CTA Project, Team and Community, and gives an overview of the challenges and achievements during the third year of LHC Run-3.
Vladimir Bahyl (CERN) – 27/03/2025, 09:15
With the LHC now in the middle of Run 3, we will describe our current tape hardware setup and share our experience with the different components of the technology.
We will start with a reflection on the evolution of our capacity planning vs. increasing storage requirements of the experiments.
We will then report on performance characteristics of both LTO9 and TS1170 tape drives: RAO,...
Mwai Karimi – 27/03/2025, 09:45
CTA has been in operation at DESY for nearly two years, during which time additional experiments have been integrated and over 80 PB of data has been written to tape. This presentation will provide an overview of recent developments, along with insights and experiences from running dCache+CTA so far.
Eric Vaandering (Fermi National Accelerator Lab. (US)) – 27/03/2025, 10:00
Fermilab will move to CTA this spring with dCache as the frontend file system. The modifications made to CTA to be able to read Enstore (Fermilab's legacy tape management software) files will be discussed, as will our solution to read the existing Enstore Small File Aggregation (SFA) files.
Operational issues arising during our push to production will be highlighted. Details on our...
Qiuling Yao (IHEP) – 27/03/2025, 10:15
We will share our experiences and challenges with using CTA over the past year. We optimized the CTA configuration and upgraded EOS and CTA. Additionally, we expanded the scale of two experimental applications, the LHCb Tier-1 and HEPS, and enhanced the monitoring of the CTA system.
Julien Leduc (CERN) – 27/03/2025, 11:00
The CERN Tape Archive (CTA) was designed to meet the demands of data archival from the LHC experiments, in terms of both data volume and throughput. In order to ingest data at the rates demanded by the LHC data acquisition (DAQ) systems, the system is built on EOS and CTA's scalable architecture principles. To optimise the performance of both disk and tape hardware and to achieve the desired...
Pablo Oliver Cortes (CERN) – 27/03/2025, 11:15
Operating the CERN Tape Archive all year round does not come without surprises and challenges: massive recall campaigns, peak system throughput for archival during the data taking period, and (not so) transparent upgrades to critical services we depend on push the system to its limits, popping some nuts and bolts from time to time.
In this presentation, we will share insights gained from...
Mr Idriss Larbi – 27/03/2025, 11:30
Until now, repack and archival of new files were carried out on the same tape pool. This could lead to a user's new archive files being mixed on the same tape as old repacked files, reducing performance during retrieval. With the latest version of CTA, we can create dedicated REPACK archive routes in order to repack to separate tape pools. We have modified ATRESYS to accommodate this new...
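The dedicated-route idea can be sketched in a few lines (all pool and storage-class names below are made up for illustration; in reality CTA configures archive routes in its catalogue, not in code like this):

```python
# Illustrative sketch of dedicated archive routes per workflow type.
# Pool and storage-class names are hypothetical; this is not CTA code.

def select_tape_pool(storage_class: str, is_repack: bool) -> str:
    """Route repack writes and new user archives to separate tape pools."""
    routes = {
        ("physics", False): "physics_pool",         # new user archive files
        ("physics", True): "physics_repack_pool",   # repacked files
    }
    return routes[(storage_class, is_repack)]

# New archive files and repack output no longer end up on the same tape:
assert select_tape_pool("physics", False) != select_tape_pool("physics", True)
```

The point of the separation is visible in the last line: for the same storage class, repack output and fresh archives resolve to different pools, so a later retrieval of user data does not have to seek past interleaved repack files.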
Pablo Oliver Cortes (CERN) – 27/03/2025, 11:45
CTA was designed with two goals in mind: throughput to and from the tape system and minimising the stress on the tape infrastructure (minimising the number of tape mounts). These two constraints become particularly challenging in retrieval dataflows when elements external to the system start to misbehave.
In this presentation, we will explore the internal logic behind CTA’s retrieval...
Konstantina Skovola – 27/03/2025, 12:00
The CTA Frontend serves the physics workflow event requests made by the disk system buffer, for example EOS or dCache, and cta-admin commands submitted by operators and automated scripts. Communication with the frontend is currently based on XRootD/SSI. However, not all disk front-ends support the SSI extensions to XRootD, which creates a constraint for sites using CTA as a tape backend. As a...
Dr Jaroslav Guenther (CERN) – 27/03/2025, 12:15
The CERN Tape Archive (CTA) scheduling system manages the workflow of archive, retrieve, and repack requests, relying on a Scheduler database (Scheduler DB) for transient metadata storage. We present the development of a new relational database (PostgreSQL) backend for the Scheduler DB. The aim is to overcome the limitations of the current (object-store based) implementation. This talk will...
Niels Alexander Buegel – 27/03/2025, 14:45
CTA's Continuous Integration (CI) system has been around since the inception of the project. However, numerous limitations had piled up: the CI setup was monolithic, developments outside of CTA were difficult to test (EOS, XRootD and dCache), the pipelines were slow and large parts of the CI system were not nicely structured. Over the past year, the CTA team has made significant improvements...
Joao Afonso (CERN) – 27/03/2025, 15:00
The CTA software versioning numbering scheme distinguishes between standard code releases and “pivot” releases aimed at upgrading the CTA Catalogue schema. This separation provides us with a simple and replicable set of steps for upgrading CTA between any two versions.
This talk will present the strategy for versioning the CTA software, explain how CTA Catalogue schema upgrades are integrated...
Niels Alexander Buegel – 27/03/2025, 15:15
The current CTA test instance is deployed using Helm, a tool that streamlines the installation and management of Kubernetes applications. It allows us to clearly separate and template the different components that make up CTA. In this talk, we will walk through how the new containerized CTA setup works, covering how the components are organized, how Helm is used to manage configurations, and...
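To picture how Helm separates and templates the components, a minimal, purely illustrative values file might look like the fragment below (the chart layout and every key name here are assumptions for the sake of the example, not the actual CTA chart):

```yaml
# Hypothetical values.yaml fragment for a templated CTA test instance.
# The real chart structure is defined by the CTA team's charts; treat
# these keys as placeholders only.
catalogue:
  schemaVersion: "15.0"
frontend:
  replicas: 1
tapeServers:
  - name: tpsrv01
    drives: 2
eos:
  enabled: true
```

Such a file would then be applied with the standard Helm workflow, e.g. `helm install cta ./cta-chart -f values.yaml`, with per-instance overrides supplied via `--set` flags.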
Niels Alexander Buegel – 27/03/2025, 15:50
In this hands-on session, participants will deploy their own containerized CTA + EOS setup and perform a catalogue upgrade from version 14.0 to 15.0. The goal is to walk through the full upgrade procedure so that participants understand the steps necessary to perform a CTA catalogue upgrade smoothly. Participants are expected to bring their own laptops and will be provided with an OpenStack VM.
Florian Florensa (Scaleway), Saalik Hatia (Scaleway) – 28/03/2025, 09:00
For six years Scaleway offered S3 Glacier-class storage fully operated on mostly powered-off SMR disks.
This enabled us to offer our customers fast restore times while remaining cost-effective, using a mix of commodity and custom hardware. As our S3 service grew, and since we cannot control or predict public cloud workloads, the former Glacier stack was unable to keep up while becoming...
Michael Davis (CERN) – 28/03/2025, 09:30
S3 is a popular protocol for object storage, in use at CERN since 2018. CERN's Ceph S3 service provides a disk storage back-end for various use cases including backup. In principle it is possible to archive S3 objects to tape using the S3 GLACIER storage class extensions. However, this is not yet supported in an open source solution. This talk gives a brief overview of the landscape of support...
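As a sketch of what the GLACIER extensions look like from the client side, the payload below has the shape of the `RestoreRequest` accepted by the S3 `RestoreObject` API (with boto3, it would be passed to `s3.restore_object()`); whether a given open source tape backend honours it is precisely the gap the talk discusses. Bucket, key and tier values are examples only:

```python
# Sketch of a client-side S3 Glacier restore request (example values).
# With boto3 this dict would be the RestoreRequest parameter of
# s3.restore_object(Bucket=..., Key=..., RestoreRequest=req); here we
# only build and inspect it, without touching a real S3 endpoint.

def make_restore_request(days: int, tier: str) -> dict:
    """Build a RestoreObject payload asking to stage an archived object."""
    allowed_tiers = {"Expedited", "Standard", "Bulk"}
    if tier not in allowed_tiers:
        raise ValueError(f"unknown retrieval tier: {tier}")
    return {
        "Days": days,                           # lifetime of the restored disk copy
        "GlacierJobParameters": {"Tier": tier}  # retrieval speed/cost trade-off
    }

req = make_restore_request(days=2, tier="Bulk")
print(req["GlacierJobParameters"]["Tier"])  # prints "Bulk"
```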
George Patargias – 28/03/2025, 09:45
Antares is the tape archive service at RAL that manages both Tier-1 and local Facilities data. In this talk, we present the main operational changes and developments in the service since last year’s CTA workshop. Among others, these include the migration of the service from SL7, the CTA Frontend separation and the deployment of the new EOS nodes connected to the LHC-OPN network in our Tier-1...
Jordi Casals (PIC) – 28/03/2025, 10:00
This year, we have upgraded CTA to Alma 9 and worked on automating the platform installation using Puppet. Additionally, we have tested the CTA Operations modules along with other features, such as policy mount rules. In February-March, we plan to conduct performance tests by allocating more resources to our test environment.
Unfortunately, we are still facing issues with humidity in the...
Mr Simon Liu (TRIUMF (CA)) – 28/03/2025, 10:15
Tapeguy is TRIUMF's home-built tape system for the ATLAS Tier-1 data centre. It was designed to be a stable system that can reliably store and retrieve LHC-produced data as a tiered HSM system. We are also open to other solutions, and evaluated CERN CTA at our site in 2024. This talk will present the current Tapeguy status, recent updates, and the evaluation done at the site.
Vladimir Bahyl (CERN) – 28/03/2025, 10:30
This talk is a follow-up from the 2024 BoF session on Offsite Tape Backup between sites. We will present the proof-of-concept architecture that we plan to develop in 2025. We propose to test it with one collaborating Tier-1 site (yet to be identified).
Pablo Oliver Cortes (CERN) – 28/03/2025, 11:15
In 2024, the CTA Tape Daemon was updated to address issues in deployments with multiple drives per tape server. This was a first step towards a major refactoring of the daemon, as in its current state, its multi-process architecture presents problems such as logging information unrelated to the current process and inter-process communication bugs. It also causes confusion in internal...
Luc Goossens (CERN) – 28/03/2025, 11:30
Tape reading efficiency, defined as the ratio between the effective average data reading rate and the maximal data reading rate, is reduced by two operations the tape drive inevitably needs to do and during which it cannot read any data. The first is mounting the tape containing the file into the drive, after possibly having unmounted the tape that was in it before. The second is spooling the...
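The effect of these two overheads on the efficiency ratio can be made concrete with a back-of-the-envelope calculation (all numbers below are illustrative, not measurements from any real drive):

```python
# Illustrative tape reading efficiency calculation; numbers are made up.

def reading_efficiency(bytes_read: float, drive_rate: float,
                       mount_s: float, spool_s: float) -> float:
    """Effective average read rate divided by the drive's maximal read rate."""
    read_s = bytes_read / drive_rate                       # time spent actually reading
    effective_rate = bytes_read / (mount_s + spool_s + read_s)
    return effective_rate / drive_rate

# Reading 100 GB at 400 MB/s takes 250 s of pure reading; adding 120 s
# of mounting and 60 s of spooling drags the efficiency well below 1:
eff = reading_efficiency(100e9, 400e6, mount_s=120, spool_s=60)
print(round(eff, 2))  # prints 0.58
```

Note that the ratio reduces to (reading time) / (total time), so efficiency improves either by amortising the fixed mount and spool costs over larger recalls or by shortening the spooling itself, e.g. through better colocation of the requested files.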
Julien Leduc (CERN) – 28/03/2025, 11:45
During Run-3, CTA has demonstrated very high write efficiency at nominal DAQ rates. For retrieval, CTA relies on time-based colocation of data on tape, but this has proved to be much less efficient than expected. Furthermore, the ratio of tape reads to writes is expected to significantly increase during Run-4, as some LHC experiments move towards the “tape carousel” model. Two years ago, we...
Michael Davis (CERN) – 28/03/2025, 12:00
CTA software development is primarily driven by the needs of the CERN experimental programme. Looking beyond Run-3, data rates are set to continue to rise exponentially into Run-4 and beyond. The CTA team are planning how to scale the software and service to meet these new challenges.
CTA is also driven by the needs of the community outside CERN. The landscape of tape archival for...
Michael Davis (CERN) – 28/03/2025, 12:15
Final comments, questions and discussion.