The 3rd CTA Workshop is a two-day in-person event which will bring together developers, users and sites running CTA software. Newcomers are also welcome to join the community.
The workshop will feature a programme of technical presentations, site reports, hands-on sessions and the roadmap for CTA's plans and future developments.
Previously, the annual CTA Day has been organised as part of the EOS Workshop. This year for the first time, the CTA Workshop has its own separate identity and registration. The CTA Workshop will be co-located with the EOS Workshop and CS3 conference as part of CERN's TechWeekStorage24 (11-19 March 2024).
The workshop will take place at CERN, in the IT Auditorium, on 18-19 March 2024.
Monday morning and Tuesday morning will be dedicated to technical presentations and site reports.
Monday afternoon is reserved for hands-on sessions.
The EOS Workshop 2024 will kick off on Thursday 14 March with a plenary session giving an overview of the storage technologies developed and used at CERN: EOS Open Storage (disk storage), CTA (tape archival storage), Ceph (block, file and object storage) and CERNBox (sync and share platform for collaboration).
In the afternoon, there will be a Meet the Team event with the development and operations teams from CERN's IT Storage Group.
Please note that registration for these two events is separate from registration for the CTA Workshop.
All presentations will be recorded and published (subject to agreement of the speakers).
Participation in the workshop is free of charge.
There will be a workshop social event on Sunday 17 March and a workshop dinner on Monday 18 March. Participation is optional and is at the participant's own expense. See Social Programme for details and registration for the social events.
Registration is open to anyone. Please register here.
If you would like to share your experience with the community, please submit an abstract (deadline 29 February 2024).
We look forward to seeing you at the in-person workshop in March 2024 during TechWeekStorage24!
—The CTA team
Welcome to the third annual CERN Tape Archive Workshop (CTA 2024).
The LHC experiments have completed another successful year of data taking, with new records in terms of throughput and total data archived.
This presentation will give an introduction to the CTA Project, Team and Community, as well as an overview of the challenges and achievements during the second year of LHC Run-3.
Today, magnetic tape is the lowest cost technology for storing large volumes of data. Historically, areal density scaling has been the main driver of the exponential decrease in cost (cents/GB of capacity) of both tape and hard disk drives. Recently, HDD scaling has stagnated while tape continues to scale at historical rates. The INSIC 2019-2029 Tape Technology Roadmap projects that tape areal density will scale at a 40% compound annual growth rate, enabling an approximate doubling of capacity every two years until at least 2029, at which point tape systems are expected to reach an areal density of 278 Gb/in². The feasibility of these projections has been validated by a recent single-channel research demonstration of 317 Gb/in² areal density on SrFe tape. This demonstration, as well as the latest tape technology roadmap, will be discussed and contrasted with hard disk drives.
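As a quick back-of-the-envelope check of these figures: at a 40% compound annual growth rate, areal density after n years is (density today) x 1.40^n. Since 1.40^2 = 1.96 ≈ 2, capacity doubles roughly every two years, with a doubling time of ln(2) / ln(1.40) ≈ 2.06 years.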
Antares is the tape archive service at RAL that manages both Tier-1 and local Facilities data. In this talk, we present the main developments in the service since last year's CTA workshop, including the migration of the CASTOR Facilities instance, and discuss the service's performance as well as the main operational issues since the beginning of LHC Run-3. Finally, we provide an overview of the future plans for expanding the service to cover additional use cases from physics (CLF) and astronomy (SKA).
CTA has been in production at DESY for almost a year, with dCache as the frontend. Over the course of this time we have migrated all our data from OSM, moved all experiments to CTA and so far written close to 25 PB of data. In this talk we give an overview of the past year and share our overall experience with CTA.
The presentation will provide an overview of the status and progress of CTA at IHEP. Last year, we completed the data migration from CASTOR to CTA and added new CTA instances for JUNO, HEPS, and the LHCb Tier-1. We have compiled CTA 5.8.10 on Alma 9 and commenced performance testing. Additionally, CTA has been deployed on the new tape library with LTO-9 tapes.
At the 2nd CTA Workshop (2023), we announced a repository of Free and Open Source operator tools for CTA, including the Repack automation system (ATRESYS) tool. Since then, the repository has seen a number of new tool additions, as well as changes to the underlying libraries.
In this talk, we will introduce these new features, including tools for: automating the supply of tapes to tape pools; automated validation of data on tape; EOSCTA disk file metadata operations; and higher-level CTA operator tasks. Together, these make up the backbone of the tape operations ecosystem at CERN. We present these contributions in the hope that they will be as useful to the CTA community at large as they are to us.
The Tape Alerting System (TAS) acts automatically on issues detected in the tape infrastructure by disabling the affected elements and notifying operators of situations requiring their attention, thereby protecting the system from further disruption or damage.
This is done by performing a scan of a configurable time frame of past tape sessions and then executing a number of detection jobs corresponding to the set of configurable error conditions.
This presentation will give an introduction to the TAS, walk you through the potential error conditions this operator tool reacts to, and give instructions on how to install it.
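To make the mechanism concrete, here is a minimal sketch of the scan-then-detect pattern in Python (the class and function names are hypothetical illustrations, not the actual TAS code):

from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Iterable

@dataclass
class TapeSession:
    drive: str
    tape: str
    finished_at: datetime
    error: str | None                         # None if the session ended cleanly

@dataclass
class DetectionJob:
    name: str
    condition: Callable[[TapeSession], bool]  # a configurable error condition
    action: Callable[[TapeSession], None]     # e.g. disable the element and notify operators

def run_tas_scan(sessions: Iterable[TapeSession],
                 jobs: list[DetectionJob],
                 window: timedelta) -> None:
    """Scan the sessions that finished inside the configured time frame and
    run every detection job against them."""
    cutoff = datetime.now() - window
    recent = [s for s in sessions if s.finished_at >= cutoff]
    for job in jobs:
        for session in recent:
            if job.condition(session):
                job.action(session)           # act on the affected drive or tape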
During three years in production at CERN as the WLCG Tier-0 tape storage service, the CTA software has evolved and gained various features driven by the service requirements. Discussions between the developers and the operations team have led to design choices that shaped the service and helped refine CTA operations best practices.
This presentation aims to clarify the current best practices for operations and user data transfers when dealing with a CTA tape system.
Recent years have witnessed an increase in the number and sophistication of cyberattacks on academic sites. One way to improve resilience is to store backups in a completely separate administrative domain from the data being protected. This Birds-of-a-Feather session is for site administrators who would like to discuss the possibility of backing up to each other's sites.
The goal of this hands-on session is to enable participants to debug common problems faced by CTA administrators and operators. We will briefly give a broad overview of all the logging generated by the different CTA services, along with a peek into our monitoring dashboards. This provides a starting point for understanding and debugging problems that may appear while operating CTA.
A series of problems will then be presented for participants to solve during the session.
Deletions are a rare event in tape archive storage operations. When they happen, they are often the result of a user mistake or a bug. As deletion is asynchronous on the tape side, nothing is physically deleted until the tape is reclaimed. CTA keeps a copy of all the deleted file metadata so that files deleted unintentionally can be restored. This hands-on session will allow participants to understand what happens to deleted tape files in the CTA tape lifecycle and how to use the CTA restore tools.
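As a toy illustration of this lifecycle, the Python sketch below models a recycle bin that keeps deleted-file metadata until the tape is reclaimed (the class names and fields are hypothetical, not the actual CTA catalogue schema or restore tooling):

from dataclasses import dataclass

@dataclass
class TapeFileMeta:
    archive_id: int
    disk_path: str
    tape: str            # volume identifier of the tape holding the copy
    fseq: int            # position of the file on that tape
    size: int
    checksum: str

class RecycleBin:
    """Deleted-file metadata is kept until the tape is reclaimed, so an
    unintentional deletion can be undone by re-creating the catalogue entry."""

    def __init__(self) -> None:
        self.deleted: dict[int, TapeFileMeta] = {}

    def delete(self, meta: TapeFileMeta) -> None:
        # Nothing is removed from the tape itself; only the metadata moves here.
        self.deleted[meta.archive_id] = meta

    def restore(self, archive_id: int) -> TapeFileMeta:
        # Restoring simply brings the preserved metadata back into the catalogue.
        return self.deleted.pop(archive_id)

    def reclaim_tape(self, tape: str) -> None:
        # Reclaiming is the point of no return: deleted files on that tape are gone.
        self.deleted = {m.archive_id: m for m in self.deleted.values() if m.tape != tape}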
This presentation will provide AARNet's yearly site report. This year at AARNet, we implemented tape limiting and offsite tape vaulting processes and workflows. CTA was also an integral part of the decommissioning of some of our storage platforms.
In 2024, Fermilab will replace the legacy Enstore tape management system with CTA for CMS data. This will be followed by the migration of the second instance of Enstore containing all other scientific data on tape at Fermilab.
We will detail the results of ongoing scale tests of CTA. These tests target 10% of the tape bandwidth of the existing CMS Enstore instance. Tests will be performed with both EOS and dCache as buffer file systems.
We will also cover code development efforts necessary to make all tapes written by Enstore readable in CTA.
Throughout 2023, we have continued conducting operational tests to familiarize ourselves with the functionality of CTA and to understand the differences from Enstore. In addition to configuration tests of the application, carried out to observe how resource allocation works and to set up automatic tape provisioning, efforts have been made to deploy a centralized logging system to enhance the monitoring and operation of the application. Furthermore, preparations have begun for the migration tests from Enstore to CTA.
This presentation summarizes the current state of tape storage systems at the Joint Institute for Nuclear Research (JINR) and gives our current vision of their future in response to the discontinuation of Enstore support. We provide a brief description of our two currently operating tape systems: a 90 PB instance built on dCache/Enstore and an 11 PB instance based on EOS/CTA. Additionally, we share initial insights gained from operating the EOS/CTA system.
This paper presents one of the possible architectures for a container-based CTA installation designed for a midsize storage system. The primary objective of this setup is to store archive data efficiently and integrate it with an existing computing cluster.
The configuration of the CTA tape daemon was inherited from CASTOR, which relied on a manual process to create and store the configuration for each tape drive. As the scale of CERN's tape operations has grown to hundreds of drives, this approach has not scaled well. The CTA team have therefore developed a new semi-automated process to determine the configuration of each drive. The configuration has been simplified and refactored, making it easier to operate multiple tape drives per tape server. In this two-part presentation we will describe: the procedure to automatically obtain the drive information from the library; and the new tape daemon configuration file structure and how to configure multiple drives per tape server.
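As an illustration of the first part, the sketch below shows one possible way to enumerate drive slots from a SCSI media changer with the standard mtx tool and to emit a per-drive configuration stanza; the device paths, library name and configuration keys are placeholders and do not reflect the actual cta-taped file format:

import re
import subprocess

def list_library_drives(changer: str = "/dev/smc") -> list[int]:
    """Ask the library for its inventory via the 'mtx' SCSI media changer tool
    and return the data-transfer-element (drive) slot numbers.
    The changer device path is a placeholder; adjust it for your site."""
    out = subprocess.run(["mtx", "-f", changer, "status"],
                         capture_output=True, text=True, check=True).stdout
    return [int(m.group(1)) for m in re.finditer(r"Data Transfer Element (\d+):", out)]

def drive_config_stanza(library: str, slot: int, device: str) -> str:
    """Emit a per-drive configuration stanza. The key names below are purely
    illustrative; the real cta-taped file structure is covered in the talk."""
    name = f"{library}-DRIVE{slot:02d}"
    return f"DriveName {name}\nLibrarySlot smc{slot}\nDevice {device}\n"

if __name__ == "__main__":
    for slot in list_library_drives():
        print(drive_config_stanza("LIB1", slot, f"/dev/nst{slot}"))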
Due to the upcoming end-of-life of CERN CentOS 7, CTA will be migrated to Alma Linux 9, following the recommendations of the CERN Linux team. Migrating the CTA codebase from CC7 to Alma 9 presented a range of compatibility hurdles. This talk will delve into the challenges encountered and the strategies used to overcome them, including: managing version changes in vital dependencies like Protobuf and Oracle client libraries; resolving differences between CC7 and Alma 9 package offerings; and ensuring C++ code compatibility. Additionally, we will address the transition from Docker to Podman, the adaptation of the versionlock file for Alma 9, modernisation of scripts from Python 2 to Python 3, handling potential Bash version incompatibilities, and strategies for integrating the CTA Docker image into a local minikube environment.
The original design of EOSCTA inherited its cache eviction mechanism — stagerrm — from CASTOR. Although similar, CTA's use-cases were not the same as CASTOR's, leading to several operational issues. This motivated the creation of a new command — evict — better adapted to modern disk buffer management in CTA.
In this talk, we will discuss the issues that we faced with stagerrm and how the new evict command helps to fix them. We will also cover the new features brought by evict and how they simplify EOSCTA operations.
Recent improvements to CTA repack added several new features and tools for operators. However, we were still faced with severe performance issues when repacking tapes on a very large scale. An investigation showed that this was mostly due to limitations on the CTA SchedulerDB backend, which did not scale well to performing repack on the latest generation of very high capacity tapes, which can store 50 TB of data and millions of files. As a mitigation, while a new SchedulerDB is still in development, we decided to split the "user" and "repack" scheduler backends. This will effectively prevent repack jobs from interfering with user archival jobs. In this talk, we will discuss the limitations found during repack, the mitigations put in place, and why separating the scheduler into two backends was necessary to avoid performance issues during CTA operations.
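A minimal sketch of the idea, using hypothetical Python classes rather than the real CTA SchedulerDB interfaces:

from collections import deque
from dataclasses import dataclass

@dataclass
class ArchiveJob:
    file_id: int
    is_repack: bool        # True for jobs created by repack, False for user traffic

class SchedulerBackend:
    """Stand-in for a CTA SchedulerDB backend holding queued jobs."""
    def __init__(self, name: str) -> None:
        self.name = name
        self.queue: deque[ArchiveJob] = deque()

    def push(self, job: ArchiveJob) -> None:
        self.queue.append(job)

class SplitScheduler:
    """Route repack jobs and user jobs to separate backends, so that a large
    repack campaign cannot interfere with user archival requests."""
    def __init__(self) -> None:
        self.user_backend = SchedulerBackend("user")
        self.repack_backend = SchedulerBackend("repack")

    def submit(self, job: ArchiveJob) -> None:
        backend = self.repack_backend if job.is_repack else self.user_backend
        backend.push(job)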
The CTA software and service have been designed to match the Run-3 write performance requirements at WLCG Tier-0. The queuing system, coupled with a small SSD-based cache, has demonstrated its performance predictability over the first two years of the run.
This performance was achieved with FIFO scheduling, resulting in purely temporal collocation on tapes, with only the legacy "storage class" concept to separate data into different tape pools.
As tape data is becoming "warmer" in experiment data workflows, the performance of data retrieval from tape is becoming more important. Optimising retrieval requires additional metadata to deliver staging efficiency gains, which can be exploited in the evolution of the CTA tape scheduler.
This presentation will focus on various scheduling issues observed during Run-3, and on how archive metadata will enable improvements to the write and read efficiency of CTA tape sites.
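As a simplified illustration of the kind of metadata-driven optimisation this could enable (the names below are hypothetical, not the actual CTA scheduler code), archive requests can be grouped by a collocation hint carried in the archive metadata so that related files are no longer placed purely by arrival time:

from collections import defaultdict
from dataclasses import dataclass

@dataclass
class ArchiveRequest:
    file_id: int
    size: int
    collocation_hint: str   # e.g. a dataset or activity name supplied as archive metadata

def collocate_by_hint(requests: list[ArchiveRequest]) -> dict[str, list[ArchiveRequest]]:
    """Group queued archive requests by their collocation hint so that files that
    are likely to be recalled together are written to the same tape (or a small
    set of tapes), rather than interleaved purely by arrival time as with FIFO."""
    groups: dict[str, list[ArchiveRequest]] = defaultdict(list)
    for req in requests:
        groups[req.collocation_hint].append(req)
    return dict(groups)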
CTA software development is primarily driven by the needs of the CERN experimental programme. Looking beyond Run-3, data rates are set to continue to rise exponentially into Run-4 and beyond. The CTA team are planning how to scale the software and service to meet these new challenges.
CTA is also driven by the needs of the community outside CERN. The landscape of tape archival for scientific data is consolidating, and CTA is constantly adapting to a wider range of use cases.
This talk will present the short-term and medium-term roadmap for CTA development and new features.
Final comments, questions and discussion.