EOS 2025 Workshop


The 9th EOS workshop is in preparation to bring together the EOS community.
The two-and-a-half-day in-person event provides a platform for exchange between developers, users and sites running EOS. Newcomers are particularly welcome to join the community.
The workshop takes place at CERN.
This workshop is part of TechWeekStorage25 "Spotlight on Storage & Data Technologies at CERN", taking place from 24 to 28 March 2025.
The workshop will cover a wide range of topics related to EOS development, operations, deployments, applications, collaborations and diverse use-cases!
Agenda Highlights:
- EOS Project Roadmap
- EOS Development and Operations at CERN
- EOS Deployment and Operations world-wide
Recordings
 
All presentations will be recorded and published with the prior agreement of the speaker.
Fees
Participation in the workshop is free of charge.
Registrations
Registration is open to anyone at this link.
If you are interested in joining the EOS community, this is the perfect occasion!
We look forward to having you at the in-person workshop in March 2025 during TechWeekStorage25!
Your CERN EOS team.
- 15:00 → 16:05
  - 15:00 EOS 5.2 and 5.3 Status / Overview (20m)
    This presentation will give a short overview of the past releases and significant changes, new features and bug fixes.
    Speaker: Elvin Alin Sindrilaru (CERN)
  - 15:20 EOS and XRootD HTTP improvements (25m)
    With the continuous growth in the use of the HTTP protocol for file transfers within the WLCG community, several enhancements and optimisations have been introduced to the EOS HTTP and XRootD HTTP stacks. From updates to the SciTags and packet marking specifications to addressing libcurl internal modifications, 2024 presented a number of challenges that required targeted solutions. This presentation will provide an overview of the key changes and new features implemented to enhance the handling of HTTP file transfers in EOS and XRootD.
    Speaker: Cedric Caffy (CERN)
  - 15:45 Storage Tiering in EOS (20m)
    We will give an overview of new features for storage tiering in EOS version 5.3.
    Speaker: Andreas Joachim Peters (CERN)
 
- 16:05 → 16:25
- 16:25 → 17:35
  - 16:25 QClient Improvements for the next fastest Metadata (20m)
    Every operation that modifies or queries the metadata in the persistent metadata storage QuarkDB goes via QClient. We look at some current bottlenecks and the improvements that v5.3 offers with various configurations.
    Speaker: Mr Abhishek Lekshmanan (CERN)
  - 16:45 Advancements in FSCK for EOS (20m)
    One of the critical components in EOS is fsck, responsible for scanning, verifying, and repairing inconsistencies in the filesystem. This talk will provide an in-depth exploration of fsck in EOS, covering its architecture, scanning mechanisms, and repair strategies. We will discuss recent improvements, including the introduction of a best-effort mode and enhancements in erasure-coded file scanning, which significantly boost performance while minimizing the impact on the running instance.
    Speaker: Gianmaria Del Monte (CERN)
  - 17:05 XRootD File Cloning (15m)
    A software development motivated by an EOS use case is explained: file cloning to facilitate updates of erasure-coded files.
    Speaker: David Smith (CERN)
  - 17:20 Status of the S3 Interface for EOS (15m)
    We will present an overview of the current state of the S3 gateway for EOS.
    Speaker: Andreas Joachim Peters (CERN)
 
- 18:30 → 22:30 Dinner: EOS
  Location: Auberge de Meyrin, Avenue de Vaudagne 13bis, 1217 Meyrin
  - 18:30 Social Dinner (2h 30m)
    Social Dinner in Meyrin Village.
 
- 09:30 → 11:00 Operational Tools & Configuration: EOS
  Location: 40/S2-D01 - Salle Dirac
  - 09:30 Deploying an EOS Instance from Scratch: A Practical Guide (20m)
    EOS is a powerful and flexible storage system, but setting up a new instance from scratch requires a solid understanding of its configuration and operational best practices. This talk will provide a step-by-step guide to deploying EOS, covering key components and essential configurations. We will walk through the setup process, including storage provisioning, replication, erasure coding, and balancing strategies. The session will also touch on best practices for performance tuning and ensuring reliability in production environments. This talk is ideal for system administrators and operators looking to gain practical insights into EOS deployment, whether for testing, small-scale clusters, or large production environments.
    Speaker: Gianmaria Del Monte (CERN)
  - 09:50 Data Federations with EOS (20m)
    Data federations with EOS offer various approaches to seamlessly integrate and manage distributed storage across heterogeneous environments. This presentation explores multiple federation techniques and namespace aggregation with remote EOS instances. We will discuss the advantages and trade-offs of each method, considering factors such as performance, scalability, security, and ease of management. Real-world use cases and best practices will be highlighted to help organisations choose the most suitable strategy for their needs.
    Speaker: Luca Mascetti (CERN)
  - 10:10 A Distributed Probe for EOS: Real-Time Availability Monitoring and Alerting (20m)
    Ensuring the availability of EOS instances is crucial for large-scale storage operations. To enhance monitoring and incident response, we have developed a new distributed probe designed to detect instance malfunctions and alert operators in real time. This talk will introduce the architecture and functionality of the probe, which runs across multiple nodes to provide redundancy and reliability. Alerts are dispatched via multiple channels, including SMS, email, Mattermost, and CERN IT’s General Services Availability. Additionally, all availability events are published on a NATS-based pub-sub channel, enabling future integrations with operational tools such as the EOS Diagnostic Tool.
    Speaker: Gianmaria Del Monte (CERN)
  - 10:30 Diagnostic tool for submitting useful information for future debugging (25m)
    For a stuck or non-responsive EOS MGM, some simple diagnostic information can go a long way. We look at a new eos-diagnostic-tool that dumps stack traces and similar information so that useful bug reports can be submitted. We also invite discussion on how to improve the tooling in the future.
    Speaker: Abhishek Lekshmanan (CERN)
 
- 11:00 → 11:20
- 11:20 → 12:20
  - 11:20 A Distributed Storage Odyssey: from CentOS7 to ALMA9 (20m)
    On the 30th of June 2024, the end of CentOS 7 support marked a new era for the operation of the multi-petabyte distributed disk storage system used by CERN physics experiments. The EOS infrastructure at CERN is composed of approximately 1000 disk servers and 50 metadata management nodes. Their transition from CentOS 7 to Alma 9 was not as straightforward as anticipated, and this presentation explains how it went. From changes in the supported certificate and Kerberos key signature lengths and algorithms to OpenSSL library hiccups and Linux kernel crashes, the EOS operations team had to take on a range of challenges to ensure a seamless operating system transition of the infrastructure while keeping the CERN experiments’ data transfers uninterrupted.
    Speaker: Cedric Caffy (CERN)
  - 11:40 Evaluating Jumbo frames performance across LHC experiments (20m)
    This work presents an evaluation of jumbo frame tests conducted at CERN to assess their impact on data transfer performance across different physics workflows. Preliminary internal tests were carried out to analyze potential benefits and challenges, followed by collaborative testing involving the ATLAS, CMS, and LHCb experiments. The goal was to measure the advantages of jumbo frames in terms of efficiency and throughput while identifying and resolving any issues arising from their deployment. The study provides insights into the feasibility of jumbo frames for large-scale scientific data transfers, aiming to optimize network performance for high-energy physics experiments.
    Speaker: Dr Maria Arsuaga Rios (CERN)
  - 12:00 Refurbishing the Meyrin Data Centre: Storage Juggling and Operations (20m)
    The 50-year-old Meyrin Data Centre (MDC) remains indispensable due to its strategic geographical location and unique electrical power resilience, even though CERN IT recently commissioned the Prévessin Data Centre (PDC), doubling the organization’s hosting capacity in terms of electricity and cooling. The Meyrin Data Centre (Building 513) retains an essential role for the CERN Tier-0 Run 4 commitments, notably as the primary hosting location for the tape archive and the disk storage. The inevitable investments in the infrastructure (UPS and cooling) are now triggering the refurbishment of the two main rooms where all the storage equipment is hosted. This presentation will delve into the architectural advancements and operational strategies implemented for and during the Meyrin data centre refurbishment. We will explore how these developments will impact our storage and how the storage operations team will ensure EOS’s performance, scalability, and reliability in the coming years.
    Speaker: Octavian-Mihai Matei (CERN)
 
- 14:00 → 15:30
  - 14:00 EOS for Physics at CERN: Operational Insights, Achievements, and Future Directions (25m)
    This work presents an overview of the EOS operations at CERN, focusing on its role in supporting physics data processing and storage. EOS is a high-performance distributed storage system designed to handle the vast volumes of scientific data generated by CERN experiments. This study examines key performance metrics, recent achievements, and strategic objectives for the current year, emphasizing improvements in efficiency, reliability, and scalability. Special attention is given to the impact of EOS on physics workflows, ensuring seamless data access and analysis. By evaluating past accomplishments and future goals, this work highlights the continuous evolution of EOS to meet the growing demands of physics research at CERN.
    Speaker: Dr Maria Arsuaga Rios (CERN)
  - 14:25 EOS Status at IHEP (20m)
    In this talk, we want to share our experiences of EOS at IHEP, including the migration from CentOS 7 to AlmaLinux 9, the construction of ALICE EOS, and the dual-site deployment of LHCb T1 EOS.
    Speaker: Dr Yujiang BI (Institute of High Energy Physics, Chinese Academy of Sciences)
  - 14:45 EOS site report of the Joint Research Centre (20m)
    The Joint Research Centre (JRC) of the European Commission runs the Big Data Analytics Platform (BDAP) to enable JRC projects and scientists to store, process, and analyze a wide range and large amount of data, and to share and disseminate data products. EOS is the main system of BDAP for storing scientific data. The BDAP services are actively used by more than 100 JRC projects, covering a wide range of data analytics activities. The EOS instance at JRC was implemented in 2016 and currently has a gross capacity of 43 PB. It is composed of heterogeneous commodity hardware components, which have been extended noticeably over time. The talk will present the EOS service at JRC as the storage back-end of the Big Data Analytics Platform. The presentation covers the EOS setup, configuration and current status. It describes the activities over the last year, presents experiences made and issues discovered, and gives an outlook on activities planned for 2025.
    Speaker: Armin Burger
  - 15:05 Planning an EOS Data Federation to deal with Climate Change using AI (20m)
    The National Institute for Space Research - INPE (Brazil) is leading a research program: Intelligent Early Warning System for Climate Extremes - SIPEC. The project aims at predicting the likelihood of climate extremes months in advance, using diverse sources of data coming from satellites and an array of intelligent sensors spread across the country. Such data streams will feed both classical meteorological models and AI machine learning algorithms for the ultimate early warning of climate extremes. Given the number of institutions producing the large amounts of data needed to train the ML algorithms, and the scientists dealing with different parts of the problem at different places, we are implementing an EOS Data Federation in Brazil. The EOS family of tools, in addition to being capable of dealing with large volumes of distributed data, also takes care of security controls over who has access to which portions of the datasets.
    Speakers: Dr Paulo Nobre (INPE), Wanderley Mendes (INPE)
 
- 15:30 → 15:50
- 15:50 → 17:05
  - 15:50 Cloud-Native EOS Deployment for ATLAS T2 on Kubernetes (20m)
    I will discuss our Kubernetes-based EOS deployment as it approaches production readiness for our ATLAS T2 site, as well as the evaluation of EOS for several astronomy projects.
    Speaker: Ryan Taylor (University of Victoria (CA))
  - 16:10 CERNBox and EOSHPM status update (20m)
    CERNBox and EOS HOME/PROJECT(/MEDIA) operational issues seen in 2024 and expected in 2025.
    Speakers: Jan Iven (CERN), Diogo Castro (CERN)
  - 16:30 You still have those QDB backups, right? (Practical example of disaster recovery of EOS deployment) (20m)
    In December 2024 the EOS cluster at Purdue University suffered a security incident which wiped out all metadata of our production deployment. In this brief talk we will give a step-by-step example of what it takes to recover from such a setback, and discuss the best backup practices.
    Speaker: Stefan Piperov (Purdue University (US))
 
- 09:30 → 10:40 Benchmarking & Hardware Evolution: EOS
  Location: 40/S2-D01 - Salle Dirac
  - 09:30 Analysis Benchmarking with EOS/RNTuple (30m)
    This presentation will report on the benchmarking results of various EOS setups at CERN using the new RNTuple framework.
    Speaker: Andreas Joachim Peters (CERN)
  - 10:00 XRootD Update & Parallel Socket Benchmarking (20m)
    Speaker: Guilherme Amadio (CERN)
  - 10:20 Storage Hardware at CERN (20m)
    Current & future storage hardware at CERN.
    Speaker: Luca Mascetti (CERN)
 
- 10:40 → 11:00
- 11:00 → 12:00
  - 11:00 The EOS Development Workplan & Roadmap (20m)
    We will outline the EOS development roadmap, highlighting key milestones, upcoming features, and future plans. This presentation will provide insights into ongoing improvements, strategic goals, and the evolving direction of EOS.
    Speakers: Andreas Joachim Peters (CERN), Elvin Alin Sindrilaru (CERN)
  - 11:20 Discussion, Proposals, Feature Requests (40m)
    Survey Topics:
    - Ansible Configuration for EOS
    - SquashFS as small File Repository
 
- 14:00 → 16:00
  - 14:00 How to benchmark EOS with bash? (30m)
    Speaker: Andreas Joachim Peters (CERN)
  - 14:30 How to set up authentication Front-ends? (30m)
    Speaker: Elvin Alin Sindrilaru (CERN)
  - 15:00 How to configure TLS & ZTN in XRootD? (30m)
    Speaker: Guilherme Amadio (CERN)
  - 15:30 How to transition from MQ to no MQ? (30m)
    Speaker: Elvin Alin Sindrilaru (CERN)
 