EOS 2024 Workshop

Europe/Zurich
31/3-004 - IT Amphitheatre (CERN)

Andreas Joachim Peters (CERN), Jakub Moscicki (CERN), Luca Mascetti (CERN)
Description




The 8th EOS workshop is in preparation to bring together the EOS community.

The two-day in-person event is organized to provide a platform for exchange between developers, users and sites running EOS. We particularly welcome newcomers to join the community.

The workshop takes place at CERN in the Council Chamber and the IT auditorium.


This time the workshop is part of TechWeekStorage24 "Spotlight on Storage & Data Technologies at CERN", taking place from 11 to 19 March 2024 and comprising the CS3 conference, the EOS & CTA workshops and a CERNBox BoF session.

The workshop will cover a wide range of topics related to EOS development, operations, deployments, applications, collaborations and diverse use-cases!

Agenda Highlights:

  • EOS Project Roadmap
  • EOS Development and Operations at CERN
  • EOS Deployment and Operations world-wide

 

Timetable 

The workshop will kick off on Thursday morning with a plenary-style overview session about storage technologies used or developed at CERN.

  • EOS Open Storage - Disk Storage at CERN
  • CTA - the CERN Tape Archive
  • CEPH - Object & Application Storage
  • CERNBox - Sync & Share platform for collaboration
  • Filesystems - AFS, DFS, Samba, NFS3/4
  • CernVM FS - CernVM File System
  • FTS & XRootD - File Transfer Service & File Access
  • Rucio - Scientific Data Management

     

The afternoon is reserved for meet-the-team discussions with AFS, Ceph, CERNBox, CTA, CVMFS, DFS, EOS, FTS and XRootD experts. You can express your interest under Surveys.

The second day will focus on new developments and improvements since the previous workshop, the project roadmap, EOS operations and site reports.

Recordings
 

All presentations will be recorded and published, subject to the prior agreement of the speaker.

Fees

Participation in the workshop is free of charge.

Registrations

Registration is open to anyone.

Please register for the workshop. Don't forget to submit an abstract if you would like to share your experience or ideas with the EOS community.

If you are interested in joining the EOS community, this is the perfect occasion!

We look forward to having you at the in-person workshop in March 2024 during TechWeek24!

Your CERN EOS team.

Webcast
There is a live webcast for this event
  • Thursday, 14 March
    • 08:30 - 09:50
      EOS Development: Morning Session 1 31/3-004 - IT Amphitheatre

      Convener: Luca Mascetti (CERN)
      • 08:30
        EOS 5.2 Status 20m
        Speaker: Elvin Alin Sindrilaru (CERN)
      • 08:50
        EOS Namespace Locking Evolution 20m
        Speaker: Cedric Caffy (CERN)
      • 09:10
        EOS FlatScheduler & Freespace Engine 15m
        Speaker: Abhishek Lekshmanan (CERN)
      • 09:25
        EOS III 10m

        III = Instance Inventory Implementation

        We have recently added new tools to gather statistics about storage hardware, resource usage and hardware lifecycle. These tools compute the virtual cost and value of user data and of the hardware of an EOS instance.

        The presentation will introduce these new tools.

        Speaker: Andreas Joachim Peters (CERN)
      • 09:35
        EOSXd Evolution and CFSd 15m

        EOSXd is the filesystem client implemented as a FUSE filesystem to provide POSIX-like access to EOS. This is a crucial component for general usage of EOS. The presentation will highlight the problems we faced and advancements since the last workshop.

        EOS CFSd is a FUSE pass-through filesystem implementation that allows missing features to be added to any general POSIX filesystem. The presentation will highlight some possible use-cases and performance measurements, for example how CFSd adds Kerberos-based mapping to a CephFS filesystem without a significant performance impact.

        Speaker: Andreas Joachim Peters (CERN)
    • 09:50 - 10:10
      Coffee Break 20m 31/3-009 - IT Amphitheatre Coffee Area

    • 10:10 - 12:00
      EOS Development: Morning Session 2 31/3-004 - IT Amphitheatre

      Convener: Elvin Alin Sindrilaru (CERN)
      • 10:10
        A new REST API Gateway for EOS 15m
        Speaker: Andreea Prigoreanu (IT-SD)
      • 10:25
        Implementing FSCK for Erasure Coded Files in EOS 10m
        Speaker: Mano Segransan (42 Lausanne (CH))
      • 10:35
        HTTP Improvements and SciTags 10m
        Speaker: Cedric Caffy (CERN)
      • 10:45
        EOS and fixes & low level changes for openssl, xrootd & eosxd 10m
        Speaker: David Smith (CERN)
      • 10:55
        EOS on ARM 5m
        Speaker: Abhishek Lekshmanan (CERN)
      • 11:00
        EOS on SMR Status 5m

        In close collaboration with the IT procurement team we have conducted R&D with EOS and SMR disks. The findings, current status and outlook on integration of SMR disks will be covered in this presentation.

        Speaker: Andreas Joachim Peters (CERN)
      • 11:05
        A native S3 interface EOS/XRootD 10m
        Speaker: Mano Segransan (42 Lausanne (CH))
      • 11:15
        XRootD Status and Plans 15m
        Speaker: Guilherme Amadio (CERN)
      • 11:30
        NDMSPC - EOS and N-Dimensional Analysis with ROOT, Enhanced by Web Interface and VR Visualization 10m

        This talk delves into the synergy of NDMSPC (NDiMensional SPaCe), EOS at CERN, and N-dimensional histograms via the ROOT framework. Discover how a web interface serves as a powerful analysis tool, enabling dynamic queries on projections in N-dimensional space. Interact and visualize the data seamlessly with JSROOT and VR (aframe), ushering in a new era of immersive exploration in high-energy physics.

        Speaker: Martin Vala (Pavol Jozef Safarik University (SK))
      • 11:40
        ALICE File Consistency Check System: A new solution based on EOS FSCK 10m

        Undoubtedly, the processing of ALICE experiment data relies on the quality and integrity of data. Currently, ALICE uses a distributed file crawler that periodically evaluates samples of files from each storage element in order to gather statistics about the number of corrupted or inaccessible files. The main issue with this solution lies in its inability to provide a comprehensive overview of a storage element status, as the analysis results are based on examining a random selection of files. This presentation will describe a new solution for the ALICE File Consistency Check System. The new approach will overcome the limitations of the file crawler by using the powerful consistency checking tools provided by EOS. The idea behind this project is to collect all the existing errors on an EOS instance from the reports generated by the FSCK command with the goal of reconciling the contents of the local storage with the central catalogue and, where possible, recover the lost content from other replicas. The output of the FSCK report command will be accessed through the new HTTP interface available in the latest versions of EOS.
        Hence, this solution not only produces a more accurate integrity analysis but also automates the recovery of lost data.

        Speaker: Andreea Prigoreanu (IT-SD)
      • 11:50
        EOS Benchmarks at CERN 10m

        Before and after Run 3, we conducted several performance benchmarks on various EOS instances at CERN in preparation for DC'24 and Run 3 in 2024. The presentation will report the findings of these benchmarks.

        Speaker: Andreas Joachim Peters (CERN)
    • 13:25 - 15:40
      EOS Operations: Site Deployments & Operations 31/3-004 - IT Amphitheatre

      • 13:35
        eosxd-csi: Mounting EOS volumes in Kubernetes 10m

        Kubernetes is seeing huge adoption across the cloud, including at CERN. Already used in production, the EOSxd-CSI driver exposes EOS volumes as regular PersistentVolumeClaims that can be mounted by containerized workloads. This talk will touch on the storage stack in Kubernetes and how EOS fits into the mix, and will end with a demo showing the CSI driver in action.

        Speaker: Robert Vasek (CERN)
      • 13:45
        EOS4Physics Operations 20m
        Speaker: Dr Maria Arsuaga Rios (CERN)
      • 14:05
        EOS ALMA9 Migration 15m
        Speaker: Ioanna Vrachnaki
      • 14:20
        EOS Hardware Procurement 10m
        Speaker: Luca Mascetti (CERN)
      • 14:30
        Long Term Monitoring with Prometheus + Thanos 15m

        The Storage and Data Management Group at CERN manages 20 EOS instances corresponding to almost 1000 servers and 100,000 disks. Having a good monitoring and alerting system is crucial not only for day-to-day activities but also as a tool to record the evolution of our services over time. This talk presents an overview of the monitoring tools in use, especially with regard to long-term metric preservation.

        Speaker: Roberto Valverde Cameselle (CERN)
      • 14:45
        Comparison of High-Performance Distributed File Systems on two Platforms: Linux and Windows 15m

        Subtitle: Superiority of EOS-based Comtrade Distributed File System (CDFS) for Earth Observation Data Storage.

        Introduction

        The rising quantity of data collected necessitates transitioning to the next generation of reliable, high-performance data storage solutions. Despite the clear need for high-performance storage across many industries, there has yet to be a consensus on the optimal high-performance storage technology that would suit all user needs.

        The project's goal was to run speed tests whose results can be combined with currently available state-of-the-art technology results, technological capabilities and existing expertise.

        Methodological approach

        We chose representative high-performance storage solutions and compared them on two different platforms for storage clients.

        • Linux:
          - Ceph
          - EOS
          - IBM Spectrum Scale
          - Hadoop
        • Windows:
          - Ceph
          - EOS-drive
          - EOS through Samba
          - EOS-wnc
          - Hadoop

        An example specification of the testing environment is as follows.

        • Management node:
          - 32 threads
          - 2x Intel® Xeon® Silver 4208 Processor
          - 384 GB of RAM
          - 2x SSDs of 2 TB.
        • Storage nodes:
          - 32 threads
          - 2x Intel® Xeon® Silver 4208 Processor
          - 64 GB of RAM
          - 1x SSD of 2 TB for the operating system
          - 6x HDDs of 2 TB for data.
        • Client nodes:
          - 12 threads
          - 1x Intel® Core™ i5-12400 Processor
          - 16 GB of RAM
          - 1x SSD with 1 TB.

        Results

        The results come from tests performed separately for each high-performance solution, in an isolated environment but on the same hardware. For each platform, there were three categories of tests related to file sizes: (1) small, (2) medium, and (3) large files.

        According to the results, some high-performance file systems have evident advantages, as shown in our presentation. These results should serve as the starting point for a more exact comparison between these file systems and as a guide for choosing the right high-performance file system.

        Speaker: Gregor Molan (Comtrade 360's AI Lab)
      • 15:00
        CERNBox Update 10m
        Speaker: Emmanouil Bagakis (CERN)
      • 15:10
        Site Report Vienna Tier-2 15m

        We share our operational experience of running a converged EOS instance for three experiments (CMS, ALICE, Belle).

        • In 2022 we've extended the capacity of the EOS cluster from 9 to 15 FSTs
        • In 2023 we've successfully updated to EOS 5.2.2
        • The system deployment and all operational tasks are fully automated with Ansible. Most recently a gap was closed with the automation of host certificate rotation.
        Speaker: Erich Birngruber (Austrian Academy of Sciences (AT))
      • 15:25
        EOS Operation Status at KISTI Tier-1 for ALICE experiment 15m

        We have been running disk and archive (replacing tape) storage for the ALICE experiment with EOS for several years. In 2023, we upgraded our EOS instances from v4 to v5. More recently we deployed EOS based on Podman (instead of Docker), which is adopted as the native container runtime in EL9 distributions. In this work, we present the current status of EOS operation at the KISTI Tier-1 centre based on Podman containers with systemd integration.

        Speaker: Sang Un Ahn (Korea Institute of Science & Technology Information (KR))
    • 15:40 - 15:55
      Coffee Break 15m 31/3-009 - IT Amphitheatre Coffee Area

    • 15:55 - 17:10
      EOS Operations 31/3-004 - IT Amphitheatre

      • 15:55
        EOS site report of the Joint Research Centre 15m

        The Joint Research Centre (JRC) of the European Commission is running the Big Data Analytics Platform (BDAP) to enable the JRC projects to store, process, and analyze a wide range of data and disseminate data products. The platform evolved as a core service for JRC scientists to produce knowledge and insights in support of EU policy making.

        EOS is the main storage system of the BDAP for scientific data. It has been in increasing use at JRC since 2016. The Big Data Analytics Platform is actively used by more than 90 JRC projects, covering a wide range of data analytics activities. The EOS instance at JRC currently has a gross capacity of 34 PB, with an additional increase planned throughout 2024.

        The presentation will give an overview of EOS as the storage back-end of the Big Data Analytics Platform. It covers the general setup and current status, experience gained, issues discovered, and an outlook on planned activities and changes in 2024.

        Speaker: Armin Burger (EC Joint Research Centre)
      • 16:10
        EOS Status at IHEP 15m

        The Institute of High Energy Physics (IHEP) has been utilizing the EOS storage system since 2016, with the current capacity nearing 60 PB. We use EOS to provide both disk and tape storage solutions for HEP experiments. This report will introduce some of the work we carried out in 2023, which includes:
        1. The adoption of EOS as SE for the WLCG Tier2, replacing the previous DPM system, and the provision of SE and CTA services for the LHCb Tier1 via EOS.
        2. The upgrade of our system from EOS V4 to V5, which addressed the issue of high concurrency demands posed by the LHAASO experiment instance.
        3. The establishment of a new ARM EOS cluster, which has reached a capacity of 2PB.
        Additionally, we have conducted extensive testing and validation of EOS on AlmaLinux. Our plan is to migrate our entire computing cluster to AlmaLinux 9.2 by the end of June this year.

        Speaker: LI Haibo
      • 16:25
        EOS at the Fermilab LHC Physics Center 15m

        Fermilab has been running an EOS instance since testing began in June 2012. By May 2013, before it became production storage, 600 TB had been allocated for EOS. Today, approximately 13 PB of storage is available in the EOS instance.

        The LPC cluster is a 4500-core user analysis cluster with 13 PB of EOS storage. The LPC cluster supports several hundred active CMS users at any given time.

        This talk gives an update on our current experiences and challenges running an EOS instance for the Fermilab LHC Physics Center (LPC) computing cluster, including the planning and implementation of our upgrade to EOS 5 and the move to AlmaLinux before the EOL of SL7.

        Speaker: Dan Szkola (Fermi National Accelerator Lab. (US))
      • 16:40
        Purdue EOS status report 15m

        Purdue University switched to EOS storage a couple of years back, and we are still exploring/appreciating the benefits of this distributed file-system. In this talk we will give a brief status report and will outline our plans for using EOS at the CMS Tier-2 center at Purdue.

        Speaker: Stefan Piperov (Purdue University (US))
      • 16:55
        Shared EOS instance at JINR 15m

        The Joint Institute for Nuclear Research (JINR) utilizes a diverse storage ecosystem, encompassing a number of dCache, EOS and Ceph storage instances to address varied scientific needs. In this report we focus on one of the largest EOS instances at JINR, operational since 2019 as a shared storage system for experimental data. Presently, it has the capacity of 22 PB and hosts over 7 PB of data from numerous experiments and projects. We give an overview of the current setup, development plans and share our experience of operating it.

        Speaker: Nikita Balashov (Joint Institute for Nuclear Research (RU))
    • 17:10 - 17:30
      Final Session: Outlook, Roadmap 31/3-004 - IT Amphitheatre
