EOS Workshop

Europe/Zurich
31/3-004 - IT Amphitheatre (CERN)

Description

The first EOS workshop will bring together the EOS community.

The two-day event at CERN is organized to provide a platform for exchange between developers, users and sites running EOS.

Outline

The EOS development teams will present the current state of the art, best practices and the future road map.

In particular, we aim to discuss the architecture and status of

  • a new highly available scale-out namespace
  • EOS as a filesystem

and three associated core projects

  • XRootD as the core platform of EOS
  • CERNBOX for Sync&Share
  • CTA for tape archive integration & the EOS workflow engine

We warmly invite sites to present their current deployment, operational experiences, possible future deployment plans and input for improvements.

We encourage experiment representatives to share their views on the future of (physics) storage and their usage of EOS.

Hands-on sessions are foreseen to enable exchange of information between operation teams at different sites and the development teams.

The first day will finish with a social dinner (at your own expense).

Fees

Participation in the workshop is free of charge. Coffee breaks will be provided.

If you are interested in joining the EOS community, this is the perfect occasion!

Please register for the workshop here. Don't forget to submit an abstract if you want to share your experience and ideas with the EOS community.

We hope to see many of you in February!

Your CERN EOS team.

Registration
EOS Workshop registration form
Participants
  • Alberto Pace
  • Aleksandr Mayorov
  • Aleksei Golunov
  • Alex Dodson
  • Andrea Manzi
  • Andreas Joachim Peters
  • Andreas Pfeiffer
  • Andrey Kirianov
  • Andrey Zarochentsev
  • Armin Burger
  • Arturo Sanchez Pineda
  • Baosong Shan
  • Belinda Chan
  • Bo Jayatilaka
  • Costin Grigoras
  • Dan van der Ster
  • Daniel Valbuena Sosa
  • Danilo Piparo
  • David Jericho
  • Denis Pugnere
  • Dirk Duellmann
  • Elvin Alin Sindrilaru
  • Enric Tejedor Saavedra
  • Eric Cano
  • Franck Eyraud
  • Geoffray Michel Adde
  • Georgios Bitzes
  • Gerardo Ganis
  • German Cancio Melia
  • Giane Franca
  • Gianluca Cerminara
  • Giovanni Franzoni
  • Giuseppe Lo Presti
  • Haibo Li
  • Herve Rousseau
  • Jakob Blomer
  • Jakub Moscicki
  • Jan Iven
  • Jasiek Otto
  • Jean-Michel Barbet
  • Jeff Porter
  • Jinghua Liu
  • Joel Closier
  • Jozsef Makai
  • Julien Leduc
  • Lars Holm Nielsen
  • Latchezar Betev
  • Luca Mascetti
  • Marco Rovere
  • Maria Arsuaga Rios
  • Martin Vala
  • Massimo Lamanna
  • Maxime Godonou-Dossou
  • Michal Simon
  • Miguel Martinez Pedreira
  • Mihai Ciubancan
  • Mikhail Matveev
  • Nathan Boyles
  • Oliver Keeble
  • Pavlo Svirin
  • Pete Eby
  • Qi Mengyao
  • Radu Popescu
  • Rokas Maciulaitis
  • Roman Semenov
  • Ruben Domingo Gaspar Aparicio
  • S Harish
  • Sabah Salih
  • Sebastian Pulido
  • Sehlabaka Qhobosheane
  • Steven Murray
  • Thierry Viant
  • Valery Mitsyn
  • Veselin Vasilev
  • Xavier Espinal
  • Yuri Butenko
  • Thursday, 2 February
    • 09:00 - 10:10
      EOS Workshop: Starting Session 31/3-004 - IT Amphitheatre

      • 09:00
        Registration/Arrival 30m
      • 09:30
        Workshop Introduction 5m

        An introduction to the EOS workshop.

        Speaker: Andreas Joachim Peters (CERN)
      • 09:35
        Community and Communication 5m

        This is a quick recap of the EOS community, a who's who, and our communication platforms.

        Speaker: Andreas Joachim Peters (CERN)
      • 09:40
        EOS Overview & Aquamarine Production Version 15m

        This presentation will give a brief introduction to EOS and the Aquamarine release version.

        Speaker: Andreas Joachim Peters (CERN)
      • 09:55
        EOS Services at CERN 15m

        This presentation summarizes the current EOS service deployment at CERN.

        Speaker: Herve Rousseau (CERN)
    • 10:10 - 10:30
      Coffee break 20m 31/3-004 - IT Amphitheatre

    • 10:30 - 12:00
      EOS Workshop: Use Cases & Experiences 1 31/3-004 - IT Amphitheatre

      • 10:30
        CERNBox Service Overview Presentation 20m

        Introduction to the CERNBox Service at CERN for those who do not know it yet.

        Speakers: Luca Mascetti (CERN), Jakub Moscicki (CERN), Hugo Gonzalez Labrador (CERN)
      • 10:50
        Collaborative Editing in CERNBOX 10m

        The Web Application Open Platform Interface (WOPI) protocol makes it possible to integrate collaborative editing with Office Online applications in CERNBOX/EOS.

        Speaker: Giuseppe Lo Presti (CERN)
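
        As a rough illustration of the protocol, the sketch below implements the CheckFileInfo call that WOPI clients issue to discover file metadata before opening a document. The endpoint path and response fields follow the public WOPI specification; the in-memory file table and all values are hypothetical stand-ins, not the actual CERNBOX implementation.

          # Minimal WOPI CheckFileInfo endpoint (illustrative sketch, standard library only).
          import json
          from http.server import BaseHTTPRequestHandler, HTTPServer

          # Hypothetical file table standing in for the real EOS-backed store.
          FILES = {"42": {"BaseFileName": "report.docx", "Size": 1024,
                          "UserFriendlyName": "A. User", "SupportsUpdate": True}}

          class WopiHandler(BaseHTTPRequestHandler):
              def do_GET(self):
                  # WOPI clients call GET /wopi/files/<id> to retrieve file metadata.
                  prefix = "/wopi/files/"
                  if not self.path.startswith(prefix):
                      self.send_error(404)
                      return
                  info = FILES.get(self.path[len(prefix):].split("?")[0])
                  if info is None:
                      self.send_error(404)
                      return
                  body = json.dumps(info).encode()
                  self.send_response(200)
                  self.send_header("Content-Type", "application/json")
                  self.end_headers()
                  self.wfile.write(body)

          if __name__ == "__main__":
              HTTPServer(("localhost", 8080), WopiHandler).serve_forever()
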
      • 11:00
        EOS at 6,500 Kilometres Wide 20m

        Impressed by CERN's EOS filesystem, which provides a POSIX-like capability on top of a tried and tested large-scale storage system, AARNet runs multiple EOS clusters, the most interesting being a three-site single namespace with replicas spanning 65 ms of latency. This filesystem is used for user data delivered via ownCloud, FileSender and an internally developed tool for fast parallel bundled uploads.

        The filesystem currently holds 2 petabytes, split between Perth, Melbourne and Brisbane in Australia, and plans are to grow it well into the tens of petabytes within the next year.

        AARNet has tried multiple scale-out filesystems over the years in order to get true geographic replication and a single namespace across the Australian continent, but so far only EOS has had the capability of delivering within reasonable constraints.

        This talk will cover some of the first-hand experiences with running and maintaining the system, and with debugging issues found along the way.

        Speaker: Mr David Jericho (AARNet)
      • 11:20
        EOS as storage back-end for Earth Observation data processing at the EC Joint Research Centre 25m

        The Copernicus Programme of the European Union, with its fleet of Sentinel satellites, will generate up to 10 terabytes of Earth Observation (EO) data per day once at full operational capacity. These data, combined with other geo-spatial data sources, form the basis of many JRC knowledge production activities. In order to handle this large volume of data and its processing, the JRC Earth Observation Data and Processing Platform (JEODPP) was implemented. This platform is built upon commodity hardware. It consists of processing servers currently totalling 500 cores and 8 TB of RAM, using 10 Gb/s Ethernet connectivity to access the EOS storage back-end. The EOS instance currently runs on 10 storage servers with a total gross capacity of 1.4 petabytes, with a scale-up foreseen in 2017. EOS was deployed on the JEODPP thanks to the CERN-JRC collaboration. In conjunction with HTCondor as workload manager, EOS allows for optimal load distribution during massive processing. The processing jobs are containerised with Docker technology to support different requirements in terms of software libraries and processing tools.

        This presentation details the JEODPP platform with emphasis on its EOS instance, which uses the FUSE client on the processing servers for all data access tasks. Low-level I/O benchmarking and real-world applications of EO data processing tasks show good scalability of the storage system in a cluster environment. Issues encountered during data processing and service set-up are also described together with their current solutions.

        Speakers: Mr Armin Burger (European Commission Joint Research Centre), Mr Veselin Vasilev (European Commission Joint Research Centre)
      • 11:45
        User and group EOS storage management at the CMS CERN Tier-2 15m

        A wide range of detector commissioning, calibration and data analysis tasks is carried out by members of the Compact Muon Solenoid (CMS) collaboration using dedicated storage resources available at the CMS CERN Tier-2 centre.
        Relying on the functionalities of the EOS storage technology, the optimal exploitation of the CMS user and group resources has required the introduction of policies for data access management, data protection, cleanup campaigns based on access patterns, and long-term tape archival.
        Resource management has been organised around the definition of working groups, with the composition of each group delegated to an identified responsible person.
        In this contribution we illustrate the user and group storage management, and the development and operational experience at the CMS CERN Tier-2 centre in the 2012-2016 period.

        Speakers: Gianluca Cerminara (CERN), Andreas Pfeiffer (CERN), Giovanni Franzoni (CERN)
    • 12:00 - 13:30
      Lunch break (Rest. 2) 1h 30m 31/3-004 - IT Amphitheatre

    • 13:30 - 16:00
      EOS Workshop: A CITRINE Future 31/3-004 - IT Amphitheatre

      • 13:30
        XRootD client developments 15m

        XRootD is a distributed, scalable system for low-latency file access. It is the primary data access framework for the high-energy physics community and the backbone of EOS. One of the latest developments in the XRootD client has been to incorporate Metalink and segmented file transfer technologies. We also report on the implementation of signed requests and ZIP archive support.

        Speaker: Dr Michal Kamil Simon (CERN)
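
        For readers new to the client, a minimal example of remote file access through the official XRootD Python bindings is sketched below; the server URL and path are placeholders.

          # Reading a remote file through the XRootD client from Python.
          from XRootD import client
          from XRootD.client.flags import OpenFlags

          with client.File() as f:
              status, _ = f.open("root://eos.example.org//eos/demo/file.dat", OpenFlags.READ)
              if not status.ok:
                  raise IOError(status.message)
              status, data = f.read(offset=0, size=1024)  # fetch the first kilobyte
              print("read", len(data), "bytes")
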
      • 13:45
        Lessons Learned / Architectural Evolution 10m

        We will review our experience from 5 years of EOS in production and introduce a generic architectural evolution.

        Speaker: Andreas Joachim Peters (CERN)
      • 13:55
        EOS namespace on top of a key-value store 20m

        CERN has been developing and operating EOS as a disk storage solution successfully for over 6 years. The CERN deployment provides 140 PB of storage and more than 1.4 billion replicas distributed over two computer centres. The deployment includes four LHC instances, a shared instance for smaller experiments and, for the past couple of years, an instance for individual user data as well. The user instance represents the backbone of the CERNBOX service for file sharing. New use cases like synchronisation and sharing, the planned migration to reduce AFS usage at CERN and continuous growth have confronted EOS with new challenges.

        Recent developments include the integration and evaluation of various technologies to make the transition from a single active in-memory namespace to a scale-out implementation distributed over many meta-data servers. The new architecture aims to separate the data from the application logic and user interface code, thus providing flexibility and scalability to the namespace component. The aim of the new design is to address the two main challenges of the current in-memory namespace: reducing the boot-up time of the namespace and removing the dependency between the namespace size and the amount of RAM required to accommodate it. In order to achieve all this, we've developed an in-house solution for the back-end that combines interesting concepts from existing projects like Redis, RocksDB and XRootD, as well as a state-of-the-art consensus protocol, Raft.

        This presentation details the basic concepts and the technology used to develop the key-value backend and the namespace interface modifications required to accommodate both the old and the new implementations. Furthermore, we explain the necessary steps to configure the new setup and the implications it has on the service availability and deployment process. Last but not least, we present some preliminary performance metrics of the new system that represent the basis for comparison with the request rates that we currently observe in production.

        Speaker: Elvin Alin Sindrilaru (CERN)
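
        To make the key-value idea concrete, the toy sketch below shows one way a hierarchical namespace can be flattened into key-value pairs; the key schema is invented for illustration and is not the actual EOS layout.

          # Toy namespace on a key-value store: metadata lives under flat keys,
          # so a directory listing becomes point lookups rather than a walk over
          # a fully memory-resident tree.
          kv = {}

          def mkdir(cid, parent_cid, name):
              kv["cont:%d" % cid] = {"parent": parent_cid, "name": name}
              kv.setdefault("subdirs:%d" % parent_cid, set()).add(cid)

          def create_file(fid, parent_cid, name, size):
              kv["file:%d" % fid] = {"parent": parent_cid, "name": name, "size": size}
              kv.setdefault("files:%d" % parent_cid, set()).add(fid)

          mkdir(1, 0, "eos")
          mkdir(2, 1, "user")
          create_file(10, 2, "notes.txt", 4096)
          print([kv["file:%d" % fid]["name"] for fid in kv["files:2"]])  # ['notes.txt']
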
      • 14:15
        quarkdb - a highly-available backend for the EOS namespace 20m

        quarkdb will soon become the storage backend for the EOS namespace. Implemented on top of RocksDB, a key-value store developed by Facebook, quarkdb offers a Redis-compatible API and high availability through replication.

        In this talk, I will go through some design decisions of quarkdb and detail how replication is achieved through the Raft consensus algorithm.

        Speaker: Georgios Bitzes (CERN)
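
        Because quarkdb speaks the Redis wire protocol, any standard Redis client can talk to it. The sketch below uses the Python redis package; the host, port and key layout are chosen purely for illustration.

          # Exercising a Redis-compatible endpoint such as quarkdb with redis-py.
          import redis

          qdb = redis.Redis(host="quarkdb.example.org", port=7777)
          qdb.hset("cont:2", "notes.txt", "fid:10")  # one hash per container (illustrative)
          print(qdb.hget("cont:2", "notes.txt"))     # b'fid:10'
          print(qdb.hlen("cont:2"))                  # number of entries in the hash
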
      • 14:35
        CITRINE Scheduler 20m

        The CITRINE release provides a completely re-engineered scheduling algorithm for geographic-aware file placement. The presentation will highlight key concepts such as the scheduling tree, proxy groups and file stickiness.

        Speaker: Geoffray Michel Adde (CERN)
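
        To give a flavour of what geographic-aware placement means, here is a deliberately simplified toy selector that spreads replicas across distinct geo-tags before doubling up within one site; it is a stand-in for the idea, not the CITRINE algorithm.

          # Toy geo-aware replica placement: pick one filesystem per geo-tag first.
          from collections import defaultdict
          from itertools import cycle

          def place(filesystems, nrep):
              by_tag = defaultdict(list)
              for fs in filesystems:
                  by_tag[fs["geotag"]].append(fs)
              chosen, tags = [], cycle(sorted(by_tag))
              while len(chosen) < nrep and any(by_tag.values()):
                  tag = next(tags)
                  if by_tag[tag]:
                      chosen.append(by_tag[tag].pop(0))
              return chosen

          fss = [{"id": 1, "geotag": "CH::Geneva"},
                 {"id": 2, "geotag": "HU::Wigner"},
                 {"id": 3, "geotag": "CH::Geneva"}]
          print([fs["id"] for fs in place(fss, 2)])  # [1, 2] -- one replica per site
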
      • 14:55
        EOS as a filesystem 20m

        During 2016, the usage and scope of EOS at CERN and elsewhere have been evolving towards use as a POSIX-like filesystem. The presentation will highlight the current state and the improvements made during the last year. We will introduce the next generation (a reimplementation) of the FUSE client in EOS, which should overcome most limitations and non-POSIX behaviour by adding dedicated server-side support for leases and client cache management.

        Speaker: Andreas Joachim Peters (CERN)
      • 15:15
        Strong Authentication in FUSE 15m

        We have added a mechanism, similar to that of AFS/Kerberos, to bind user applications to user credentials when interacting with EOS over FUSE. In this presentation we will describe the implementation and its integration into the login mechanism on interactive and batch nodes at CERN.

        Speaker: Geoffray Michel Adde (CERN)
      • 15:30
        EOS-FUSE in CERN's DevOps Infrastructure 15m

        This talk will present the DevOps workflow used to validate and deploy new eos-fuse releases to the CERN computing infrastructure.

        Speaker: Dan van der Ster (CERN)
    • 16:00 - 16:30
      Coffee break 30m 31/3-004 - IT Amphitheatre

    • 16:30 - 18:00
      EOS Workshop: hands-on: Your quarkdb and CITRINE 513/1-024

      • 16:30
        quarkdb live setup and demo 15m

        In this short presentation, we'll show a live demo of setting up and operating a quarkdb cluster. We demonstrate what happens during a failover, how node "resilvering" works, and show that adding or removing nodes on-the-fly does not impact availability.

        Speaker: Georgios Bitzes (CERN)
      • 16:45
        CITRINE setup & configuration 1h 15m

        In this session we will go through the steps to set up a CITRINE instance with the new quarkdb backend and explain the migration from BERYL to CITRINE.

        Speakers: Elvin Alin Sindrilaru (CERN), Geoffray Michel Adde (CERN), Georgios Bitzes (CERN)
  • Friday, 3 February
    • 09:00 - 10:30
      EOS Workshop: Use Cases & Experiences 2 31/3-004 - IT Amphitheatre

      • 09:00
        Experience with simple cluster setups 20m

        The goal of this talk is to share experience installing and using EOS storage at small clusters/sites for local users and students. The simple cluster setup comprises FreeIPA (Kerberos and LDAP) for authentication with its own certificate authority, the SLURM queuing system, CVMFS for software distribution, and EOS as storage for data and home directories. GitLab is used for development and issue tracking, and each cluster also includes its own Indico site for managing meetings. Three clusters have been installed already (the Hybrilit cluster in Dubna, Russia; iThemba LABS; and SPSEKE in Slovakia).

        Speaker: Martin Vala (Technical University of Kosice (SK))
      • 09:20
        A user's perspective: experiences with EOS integration in the Invenio digital framework and XRootDPyFS 20m

        We will present our recent experiences with integrating EOS into the Invenio digital library framework and how EOS allows data repository services such as Zenodo to handle large files in an efficient and scalable manner. Invenio v3, the underlying framework for a number of data preservation repositories such as CERN Open Data, CERN Document Server and Zenodo, was completely rebuilt from the ground up during 2016. In particular, Invenio's file handling layer was completely revamped in order to support multiple storage backends via the PyFilesystem library as well as handling of large files. We will present both the Invenio layer and the benefits and obstacles we encountered using EOS, as well as the XRootDPyFS plugin for PyFilesystem, which provides access to EOS over XRootD for any Python application.

        Speaker: Lars Holm Nielsen (CERN)
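
        A short taste of the XRootDPyFS interface discussed above; the EOS endpoint URL is a placeholder.

          # XRootDPyFS exposes an XRootD/EOS endpoint through the PyFilesystem API.
          from xrootdpyfs import XRootDPyFS

          fs = XRootDPyFS("root://eos.example.org//eos/demo/")
          with fs.open("records.json", "w") as f:  # same code works on any PyFilesystem backend
              f.write(u"{}")
          print(fs.listdir("."))
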
      • 09:40
        EOS Usage at IHEP 20m

        There are many large scientific projects at the Institute of High Energy Physics (IHEP), such as BESIII, JUNO and LHAASO. These experiments have a huge demand for massive data storage, and EOS, as an open-source distributed disk storage system, provides a good solution. IHEP has deployed two EOS instances: one is used for batch computing, and the other for public usage (ownCloud + EOS). In this presentation, we will introduce our deployment status and discuss our future plans for EOS.

        Speaker: Haibo Li (Institute of High Energy Physics, Chinese Academy of Sciences)
      • 10:00
        Russian Federated Data Storage System Prototype 20m

        In our talk we will cover the development and implementation of a federated data storage prototype for WLCG centers of different levels and university clusters within one Russian national cloud. The prototype is based on computing resources located in Moscow, Dubna, St. Petersburg, Gatchina and Geneva. The project intends to implement a federated distributed storage for all kinds of operations, with access from Grid centers, university clusters, supercomputers, and academic and commercial clouds. The efficiency and performance of the system are demonstrated using synthetic and experiment-specific tests, including real data processing and analysis workflows from the ATLAS and ALICE experiments. We will present the topology and architecture of the designed system and report performance and statistics for different access patterns.

        Speaker: Mr Andrey Kirianov (Petersburg Nuclear Physics Institute (RU))
    • 10:30 - 11:00
      Coffee break 30m 31/3-004 - IT Amphitheatre

    • 11:00 - 12:00
      EOS Workshop: Workflows, Tape & REST 31/3-004 - IT Amphitheatre

      • 11:00
        Workflow Support 10m

        A brief introduction to the EOS workflow engine.

        Speaker: Andreas Joachim Peters (CERN)
      • 11:10
        CTA: the tape backend for EOS 20m

        The CERN Tape Archive (CTA) will provide EOS instances with a tape backend. It inherits from CASTOR's tape system, but will provide a new request queuing system allowing more efficient utilization of tape resources.

        In this presentation we will describe CTA's architecture and the project's status.

        Speaker: Eric Cano (CERN)
      • 11:30
        Docker and Kubernetes for CTA 20m

        The IT Storage group at CERN develops the software responsible for archiving to tape the custodial copy of the physics data generated by the LHC experiments. This software is code-named CTA (the CERN Tape Archive).
        It needs to be seamlessly integrated with EOS, which has become the de facto disk storage system provided by the IT Storage group for physics data.

        CTA and EOS integration requires parallel development of features in both products, which need to be synchronized and systematically tested on a dedicated distributed development infrastructure for each commit to the code base.

        This presentation describes the full continuous-integration workflow that deploys and orchestrates all the needed services in Docker containers on our dedicated Kubernetes infrastructure.

        Speaker: Julien Leduc (CERN)
      • 11:50
        REST API 5m

        An introduction to using the EOS REST API for management and user interfaces.

        Speaker: Andreas Joachim Peters (CERN)
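
        For flavour, a hypothetical call against the MGM's HTTP interface is sketched below; the host, port, endpoint and query parameters are assumptions for illustration only and should be checked against the actual API.

          # Hypothetical example of issuing an EOS management command over HTTP;
          # the endpoint and parameter names are assumptions, not a documented contract.
          import requests

          resp = requests.get(
              "http://eos-mgm.example.org:8000/proc/user/",
              params={"mgm.cmd": "ls", "mgm.path": "/eos/demo"},
          )
          print(resp.status_code)
          print(resp.text)
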
      • 11:55
        Admin UI 5m

        The talk describes a graphical interface, currently under development, to help administer an EOS cluster.

        Speaker: Andrea Manzi (CERN)
    • 12:00 - 13:30
      Lunch break (Rest. 2) 1h 30m 31/3-004 - IT Amphitheatre

    • 13:30 - 15:00
      EOS Workshop: hands-on: CERNBOX and EOS, Workshop Closing 513/1-024

      • 13:30
        Your MINIO - Amazon S3 for EOS 10m

        MinIO is a Go implementation of an Amazon S3-compatible server that exports a filesystem through an elegant browser interface. We will demonstrate how to get a user-private, AWS S3-compatible server exporting files from your EOS instance in a few minutes.

        Speaker: Andreas Joachim Peters (CERN)
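
        Once such a server is running, any S3 client can use it, for example boto3; the endpoint, credentials and file names below are placeholders for the per-user instance.

          # Talking to an S3-compatible MinIO endpoint with boto3.
          import boto3

          s3 = boto3.client(
              "s3",
              endpoint_url="http://localhost:9000",  # placeholder MinIO endpoint
              aws_access_key_id="DEMO-ACCESS-KEY",   # placeholder credentials
              aws_secret_access_key="DEMO-SECRET-KEY",
          )
          s3.create_bucket(Bucket="eos-demo")
          s3.upload_file("results.root", "eos-demo", "results.root")  # assumes a local file
          objs = s3.list_objects_v2(Bucket="eos-demo").get("Contents", [])
          print([o["Key"] for o in objs])
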
      • 13:40
        Your CERNBOX Setup 45m

        Demonstration and deployment of a simplified CERNBox service instance.

        Speakers: Hugo Gonzalez Labrador (CERN), Jakub Moscicki (CERN), Luca Mascetti (CERN)
      • 14:25
        Roundtable Discussion 30m
        Speakers: Andreas Joachim Peters (CERN), Dan van der Ster (CERN), Elvin Alin Sindrilaru (CERN), Geoffray Michel Adde (CERN), Herve Rousseau (CERN), Hugo Gonzalez Labrador (CERN), Jakub Moscicki (CERN), Luca Mascetti (CERN), Massimo Lamanna (CERN), Xavier Espinal Curull (Universitat Autònoma de Barcelona (ES))
      • 14:55
        Closing Remarks 5m