-
Andreas Joachim Peters (CERN)07/03/2022, 09:00
-
Elvin Alin Sindrilaru (CERN)07/03/2022, 09:15
-
Dr Maria Arsuaga Rios (CERN)07/03/2022, 09:35 EOS Operations
General description of the EOS service @CERN
-
Latchezar Betev (CERN)07/03/2022, 09:55
The ALICE detector and data acquisition system were substantially upgraded for Run3 and beyond. One of the main elements of the upgrade was the O2 processing cluster, which compresses the detector data in real time. The output of the compression is then written to an EOS buffer for subsequent asynchronous data processing and archival. The requirements for the EOS storage are substantial: 120GB/sec...
-
Michal Kamil Simon (CERN)07/03/2022, 10:35
General update from XRootD project.
-
Abhishek Lekshmanan07/03/2022, 10:55
std::atomic, introduced in C++11, is used as a building block for lock-free programming. However, while the default flags provide the maximum consistency, they do come with a performance penalty and may not be what you want in all cases. We will look under the hood, at a high level, at what the processor sees when an atomic is encountered, the acquire and release semantics, which are...
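As an illustration of the acquire and release semantics mentioned in this abstract, here is a minimal sketch that is not taken from the talk itself. The function name `run_handoff` and the variables are hypothetical. It publishes a plain integer from one thread to another using `std::memory_order_release` and `std::memory_order_acquire` instead of the default `std::memory_order_seq_cst`.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical example: hand a non-atomic value from a producer thread to a
// consumer thread, relying only on release/acquire ordering of one flag.
int run_handoff() {
    int payload = 0;                 // ordinary, non-atomic data
    std::atomic<bool> ready{false};  // flag that guards the payload
    int seen = -1;                   // what the consumer observed

    std::thread producer([&] {
        payload = 42;                                  // 1. write the data
        ready.store(true, std::memory_order_release);  // 2. publish it
    });

    std::thread consumer([&] {
        // Spin until the release store becomes visible.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        // The acquire load synchronizes with the release store, so the
        // write to payload is guaranteed to be visible here.
        seen = payload;
    });

    producer.join();
    consumer.join();
    assert(seen == 42);
    return seen;
}
```

Calling `run_handoff()` from `main` exercises the hand-off once. Replacing both orderings with `std::memory_order_relaxed` would remove the guarantee that the consumer sees the payload write, which is the kind of trade-off between consistency and cost that the abstract alludes to.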
-
Dr Jaroslav Guenther (CERN)07/03/2022, 11:05
Improving EOS monitoring of finished transfers. Hands-on eos io stat output.
-
Aritz Brosa Iartza (CERN)07/03/2022, 11:20
Prometheus is a modern, simple and scalable monitoring system with an easy-to-use query language based on labels. The EOS Operators team has developed a fully functional EOS Prometheus exporter in Golang to monitor all EOS metrics. This includes space, group, node, filesystem, I/O and namespace stats collectors. In this talk, the tool will be showcased and made available to the EOS Community.
-
Michal Kamil Simon (CERN)07/03/2022, 11:30
Presentation on the new recording plug-in that allows I/O sampling and the replay tool.
-
Andreas Joachim Peters (CERN)07/03/2022, 11:45
With 100GE technology and erasure coding we discovered new bottlenecks and challenges. This presentation will recap the state of the art of the ALICEO2 EOS instance and show benchmarks including a real and a replayed physics analysis use case.
-
Enrico Bocchi (CERN)07/03/2022, 12:05
This contribution reports on the recent revamping of ScienceBox: The container-based stack for science with EOS, CERNBox, and SWAN services for Kubernetes-orchestrated clusters.
ScienceBox has been rebuilt from its foundations using modern cloud-native technologies for better service configuration and improved reliability, without compromising on deployment flexibility. Rethinking the whole...
-
Dr Maria Arsuaga Rios (CERN)07/03/2022, 15:45 EOS Operations
LHC Data Storage: RUN 3 Data Taking Commissioning
-
Erich Birngruber (Austrian Academy of Sciences (AT))07/03/2022, 16:05 Sites and Deployments
Update on the setup and operations at the Vienna Tier-2 site.
-
Dan Szkola (Fermi National Accelerator Lab. (US))07/03/2022, 16:25
Fermilab has been running an EOS instance since testing began in June 2012. By May 2013, before becoming production storage, there was 600TB allocated for EOS. Today, there is approximately 13PB of storage available in the EOS instance.
An update of our current experiences and challenges running an EOS instance for use by the Fermilab LHC Physics Center (LPC) computing cluster. The LPC...
-
Stefan Piperov (Purdue University (US))07/03/2022, 16:40
As part of its storage migration plan, the CMS Tier-2 center at Purdue University is preparing an EOS deployment of ~10PB, which will serve as the main Storage Element of the site, as well as a basis for the future Analysis Facility that’s in development at the moment. We adopted a fully containerized approach with Kubernetes, which allows us to better share available hardware resources...
-
Federico Fornari07/03/2022, 16:55
Due to the increasing interest in data management services capable of coping with very large data resources, allowing the future e-infrastructures to address the needs of the next generation extreme scale scientific experiments, the national center of INFN (Italian Institute for Nuclear Physics) dedicated to Research and Development on Information and Communication Technologies (CNAF) and the...
-
Cristian Contescu (CERN)07/03/2022, 17:10
In this talk we will highlight the operational challenges we faced while bringing up a high-throughput EOS instance for the Run 3 ALICE data acquisition. The journey started in 2020 and we are still perfecting the instance to this day.
During this time all storage nodes got migrated from CentOS 7 to CentOS 8 and, later on, CentOS Stream 8, and not without inherent challenges which we are...
-
Elvin Alin Sindrilaru (CERN)07/03/2022, 17:30
-
Sang Un Ahn (Korea Institute of Science & Technology Information (KR))08/03/2022, 09:00
This is going to be a brief presentation regarding the operation status of Custodial Disk Storage (CDS) system provided for the ALICE experiment as a Tape. The CDS system is basically using EOS with its erasure coding implementation (RAIN) for the data protection. The CDS joined the WLCG Tape Challenges in the previous year and about a PB of data has been transferred from the experiment. A...
-
Dr Emmanouil Vamvakopoulos (Université Paris-Saclay (FR))08/03/2022, 09:15
In this communication, we are going to present the deployment project of the EOS storage software solution at the GRIF site. GRIF is a distributed site made of four (4) different subsites, in different locations of the Paris region. The worst network latency between the subsites is within 2-4 msec with 3 of them connected with a 100G connection. The objective is to consolidate the four (4)...
-
Armin Burger (JRC)08/03/2022, 09:35
The Joint Research Centre (JRC) of the European Commission is running the Big Data Analytics Platform (BDAP) to enable the JRC projects to process and analyze a wide range of data, providing knowledge and insights in support of EU policy making.
EOS is the main storage system of the BDAP for scientific data. It has been in use at JRC since 2016. The gross capacity of 20 PB is currently in the...
-
Abhishek Lekshmanan08/03/2022, 09:55
This talk introduces the GroupBalancer and what it does. We also cover the GroupBalancer improvements introduced in the 4.8.78 release, the ways to configure them for deployments, some figures from existing deployments, and what the roadmap holds for these functionalities.
-
Dr Jaroslav Guenther (CERN)08/03/2022, 10:15
Migrating the AMS experiment data from EOSPUBLIC to EOSAMS02 stimulated development of tools which might be useful in general for similar exercises in the future. We will show the work in progress.
-
Andreas Joachim Peters (CERN)08/03/2022, 10:55
In preparation for Run-3 we have faced the following problem: we have to balance the usage of IO resources between individual activities, which has led to the implementation of IO priorities and bandwidth regulation policies. While commissioning the ALICEO2 EOS instance we have observed that write performance using the buffer cache is a bottleneck on storage nodes. Direct IO helps to improve...
-
Andreas Joachim Peters (CERN)08/03/2022, 11:10
With XRootD5, the on-the-wire protocol provides confidentiality of data inside the transport layer. However, data files are human readable on storage nodes and can be accessed and downloaded by any EOS administrator and any person with read access. Filesystem level encryption on storage nodes does not solve this confidentiality problem.
To provide better data privacy the most recent versions...
-
Andreas Joachim Peters (CERN)08/03/2022, 11:25
Physics and CERNBOX instances at CERN are exposed to O(4) mount clients simultaneously. Overloads from batch access are not new: for years the AFS filesystem has suffered more or less frequent volume overloads. During overload episodes, meta-data access at the MGM slows down significantly because thousands of batch nodes compete against a few interactive clients and sync & share access. To...
-
Michal Kamil Simon (CERN)08/03/2022, 11:35
A primer on new (and old) xrdcp features like zip append, metalink support, retries and many more.
-
Gregor Molan (Comtrade 360's AI Lab)08/03/2022, 11:45
Context: Productisation of Windows native connection of EOS to Windows operating system.
Objectives: The professional implementation of EOS for the Windows platform should allow seamless usage of EOS as a Windows local disk with all the EOS benefits, such as low latency, high throughput, and high reliability.
Method: Implementation of the EOS client for the Windows...
-
Manuel Reis (Universidade de Lisboa (PT))08/03/2022, 12:00
EOS durability machinery is a set of (operator's) scripts, tools and EOS components to classify, monitor and repair unhealthy files. EOS filesystem check (fsck) was enabled in 2021, but one should keep track of the instances' state, and investigate root causes for the problems found.
-
Hugo Gonzalez Labrador (CERN)08/03/2022, 15:30
CERNBox is a key enabler service built on top of EOS for users at CERN and beyond. The service is used by more than 37K users and stores over 15PB of data, representing all the user communities at the laboratory.
In this talk we will explain the current status of the service, the challenges we faced in 2021 and our vision for the future: CERNBox as the gateway for a federation of...
-
Roberto Valverde Cameselle (CERN)08/03/2022, 15:50
EOS provides the backend to CERNBox, the cloud sync and share service implementation used at CERN. EOS for CERNBox is storing 12PB of user and project space data across 9 different instances running in multi-fst configuration. This presentation will give an overview of 2021 challenges, how we tried to address them and talk about the roadmap for the service for 2022.
-
Gianmaria Del Monte (CERN)08/03/2022, 16:05
More than 300 million CERNBox files are processed daily using the cback backup tool, which ensures that files are safely stored in a different geographical area and on a different storage backend. The backup tool has not stopped evolving and was extended to support CephFS mount backup along with EOS mounts under the same infrastructure. This talk will present the current status of the project...
-
Roberto Valverde Cameselle (CERN)08/03/2022, 16:20
The CERNBox service is currently backed by 13PB of EOS storage distributed across more than 3,000 drives. EOS has proven to be a reliable and highly performing backend throughout. On the other hand, the CERN Storage Group also operates CephFS, which has been previously evaluated in combination with EOS as a potential solution for large scale physics data taking [1]. This work seeks to further...
-
Andreas Joachim Peters (CERN)08/03/2022, 16:35
To consolidate the concept of sharing implemented inside EOS for any access protocol, we are currently adding a new type of ACL which defines a 'share'. One of the new characteristics of a share ACL is that it is not influenced by POSIX or classic ACLs. We support additional ACL capabilities such as 'can share'.
A second important new concept is the concept of ownership by an EGROUP. Ownership...
-
Sami Mohamed Chebbi (CERN)08/03/2022, 16:50
EOS provides a very detailed log system which gives useful information about all the user and system operations that are performed at any time. Each EOS daemon has its own log file, and tracing operations that involve different components can be a time-consuming task (MGM -> FST1 -> FST2). With Grafana Loki and Promtail, we set up a logging aggregation system that allows tracing operations...
-
Aritz Brosa Iartza (CERN)08/03/2022, 17:05 10 Minutes
In this talk we present the evolution of the CERNBox Samba service that we operate in front of EOS. An important recent change is the adoption of a new layout based on bind mounts: this allows us to operate a smaller number of EOS mounts and to federate multiple EOS instances in a single namespace. We will discuss further measures adopted to address the ever increasing load from the...
-
Andreas Joachim Peters (CERN)08/03/2022, 17:15
Understanding the configuration and logic used by eosxd on /eos/ is not straightforward, in particular in containerized environments. This short presentation tries to explain the basics.
-
Ishank Arora (CERN)08/03/2022, 17:20
Access to CERNBox via social account providers and external emails provides a highly scalable and traceable mechanism to allow sharing of data and knowledge with people external to CERN, and encourage collaboration across boundaries and institutes. In this talk, we'll talk about how we adapted our service to accommodate such accounts with restricted scopes and describe the developments that...
-
Giuseppe Lo Presti (CERN)08/03/2022, 17:35
This contribution illustrates how we have evolved file locking in CERNBox and EOS. Initially introduced to support Office online applications, the functionality has been extended to be an integral part of Reva, the engine powering CERNBox. We will describe the implementation in the EOS storage system, and the foreseen extensions to cover Linux file locks (flocks) as supported for FUSE and...
-
Oliver Keeble (CERN)09/03/2022, 08:55
Introduction to the CTA session.
-
Mr Denis Lujanski Not Supplied09/03/2022, 09:05
In this presentation, we will report on how we at AARNet deployed CTA along with the restic backup client as a backup/archive solution for our production EOS clusters. The solution has been in production since late 2021. This presentation will aim to cover why we chose CTA, how CTA is deployed, and how it is integrated into our backup workflow.
-
Yujiang Bi (Institute of High Energy Physics, Chinese Academy of Sciences)09/03/2022, 09:20
EOS is now the main storage system for IHEP experiments like LHAASO and JUNO. Castor has long been used at IHEP to back up experiment data, but it has difficulty satisfying the data backup requirements of new experiments like LHAASO and JUNO. As EOSCTA became stable enough to replace Castor in production, we started the EOSCTA evaluation and the Castor migration. In this talk, we will give a brief...
-
Michael Davis (CERN)09/03/2022, 09:35
CTA entered into production at CERN in 2020 and physics data taking into CTA started in July 2021. 2022 will see the start of LHC Run-3, with combined experiment data rates up to 40 GB/s. This presentation will give an overview of CTA's preparation and readiness for the upcoming Run, as well as a look forward to software features in the development pipeline.
-
Julien Leduc (CERN)09/03/2022, 09:55 20 Minutes
An EOSCTA instance is an EOS instance, commonly called a tape buffer, configured with a CERN Tape Archive (CTA) back-end.
This EOS instance is entirely bandwidth oriented: it offers an SSD-based tape interconnection, it can contain spinning disks if needed, and it is optimized for the various tape workflows.
This talk will present how to enable EOS for tape using CTA and the Swiss horology...
-
Volodymyr Yurchenko (CERN)09/03/2022, 10:35 CTA
CTA uses the access mechanism provided by EOS and adds a tape-specific layer. If one of these elements is misconfigured, a user won't be able to read a file, or, on the contrary, unauthorized access can be granted.
This talk explains how the combination of the ACL, Unix permissions and mount rules works in CTA. We show which tools we use for the permissions management and what are capabilities...
-
Jorge Camarero Vera (CERN)09/03/2022, 10:50
Explanation of the CTA Tape Drive status during a data transfer session.
-
Miguel Barros (Universidade de Lisboa (PT))09/03/2022, 11:05
This talk summarizes the new file restoring feature of CTA: how it works, how to configure it, when it should be used, and its current limitations.
-
Richard Bachmann (CERN)09/03/2022, 11:20
This presentation summarizes the current effort to detect, and thereby subsequently remedy, inconsistencies in the file metadata stored on EOS and CTA.
We show how we combine and validate EOSCTA namespaces in order to produce a summary of healthy files for experiments and a troubleshooting tool for operators.
-
Ren Bauer (Fermi National Accelerator Lab. (US))09/03/2022, 16:00 CTA
Fermilab is the primary research lab dedicated to particle physics in the United States and is also home to the largest archival HEP data store outside of CERN. Fermilab currently employs an HSM based on Enstore, a Fermilab product, and dCache, for tape and disk, respectively. This Enstore+dCache HSM manages nearly 300 PB of active data on tape. Because of the necessary development work to...
-
Cedric Caffy (CERN)09/03/2022, 16:20 CTA
Imagine a world where SRM is no longer needed to dialog with tape storage systems. A world where only one standard protocol can be used across the entire WLCG to access tape storage systems.
This dream will soon become reality on EOS...
After several discussions about the specifications of the new WLCG tape REST API, a prototype of the final API has been developed in EOS.
In order to...
-
Dr George Patargias (STFC)09/03/2022, 16:40
This talk will present details of the deployment of Antares, the EOS-CTA service at RAL Tier-1, which replaces Castor.
-
Mr Tigran Mkrtchyan (DESY)09/03/2022, 17:00
The ever increasing amount of data produced by modern scientific facilities like EuXFEL or LHC puts high pressure on the data management infrastructure at the laboratories. This includes poorly shareable resources of archival storage, typically tape libraries. To achieve maximal efficiency of the available tape resources a deep integration between hardware and software components...
-
Michael Davis (CERN)09/03/2022, 17:20 CTA
CTA uses the same tape format as CASTOR. There is interest from the community in adding support to read (but not write) tapes in alternate formats, such as OSM and Enstore. The main use case is to allow sites to migrate from their existing tape storage system to CTA without needing to physically repack all of their tapes.
This BoF session will be a round-table for stakeholders with an...
-
Michal Kamil Simon (CERN)10/03/2022, 09:00 Erasure Encoding
Report on the latest tests done at SLAC with the native XRootD EC library.
-
Dr Andrea Sciabà (CERN)10/03/2022, 09:20
Physics analysis is done at CERN in several different ways, using both interactive and batch resources and EOS for data storage. In order to understand if and how the CERN computer centre should change the way analysis is supported for Run3, we performed several performance studies on two fronts: measuring the performance and utilisation levels of EOS with respect to the current analysis...
-
Andreas Joachim Peters (CERN)10/03/2022, 09:45
This presentation will introduce the roadmap for EOS5 during the Run-3 period.
-
10/03/2022, 10:05