XRootD and FTS Workshop @ JSI

Jozef Stefan Institute

Jamova cesta 39, 1000 Ljubljana, Slovenia
Andrej Filipcic (Jozef Stefan Institute (SI)), Andrew Bohdan Hanushevsky (SLAC National Accelerator Laboratory (US)), Jan Jona Javorsek (Jozef Stefan Institute, Slovenia), Luca Mascetti (CERN), Matevz Tadel (Univ. of California San Diego (US)), Mihai Patrascoiu (CERN)
Description

The XRootD and FTS workshop brings together the XRootD and FTS developers and people from Academia, Research, and Industry to discuss current and future data access activities as related to the XRootD framework and to the FTS project. 

 

Presentations focus on achievements, shortcomings, requirements and future plans for both XRootD and FTS projects.

 

This year we decided to bundle together these two partially overlapping communities. During the workshop there will be additional sessions focusing on a review of backend filesystems, on the experiments' plans for LHC Run3 data analysis and remote data access, and on support for efficient remote data access for non-HEP VOs.

 

Registration will open shortly, three months before the event date. The cost for the workshop is 80 EUR for the five days and includes lunches and coffee breaks. It will also be possible to join the dinner reception for 40 EUR (alcoholic beverages not included).

 

Programme:

  • FTS project status
    • Ongoing activities
    • Site reports
  • FTS Communities and Collaborations
  • FTS Monitoring
  • FTS Future directions and planning
    • Community discussion session
  • XRootD developer session
    • Status & new developments
    • Detailed development information for Release 5
  • XRootD & XCache user session:
    • Users share their experience using XRootD and XCache
  • Overview of filesystem backends and their status: EOS, XRootD, dCache
  • ALICE, ATLAS & CMS - plans for data-analysis and remote data access for LHC Run3
  • Providing storage for non-HEP VOs: OSG StashCache and similar EU-based projects

 

About the venue:

Ljubljana, the capital of Slovenia, is a central European city with all the facilities of a modern capital, yet it has preserved its small-town friendliness and relaxed atmosphere. Ljubljana is a city with numerous green areas, which offer excellent opportunities for sports and recreation. The city, with almost 280,000 inhabitants, caters to everyone's needs: although it is one of the smallest European capitals, it strives to provide all the facilities of a metropolis.

Ljubljana is set midway between Vienna and Venice on the crossroads of main European routes, so it is an ideal starting point for visits to many central European cities and countries. Both skiing resorts, attractive in winter, and the Adriatic coast, perfect for summer trips, are only a short distance from Ljubljana.

Travel to Ljubljana: https://www.visitljubljana.com/en/visitors/travel-information/

Contact number: secretariat, +386 1 477 3742

    • Registration
    • Lunch
    • FTS - State of Affairs
      • 1
        Welcome

        Welcome talk: workshop logistics, Ljubljana survival guide and an introductory word from the head of the institute

        Speaker: Jan Jona Javorsek (Jozef Stefan Institute, Slovenia)
      • 2
        FTS3: State of affairs

        A review of the last year, showcasing the evolution of the FTS project, as well as touching on what's new in the FTS world, community engagement, and the future direction.

        Speaker: Mihai Patrascoiu (CERN)
      • 3
        Tape, REST API and more

        This talk will present recent QoS improvements, go into the details of the Tape REST API and how it is implemented in FTS & Gfal2, showcase Gfal2 tape interaction over HTTP and, finally, look at what's upcoming in the tape world, such as Archive Metadata and Tape REST API evolution.

        Speaker: Joao Pedro Lopes
    • Coffee break
    • FTS - State of Affairs
      • 4
        FTS & Tokens

        This talk will describe the future strategy for tokens in FTS, as well as the implementation milestones to fully integrate tokens into the FTS landscape.

        Speaker: Shubhangi Misra
    • FTS - Site Reports
      • 5
        FTS3 @ CERN

        The FTS3 @ CERN site report, presenting the number of instances, the volume of data served each year, the database setup, and various operational tips and tricks discovered throughout the years.

        Speaker: Steven Murray (CERN)
      • 6
        FTS3: BNL Deployment

        An overview of the FTS3 deployment at BNL

        Speaker: Hironori Ito (Brookhaven National Laboratory (US))
      • 7
        Update on FTS at RAL

        The File Transfer Service (FTS3) is a data movement service developed at CERN, designed to move the majority of the LHC’s data across the WLCG infrastructure. Currently, the Rutherford Appleton Laboratory (RAL) Tier 1 runs two production instances of FTS, serving WLCG users (lcgfts3) and the EGI community (fts3egi). In this talk, we will present the status of these production instances at the RAL Tier 1 site, as well as the changes and developments planned for FTS at RAL over the next year.
        The first planned change relates to RAL’s involvement with the Square Kilometre Array (SKA) experiment and the UK SKA Regional Centre (UKSRC). Here we are engaged in helping to define their networking and data transfer requirements, starting with the deployment of an SKA FTS instance so that testing against those requirements can begin. The second change is the planned integration of token authentication/authorization methods, which aims to improve accessibility of the service for both existing and new user communities. Testing is currently underway on integrating our EGI instance with EGI Check-in, and we intend for the SKA instance to integrate with INDIGO IAM once it is deployed.

        Speaker: Rose Cooper
      • 8
        FTS3 at FNAL (virtual)
        • Outline
        • Introduction
        • Configurations
          • CMS configuration – physical server
          • Public configuration – containers
        • Differences
        • Advantages & disadvantages of each configuration
        • Summary
        Speaker: Lorena Lobato Pardavila (Fermi National Accelerator Lab. (US))
    • FTS - Communities and Collaborations
      • 9
        FTS Community Talk: ATLAS (virtual)

        The ATLAS view on data management and FTS involvement

        Speaker: Mario Lassnig (CERN)
      • 10
        FTS Community Talk: CMS

        This presentation will describe the usage of FTS by the CMS experiment at the Large Hadron Collider during the start of Run-3. I will describe the particular features recently developed for, and employed by, CMS for our unique use case, as well as current challenges and efforts to optimise performance on the boundary between FTS and Rucio. I will also discuss the future transfer requirements of CMS.

        Speaker: Katy Ellis (Science and Technology Facilities Council STFC (GB))
      • 11
        FTS Community Talk: LHCb (virtual)
        Speaker: Ben Couturier (CERN)
    • Coffee break
    • FTS - Communities and Collaborations
      • 12
        EGI Data Transfer Activities (virtual)

        The talk will focus on the activities in EGI related to data transfer and orchestration, in particular the integration with the EGI Check-in AAI in the context of the EGI-ACE project and the new EOSC Data Transfer service in the EOSC Future project. An overview of the new EGI-led project interTwin will also be given, along with the role FTS plays in its infrastructure supporting scientific digital twins.

        Speaker: Andrea Manzi
      • 13
        The journey of a file in Rucio

        This talk focuses on the Rucio data management framework and its interaction with FTS.

        Speaker: Radu Carpa (CERN)
      • 14
        FTS-Alto project (virtual)

        An overview of the FTS-Alto project in collaboration with Dr. Richard Yang and his research group (Yale University)

        Speaker: Y. Richard Yang
    • Lunch
    • Coffee break
    • FTS - Monitoring
      • 15
        FTS3: The Monitoring Zoo

        The word "monitoring" is used everywhere in the FTS world. This talk wants to dive into the different types of monitoring present in the FTS world and explain what each of them means.

        Speaker: Joao Pedro Lopes
      • 16
        FTS3@CERN: Service Health Monitoring

        This talk will give an overview of the health and alarm metrics used for the FTS3@CERN deployment. The full lifecycle will be presented, from the software changes and scripts needed, to log extraction via FluentBit, and ultimately to the Grafana display.

        Speaker: Mihai Patrascoiu (CERN)
      • 17
        FTS-Noted Project (virtual)

        An overview of the FTS-Noted project, aimed at shaping traffic through dynamic network switches.

        Speakers: Edoardo Martelli (CERN), Maria Del Carmen Misa Moreira (CERN)
    • FTS - Open Discussion
    • Coffee break
    • The Great Buffer
    • Meet the developers (FTS, XRootD, ROOT, EOS, OSG) and discuss future needs
    • Registration: XRootD Workshop Registration
    • Lunch
    • XRootd presentations
      • 18
        Welcome

        Welcome and logistics

        Speaker: Jan Jona Javorsek (Jozef Stefan Institute, Slovenia)
      • 19
        XRootD Features

        We will review the new XRootD features added since the last workshop.

        Speaker: Andrew Bohdan Hanushevsky (SLAC National Accelerator Laboratory (US))
      • 20
        What's up with the XRootD client
        Speaker: Michal Kamil Simon (CERN)
    • Coffee break
    • XRootd presentations
      • 21
        XRootD Release Schedule and Future Plans
        • Current release procedure/automation
        • Discussion on development workflow
        • Plans for 5.6 and 6.0 releases later this year
        • Python bindings (drop Python2 for good, packaging work)
        Speaker: Guilherme Amadio (CERN)
      • 22
        Evolution of XRootD Testing and CI Infrastructure
        • Recent CI developments (+Alpine, +Alma, -Ubuntu 18)
        • Supported platforms and compilers
        • Full (or almost full) migration from GitLab CI to GitHub Actions
        • Test coverage and static analysis
        • Plans for improving the docker-based tests, running them in CI
        Speaker: Guilherme Amadio (CERN)
      • 23
        OU XRootD Site Report
        Speaker: Horst Severini (University of Oklahoma (US))
      • 24
        XRootD usage at GSI
        Speaker: Soren Lars Gerald Fleischer (GSI - Helmholtzzentrum fur Schwerionenforschung GmbH (DE))
      • 25
        Analysis of data usage at BNL
        Speaker: Hironori Ito (Brookhaven National Laboratory (US))
      • 26
        XRootD in the UK: ECHO at RAL-LCG2 and developments at Tier-2 sites

        ECHO is the Ceph-backed erasure-coded object store deployed at the Tier-1 facility RAL-LCG2. Its frontend access to data is provided via XRootD, using the XrdCeph plugin via the libradosstriper library of Ceph, with a current usable capacity in excess of 40 PB.
        This talk will cover the work and experiences of optimising for, and operating in, Run-3 of the LHC, and the developments towards future Data Challenges and needs of HL-LHC running.
        In addition, a summary of the XRootD activities in the UK is presented, including the ongoing migrations of a number of Tier-2 sites from DPM to a CephFS+XRootD storage solution.

        Speaker: James William Walder (Science and Technology Facilities Council STFC (GB))
    • Workshop Dinner: Gostilna AS [Čopova ulica 5a, Ljubljana, 1000, Slovenia]
    • XRootd presentations
      • 27
        Open Science Data Federation - OSDF

        All research fields require tools to be successful, and a crucial tool today is the computer. The Open Science Grid (OSG) provides ways to access computational power from different sites. The Open Science Data Federation (OSDF) provides data access to the OSG pool using several software stacks. OSDF has received upgrades related to storage space, monitoring checks, monitoring stream collection, and new caches. New monitoring systems provide a way to detect a problem before the user does; a new cache can provide more data to users, new origins make more storage available, and new monitoring streams enable a more sophisticated debugging model. All these improvements create a new way to provide data to OSG and others. The OSDF is receiving many investments and will create more ways to provide scientific data.

        40-minute presentation

        Speaker: Fabio Andrijauskas (Univ. of California San Diego (US))
      • 28
        To the OSPool and Beyond: The guts of the OSDF client

        The Open Science Data Federation (OSDF) delivers petabytes of data each month to workflows running on the OSPool. To do so, one requires a reliable set of client tools. This presentation will take a look "under the hood" of the current OSDF client tooling, covering:

        • Discovery of nearby cache instances.
        • Acquisition of credentials for transfer, automated or otherwise.
        • Experiences maintaining the client in Go.
        • Integration with the HTCondor Software Suite.
        • Monitoring and telemetry of performance.

        Finally, we'll cover how we plan to make the client more usable, especially in applications beyond the OSPool, over the coming year.

        Speaker: Brian Bockelman (Morgridge Institute for Research)
    • Coffee break
    • XRootd presentations
      • 29
        XCache Developments & Plans
        Speaker: Matevz Tadel (Univ. of California San Diego (US))
      • 30
        Experience with XCache in Virtual Placement

        Virtual Placement is a way to approximate a CDN-like network for the ATLAS experiment. XCache is an important component of the Virtual Placement mechanism and is expected to substantially improve performance and reliability while simultaneously decreasing the bandwidth needed. I will explain how we configure, deploy, and use it, and share our experience from more than a year of running it.

        Speaker: Ilija Vukotic (University of Chicago (US))
      • 31
        Experience deploying xCache for CMS in Spain

        Over the last few years, the PIC Tier-1 and CIEMAT Tier-2 sites in Spain have been exploring XCache as a content delivery network service for CMS data in the region. This service aligns with the WLCG data management strategy towards HL-LHC. The caching mechanism allows data to be located closer to compute nodes, which has the potential to improve CPU efficiency for jobs, especially for frequently accessed data. Additionally, since many CMS jobs read data from remote sites using XRootD redirectors, there is significant room for improvement using this technology. We have successfully deployed XCache services at both the PIC and CIEMAT sites and configured them to cache popular CMS data based on ad-hoc data access popularity studies. A previous verification process revealed no significant degradation in CPU efficiency for non-I/O-intensive tasks reading data from either site in the region, despite the two sites being 600 km apart with 9 ms latency. Hence, a single-cache scenario for the region has been studied, with the cache placed at the PIC Tier-1 and serving data to both sites. This presentation aims to highlight our deployment experience and the benefits we have seen from using XCache in the region, as well as potential future use cases in the context of our data management strategy.

        Speaker: Carlos Perez Dengra (PIC-CIEMAT)
      • 32
        Getting the most out of XCache
        Speaker: Ilija Vukotic (University of Chicago (US))
    • Lunch
    • XRootd presentations
      • 33
        Data-Aware Scheduling for Opportunistic Resources (with XRootD and HTCondor)

        In this talk, I will present our ideas for a data-aware scheduling mechanism for the opportunistic resources attached to GridKa, the T1 center in Germany.
        Opportunistic resources are non-permanent computing sites (some with cache storage) distributed across Germany that provide resources for the HEP community from time to time.
        We are planning to implement a hash-based distribution of datasets to the different resources, inspired by Ceph/CRUSH, in combination with HTCondor scheduling and XRootD caching (a minimal illustration of such hash-based placement follows this entry).
        This will enable us to schedule jobs to the appropriate site without the need for a separate data management system.

        Speaker: Robin Hofsaess (KIT - Karlsruhe Institute of Technology (DE))
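        The abstract above does not spell out the placement algorithm, so the following is only a minimal, hypothetical sketch of hash-based dataset placement in the spirit of Ceph/CRUSH, using weighted rendezvous hashing; the site names, weights, and dataset paths are invented for illustration and are not the GridKa setup.

        ```python
        # Hedged sketch: weighted rendezvous (HRW) hashing as one way to map
        # datasets to sites deterministically. Sites and weights are illustrative.
        import hashlib
        import math

        SITES = {"site-a": 1.0, "site-b": 2.0, "site-c": 1.0}  # hypothetical sites with capacity weights

        def _uniform(dataset: str, site: str) -> float:
            """Deterministic pseudo-random value in (0, 1) for a (dataset, site) pair."""
            digest = hashlib.sha256(f"{dataset}/{site}".encode()).digest()
            return (int.from_bytes(digest[:8], "big") + 1) / (2**64 + 1)

        def place(dataset: str, replicas: int = 1) -> list[str]:
            """Rank sites by weighted HRW score; adding or removing a site only
            remaps the datasets that hashed to it, similar in spirit to CRUSH."""
            score = lambda s: -SITES[s] / math.log(_uniform(dataset, s))
            return sorted(SITES, key=score, reverse=True)[:replicas]

        if __name__ == "__main__":
            for ds in ("/hypothetical/dataset-001", "/hypothetical/dataset-042"):
                print(ds, "->", place(ds, replicas=2))
        ```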
      • 34
        Kingfisher: Storage Management for Data Federations
        Speaker: Brian Paul Bockelman (University of Wisconsin Madison (US))
      • 35
        XRootD pgRead & pgWrite
        Speaker: Andrew Bohdan Hanushevsky (SLAC National Accelerator Laboratory (US))
      • 36
        XrdEc: the whole story
        Speaker: Michal Kamil Simon (CERN)
    • Coffee break
    • XRootd presentations
      • 37
        XRootD Plugins
        Speaker: Andrew Bohdan Hanushevsky (SLAC National Accelerator Laboratory (US))
      • 38
        Porting of XRootD to Windows as a part of EOS-wnc

        XRootD provides fast, low-latency, and scalable data access. It also provides a hierarchical, filesystem-like namespace organized as a directory tree. As part of CERN EOS, XRootD provides an additional fast connection for data transfer between the client and the EOS FST.

        This is a presentation of Comtrade's work on CERN's project for the productization of EOS, specifically the porting of XRootD to Windows as part of porting the EOS client from Linux to Windows. All functionality of the EOS client ported to Windows should ultimately be the same as on Linux. XRootD is part of the EOS client implementation on Linux, and the first approach is to port XRootD in order to provide the EOS implementation on Windows. To make the best use of the advantages and possibilities of Windows, the port is designed to support the functionality of XRootD natively rather than to transfer the original code from Linux to Windows verbatim.

        The XRootD implementation on Linux was technically investigated as a group of components in order to port the EOS client functionality from Linux to Windows adequately. For each of these components, the list of external libraries is presented, covering the majority of the Linux libraries used in XRootD for which Windows alternatives exist. If the porting of XRootD to Windows is limited to essential functionality, the most important piece is the port of the xrdcp binary to Windows. Except for networking and security, appropriate Windows libraries are available for all other functionality.

        Given the missing Windows libraries for networking and security, these functions should either be implemented on Windows as part of xrdcp, or a Windows version of the corresponding libraries should be provided. Within a collaboration between CERN openlab and Comtrade, Comtrade invested in and provided a port of XRootD and the xrdcp binary without encrypted connections (security). Based on Comtrade's estimation, the investment needed to port the missing XRootD libraries to Windows is beyond the scope of Comtrade's internal investments in XRootD; to complete the implementation, appropriate outside investment is needed. The final result will be a complete port of XRootD to Windows. Porting XRootD to the Windows platform would bring additional possibilities for using Windows in particle physics experiments.

        Speaker: Gregor Molan (Comtrade 360's AI Lab)
      • 39
        A Brief History of the dCache Xroot Implementation (virtual)
        Speaker: Albert Rossi (Fermi National Accelerator Laboratory)
    • XRootd presentations
      • 40
        RNTuple: ROOT's Event Data I/O for HL-LHC

        This talk provides an introduction to RNTuple, ROOT's designated TTree successor. RNTuple is active R&D, available in the ROOT::Experimental namespace. Benchmarks using common analysis tasks and experiment AODs suggest 3x - 5x better single-core performance and 10% - 20% smaller files compared to TTree. The talk will specifically focus on RNTuple's I/O scheduling and optimization opportunities for remote reading with XRootD.

        Speaker: Jakob Blomer (CERN)
      • 41
        LHCOPN/LHCONE Status and Updates

        In this talk we’ll give an update on the LHCOPN/LHCONE networks: current activities, challenges, and recent developments. We will also focus on the various R&D projects that are currently ongoing and could impact XRootD and FTS. Finally, we will cover our plans for mini-challenges and major milestones in anticipation of DC24.

        Speakers: Edoardo Martelli (CERN), Marian Babik (CERN)
      • 42
        Kubernetes and XrootD

        Bioscience, material sciences, physics, and other research fields require several tools to achieve new results, discoveries, and innovations. All of these research fields require computational power. The Open Science Grid (OSG) provides ways to access computational power from different sites for several research fields. Besides processing power, it is essential to access data for all simulations, calculations, and other kinds of processing. To provide data access to all jobs on the OSG, the Open Science Data Federation (OSDF) offers ways to create the required data access. The primary way to provide data on the OSDF is XRootD running on a Kubernetes infrastructure on the National Research Platform. This work aims to show whether there is any overhead in using XRootD in a Kubernetes environment. To test this, we set up an XRootD origin on bare metal and an XRootD origin using Kubernetes on the same host and requested files of 500 MB, 1 GB, and 10 GB (a minimal sketch of such a transfer-rate comparison follows this entry). The results show a 2% higher transfer rate using the bare-metal origin than the Kubernetes XRootD origin. In conclusion, there is no statistically significant difference between XRootD running on Kubernetes and on bare metal.

        10-minute presentation

        Speaker: Fabio Andrijauskas (Univ. of California San Diego (US))
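        As a rough illustration of the comparison described above (not the actual measurement setup), the sketch below times xrdcp transfers from two XRootD origins and reports the achieved rates; the hostnames and file paths are placeholders.

        ```python
        # Hedged sketch: compare xrdcp transfer rates from two origins.
        # Endpoints and test files below are assumptions, not the study's setup.
        import subprocess
        import time

        ORIGINS = {
            "bare-metal": "root://origin-baremetal.example.org:1094",
            "kubernetes": "root://origin-k8s.example.org:1094",
        }
        # Test file path -> nominal size in bytes (500 MB, 1 GB, 10 GB)
        TEST_FILES = {
            "/store/test/500MB.bin": 500e6,
            "/store/test/1GB.bin": 1e9,
            "/store/test/10GB.bin": 10e9,
        }

        def measure(origin: str, path: str, size_bytes: float) -> float:
            """Copy one file to /dev/null with xrdcp and return the rate in MB/s."""
            start = time.monotonic()
            subprocess.run(["xrdcp", "--force", f"{origin}/{path}", "/dev/null"], check=True)
            return size_bytes / 1e6 / (time.monotonic() - start)

        if __name__ == "__main__":
            for label, origin in ORIGINS.items():
                rates = [measure(origin, path, size) for path, size in TEST_FILES.items()]
                print(f"{label}: mean transfer rate {sum(rates) / len(rates):.1f} MB/s")
        ```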
    • Coffee break
    • XRootd presentations
    • Workshop wrap-up
      • 46
        Don't be a Stranger, Please! :-)
        Speaker: Michal Kamil Simon (CERN)
      • 47
        Many Thanks & Future Outlook
        Speaker: Andrew Bohdan Hanushevsky (SLAC National Accelerator Laboratory (US))
    • 12:30
      Lunch