CernVM Workshop 2022
Monday 12 September 2022 (09:00) to Tuesday 13 September 2022 (19:30)
Monday 12 September 2022
09:00
CernVM-FS Hacking Session
Jakob Karl Eberhardt (University of Applied Sciences (DE)), Laura Promberger (CERN), Jakob Blomer (CERN)
09:00 - 12:00
Room: Z009 "Eulerzaal"
On-site session for discussions and coding. The core team will be present and participants are welcome to join us at the venue.
12:00
Coffee and Sandwiches
12:00 - 14:00
Room: Z009 "Eulerzaal"
14:00
Workshop Introduction
Mary Hester, Dennis Van Dok, Jakob Blomer (CERN)
14:00 - 14:15
Room: Z009 "Eulerzaal"
14:15
CernVM-FS: Status and Plans
Jakob Blomer (CERN)
14:15 - 14:35
Room: Z009 "Eulerzaal"
14:35
CernVM-FS publisher on Kubernetes
Andrea Valenzuela Ramirez (CERN)
14:35 - 14:55
Room: Z009 "Eulerzaal"
14:55
CernVM 5: A fully containerized CernVM
Jakob Karl Eberhardt (University of Applied Sciences (DE))
14:55 - 15:15
Room: Z009 "Eulerzaal"
15:15
Coffee Break
15:15 - 15:45
Room: Z009 "Eulerzaal"
15:45
Exascale data processing with CVMFS
Matt Harvey (Jump Trading)
15:45 - 16:05
Room: Z009 "Eulerzaal"
**About the speaker:** Matt Harvey is a Linux production engineer at Jump Trading, working on high performance computing in a global, data-intensive environment. His background is in research computing services, in both the public sector, at Imperial College London, and the pharmaceutical industry with Acellera.
16:05
ALICE's software infrastructure
Timo Wilken (CERN)
16:05 - 16:25
Room: Z009 "Eulerzaal"
16:25
CVMFS usage and performance for LHCb
Ben Couturier (CERN)
16:25 - 16:45
Room: Z009 "Eulerzaal"
16:45
WLCG Stratum 1 Status and Plans
Dave Dykstra (Fermi National Accelerator Lab. (US))
16:45 - 17:05
Room: Z009 "Eulerzaal"
17:05
First experiences using CVMFS in Microsoft Azure
Hugo Meiland (Microsoft)
17:05 - 17:25
Room: Z009 "Eulerzaal"
**About the speaker:** Hugo Meiland is a Principal Program Manager at Microsoft, helping customers discover, test, and improve HPC technologies on Azure.
18:00
Amsterdam Walking Tour + Dinner
18:00 - 23:00
Tuesday 13 September 2022
09:30
Lazy Container Image Distribution With eStargz And P2P Image Sharing on IPFS
Kohei Tokunaga
09:30 - 09:50
Room: Z009 "Eulerzaal"
Pulling images is known to be one of the most time-consuming steps when starting containers. In this talk, Kohei will share the approach of speeding up container startup using the eStargz image format and recent work around eStargz. eStargz allows container runtimes to start containers without waiting for the pull to complete. It is supported by a variety of tools in the community, including Kubernetes, k3s, containerd, CRI-O, Podman, BuildKit, and Kaniko. The talk will also cover IPFS-based P2P image distribution with containerd and its combination with eStargz-based lazy pulling.

**About the speaker:** Kohei Tokunaga is a software engineer at NTT Corporation, a Japan-based telecommunication company. He is a reviewer of CNCF containerd, a maintainer of BuildKit, and a maintainer of the stargz-snapshotter subproject in containerd.
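The core mechanism behind lazy pulling is that an eStargz blob stays seekable, so a runtime can fetch just the byte ranges it needs from the registry instead of downloading whole layers up front. The snippet below is only an illustrative sketch of that partial-fetch idea in Python using an HTTP Range request; the blob URL is a placeholder, and real runtimes such as stargz-snapshotter handle this (plus authentication and verification) internally.

```python
import urllib.request

# Placeholder blob URL; a real registry blob would also need auth headers.
BLOB_URL = "https://registry.example.org/v2/myapp/blobs/sha256:0123abcd"

def fetch_range(url: str, start: int, end: int) -> bytes:
    """Fetch only bytes [start, end] of a remote blob via an HTTP Range request."""
    req = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
    with urllib.request.urlopen(req) as resp:
        # 206 Partial Content means the server honoured the range request.
        if resp.status not in (200, 206):
            raise RuntimeError(f"unexpected status {resp.status}")
        return resp.read()

# Fetch just the first 4 KiB (e.g. the chunk holding one file's data)
# instead of pulling the entire layer before the container can start.
chunk = fetch_range(BLOB_URL, 0, 4095)
print(f"fetched {len(chunk)} bytes lazily")
```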
09:50
Harbor Registry at CERN: Status and Enhancements
Ricardo Rocha (CERN)
09:50 - 10:10
Room: Z009 "Eulerzaal"
CERN offers a centralized OCI registry at registry.cern.ch, based on the Harbor project and available to the whole community. In addition to the standard container registry functionality, Harbor adds support for any kind of OCI artifact (Helm charts, ML models, etc.), as well as proxy caches to external registries, automated replication between multiple registry instances, vulnerability checks, image signing, and webhook integration, among many other features. This talk will summarize the status of our deployment and give more details on recently added functionality, including vulnerability checks using multiple CVE sources and tools, and support for sigstore for image signing and verification.
10:10
CSI Driver for CernVM-FS
Robert Vasek (CERN)
10:10 - 10:30
Room: Z009 "Eulerzaal"
10:30
CVMFS at GSI
Soren Lars Gerald Fleischer (GSI - Helmholtzzentrum für Schwerionenforschung GmbH (DE))
10:30 - 10:50
Room: Z009 "Eulerzaal"
10:50
Coffee Break
10:50 - 11:20
Room: Z009 "Eulerzaal"
11:20
Decarbonizing Scientific Computing
Andrew Grimshaw (Lancium Compute)
11:20 - 11:40
Room: Z009 "Eulerzaal"
Scientific computing uses a tremendous amount of energy and, given the location of most HPC centers, results in a similarly large amount of CO2 emissions. In the US, for example, every MWh of power generated in 2019 led on average to 0.7 metric tons of CO2 emissions. To address the huge carbon footprint of computing, Lancium Compute is building low-carbon, renewable-energy-driven data centers in the Great Plains of the United States. Because the wind does not always blow and the sun does not always shine, our data centers must be able to rapidly ramp our computing and electrical load up and down in order to balance the electrical grid. Not all applications are suitable for this sort of load management; however, many batch-based scientific jobs, e.g. High Throughput Computing jobs, are ideal.

In 2021 we began to work with the Open Science Grid to support their HTC load. As part of that integration effort we added support for CVMFS for all containerized jobs. Later work with the US CMS and US ATLAS teams led us to further deploy a hierarchical Squid architecture to support Frontier.

In this talk I will briefly present electrical grid basics and explain how the characteristics of renewables make them difficult to integrate into the grid. I will follow with a discussion of how controllable loads can solve these problems, and how computing can be an excellent controllable load. I will then describe our quality-of-service model and our multi-site system architecture with CVMFS to support high-throughput single-node jobs and low-degree parallel jobs at our clean compute campuses in Texas.
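For a sense of scale, the abstract's emission factor makes the arithmetic easy to reproduce. In the sketch below, only the 0.7 t CO2/MWh figure comes from the abstract; the assumed 2 MW cluster and full-year utilisation are illustrative numbers, not Lancium or OSG data.

```python
# Back-of-the-envelope CO2 estimate using the 0.7 t CO2 per MWh figure
# quoted in the abstract (2019 US average). Cluster size and utilisation
# below are illustrative assumptions.

EMISSION_FACTOR_T_PER_MWH = 0.7      # metric tons of CO2 per MWh (from the abstract)

cluster_power_mw = 2.0               # assumed average power draw of an HPC system, in MW
hours_per_year = 24 * 365            # assume the system runs flat out all year

energy_mwh = cluster_power_mw * hours_per_year        # MWh consumed per year
co2_tons = energy_mwh * EMISSION_FACTOR_T_PER_MWH     # tons of CO2 per year

print(f"{energy_mwh:,.0f} MWh/year -> {co2_tons:,.0f} t CO2/year")
# 17,520 MWh/year -> 12,264 t CO2/year for this assumed 2 MW system
```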
11:40
CMS deployments on CernVM-FS
Andrea Valenzuela Ramirez (CERN)
11:40 - 12:00
Room: Z009 "Eulerzaal"
12:00
ATLAS Installations on CVMFS
Oana Vickey Boeriu (University of Sheffield (UK))
12:00 - 12:20
Room: Z009 "Eulerzaal"
12:20
Lunch Break
12:20 - 14:00
Room: Z009 "Eulerzaal"
14:00
The European Environment for Scientific Software Installations (EESSI)
Kenneth Hoste
14:00 - 14:20
Room: Z009 "Eulerzaal"
The European Environment for Scientific Software Installations (EESSI, pronounced as "easy") is a collaboration between different HPC sites and industry partners, with the common goal to set up a shared repository of scientific software installations that can be used on a variety of systems, regardless of which flavor/version of Linux distribution or processor architecture is used, or whether it is a full-size HPC cluster, a cloud environment or a personal workstation, and without compromising on the performance of the software. The EESSI codebase (https://github.com/EESSI) is open source and heavily relies on various other open-source software projects, including Ansible, archspec, CernVM-FS, Cluster-in-the-Cloud, EasyBuild, Gentoo Prefix, Lmod, ReFrame, Singularity, and Terraform.

The concept of the EESSI project was inspired by the Compute Canada software stack and consists of three main layers:

- a filesystem layer leveraging CernVM-FS, to globally distribute the EESSI software stack;
- a compatibility layer using Gentoo Prefix, to ensure compatibility with different client operating systems (different Linux distributions, macOS, Windows Subsystem for Linux);
- a software layer, hosting optimized installations of scientific software along with required dependencies, built for different processor architectures and leveraging archspec, EasyBuild, and Lmod.

In this talk, we will introduce you to EESSI, outline the use cases it enables, present recent developments, and give an outlook on the promising future of EESSI.

**About the speaker:** Kenneth Hoste is a computer scientist and FOSS enthusiast from Belgium. He holds a Master's (2005) and a PhD (2010) in Computer Science from Ghent University; his dissertation topic was "Analysis, Estimation and Optimization of Computer System Performance Using Machine Learning". Since October 2010 he has been a member of the HPC team at Ghent University, where he is mainly responsible for user support and training. As part of his job, he is also the lead developer and release manager of EasyBuild, a software build and installation framework for (scientific) software on High Performance Computing (HPC) systems. Since 2020 he has been actively involved with the European Environment for Scientific Software Installations (EESSI) project, which aims to provide a central stack of scientific software installations that can be used across a wide range of systems, without compromising on performance. In his free time, he is a family guy and a fan of loud music, frequently attending gigs and festivals. He enjoys helping people and sharing his expertise, and likes joking around. He has a weak spot for stickers.
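The software layer described in the abstract serves builds optimized per CPU microarchitecture, and archspec (one of the projects listed there) is what detects which targets a host supports. A minimal sketch, assuming the archspec Python package is installed; the reported names depend on the machine it runs on.

```python
# Detect the host CPU microarchitecture with archspec, the same kind of
# detection the EESSI software layer relies on to select an optimized build.
# Assumes the archspec package is installed (pip install archspec).
import archspec.cpu

host = archspec.cpu.host()
print("detected microarchitecture:", host.name)                   # e.g. "haswell", "zen2"
print("compatible older targets:", [a.name for a in host.ancestors])
```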
14:20
Software infrastructure of XENONnT
Joran Angevaare (Nikhef - University of Amsterdam (the Netherlands))
14:20 - 14:40
Room: Z009 "Eulerzaal"
14:40
CVMFS for KM3NeT
Mieke Bouwhuis (Nikhef)
14:40 - 15:00
Room: Z009 "Eulerzaal"
15:00
CVMFS mix at CERN
Fabrizio Furano (CERN)
15:00 - 15:20
Room: Z009 "Eulerzaal"
The CVMFS deployment at CERN is in constant evolution, adjusting to follow users' needs. This talk briefly describes some of the main points of the CVMFS deployment at CERN and the challenges involved.
15:20
Coffee Break
15:20 - 16:00
Room: Z009 "Eulerzaal"
16:00
ATLAS Cloud R&D, Kubernetes and CVMFS
Fernando Harald Barreiro Megino (University of Texas at Arlington)
16:00 - 16:20
Room: Z009 "Eulerzaal"
Different groups, sites and experiments in the WLCG community have started using Kubernetes to manage services, implement novel analysis facilities, or run batch services. Despite running in a natively containerised environment, many of these use cases depend on CVMFS to stay compatible with the existing Grid model or to benefit from a well-established software distribution model. Several of these groups have therefore implemented their own Helm charts and images to install the CVMFS client on Kubernetes. In the case of the ATLAS Cloud R&D, we use Kubernetes-native batch capabilities to integrate multiple research and commercial clouds with the PanDA ecosystem. The clusters can be quickly scaled up and down, and the nodes are usually fully exploited. The CVMFS client needs to run like clockwork to avoid expensive job failures due to hanging clients. We spent a significant amount of time comparing and customising some of the existing CVMFS clients until we reached a satisfactory situation. This contribution will give an overview of some CVMFS clients, describe the most common issues, and propose where the expertise of the CVMFS team and an official Kubernetes client would make a difference for the WLCG community.
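One practical way to avoid the "hanging client" failures mentioned above is a liveness check on the mount from inside the pod, so jobs fail fast instead of stalling. The sketch below is a hypothetical example of such a probe, not the ATLAS Cloud R&D setup; the repository path and timeout are assumptions.

```python
# Hypothetical liveness check for a CVMFS mount inside a pod: if the FUSE
# mount hangs, a direct os.listdir() would block indefinitely, so run the
# check in a subprocess with a timeout and report failure instead.
import subprocess
import sys

REPO = "/cvmfs/atlas.cern.ch"   # assumed repository path; adjust per experiment
TIMEOUT_S = 15

def cvmfs_alive(path: str, timeout: int) -> bool:
    try:
        result = subprocess.run(["ls", path], capture_output=True, timeout=timeout)
        return result.returncode == 0
    except subprocess.TimeoutExpired:
        return False

if __name__ == "__main__":
    ok = cvmfs_alive(REPO, TIMEOUT_S)
    print("cvmfs mount", "healthy" if ok else "unresponsive")
    sys.exit(0 if ok else 1)
```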
16:20
A new UK StashCache at Edinburgh for DUNE
Wenlong Yuan (Edinburgh University)
16:20 - 16:40
Room: Z009 "Eulerzaal"
The Deep Underground Neutrino Experiment (DUNE) utilizes CVMFS and CVMFS StashCache to distribute both its software stack and reference files for distributed computing workflows. DUNE uses CVMFS because it provides a read-only POSIX interface to StashCache, along with redundancy features such as built-in GeoIP-based server selection, rate monitoring, and fallback on failure. Between December 2021 and January 2022, DUNE HTC grid jobs in the UK were found to be suffering from low CPU efficiency due to slow data access from a StashCache instance at Cardiff, which was intended only for LIGO jobs. A new UK StashCache instance at Edinburgh was deployed to solve this problem for DUNE as well as for the other OSG experiments in the UK (apart from LIGO). This talk takes the perspective of the DUNE use case and covers the motivation, the diagnosis of the inefficiency, the deployment process, and the current status and performance of the Edinburgh StashCache instance.
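A quick way to reproduce the kind of diagnosis described above is to compare response times against candidate cache endpoints from a worker node. The sketch below is purely illustrative; the endpoint URLs are placeholders, not the actual Cardiff or Edinburgh StashCache servers.

```python
# Illustrative latency comparison between two cache endpoints, the kind of
# quick check used to spot slow data access from a distant StashCache.
import time
import urllib.request

ENDPOINTS = {
    "cache-a": "https://cache-a.example.org/",
    "cache-b": "https://cache-b.example.org/",
}

for name, url in ENDPOINTS.items():
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            resp.read(1024)                      # read a small chunk only
        print(f"{name}: {(time.monotonic() - start) * 1000:.0f} ms")
    except Exception as exc:                     # report failure and continue
        print(f"{name}: failed ({exc})")
```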
16:40
CVMFS in Canadian Advanced Research Computing
Ryan Taylor (University of Victoria (CA))
16:40 - 17:00
Room: Z009 "Eulerzaal"
**About the speaker:** Pursuing his passion to figure out the inner workings of complex systems, Ryan Taylor completed an M.Sc. in particle physics with the ATLAS experiment at the Large Hadron Collider at CERN in 2009 and joined the University of Victoria Research Computing Services team shortly thereafter, where he is now a senior Advanced Research Computing (ARC) specialist. He has expertise in areas including grid computing and storage, cloud computing, content distribution technologies such as CVMFS, and container orchestration technologies in the Kubernetes ecosystem. As the leader of the Canadian CVMFS National Team, he guides the architecture and deployment strategy of the infrastructure which provides the Compute Canada ARC software stack and other content to sites and users across Canada and the global research community.
17:00
Workshop Closing
17:00 - 17:20
Room: Z009 "Eulerzaal"