Fermilab has made the strategic decision to deploy OKD, the open source version of Red Hat OpenShift, for Kubernetes container management. We will discuss our experience so far with OKD and describe some of the challenges we faced deploying a variety of applications.
This talk explores how the kubelet, with Calico as the CNI plugin, depends on the performance of the Kubernetes API server to start pods quickly.
At the moment, Kubernetes only supports horizontal pod autoscaling based on predefined pod metrics (CPU and memory usage). Therefore, to achieve a genuinely green, elastic cloud model that optimizes resource usage, a key step is to integrate this tool with autoscaling solutions based on custom metrics, which requires third-party components.
In this work we show the...
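As a rough illustration of what autoscaling on a custom metric involves, the sketch below applies the proportional scaling rule documented for the Kubernetes Horizontal Pod Autoscaler (desired replicas = ceil(current replicas × current value / target value)) to an arbitrary per-pod metric. It is not the solution presented in this contribution: the function name, the example metric (queued requests per pod), and the default replica bounds are assumptions chosen for illustration, and the 10% tolerance mirrors the documented default.

```python
import math


def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Return the replica count suggested by the proportional scaling rule.

    current_value and target_value are per-pod averages of some custom
    metric (hypothetical example: queued requests per pod).
    """
    if target_value <= 0:
        raise ValueError("target_value must be positive")

    ratio = current_value / target_value

    # Skip scaling when the metric is close enough to the target,
    # which avoids oscillating around the set point.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas

    desired = math.ceil(current_replicas * ratio)

    # Clamp to the configured bounds (assumed values, not from the source).
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 4 pods, each seeing 150 queued requests against a target of 100.
    print(desired_replicas(4, 150, 100))
```

Supplying the metric value itself is the part that typically needs extra components, since something has to collect the metric and expose it to the autoscaler.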
The PanDA team has evaluated native Kubernetes job submission as a way to process ATLAS workloads and to allow immediate integration of major cloud computing providers. This model also offers a novel way to set up lightweight compute sites without the need to set up a Grid stack.
During the last year we have been running several queues at clusters...
Preparing a systems experiment environment requires setting up and baselining the infrastructure, installing dependencies and tools, running experiments, and plotting results, which, if done manually, is cumbersome and error-prone. The same scenario applies to researchers starting to experiment with Ceph or SkyhookDM, which is an extension for Ceph to run queries on...
GPUs are scarce resources in many of our centers, including CERN.
This talk will briefly describe a multi-cloud deployment with the goal of evaluating the performance of different workloads on all GPUs offered by GCP, Azure and AWS.
It will include some details about setting up clusters and GPUs in each of these clouds, and some preliminary results.
Starting in October 2020, the PATh project is making a concerted effort to transition the centrally-run OSG services (such as websites, software repositories, information services) from ad-hoc deployment models to Kubernetes.
To do so, we needed a Kubernetes "home" and an operational model! In this talk, we'll give an overview of the work going on in the Tiger cluster at Morgridge, our current...
The CMS experiment relies heavily on the CMSWEB cluster to host critical services for its operational needs. The cluster is deployed on virtual machines (VMs) from the CERN OpenStack cloud and is manually maintained by operators and developers. The release cycle is composed of several steps, from building RPMs to their deployment, validation, and integration testing. To enhance the sustainability...
In this contribution we share our experience designing an Analysis Facility for columnar analysis with the COFFEA analysis package at the University of Nebraska-Lincoln, and describe our experience deploying different workloads and services on the UNL Kubernetes cluster (JupyterHub with Traefik integration, HTCondor, ServiceX and other infrastructure deployments).
In this presentation we'll discuss our experiences deploying a test REANA instance on a k8s cluster at BNL.
We will review progress over the past year with SLATE, including new containerized apps, a storage provisioner, and security policies for federated operations.
I will describe Kubernetes cluster deployment at UVic, including batch computing and APEL accounting for ATLAS.
OSG lessons learned from distributing service container images, and experiences contributing to and deploying services with SLATE.