25–29 May 2026
Chulalongkorn University
Asia/Bangkok timezone

Evolution of Kubernetes at CERN: Automated Management of Clusters and Add-ons

Not scheduled
18m
Chulalongkorn University

Poster Presentation
Track 7 - Computing infrastructure and sustainability
Poster

Speaker

Jack Charlie Munday

Description

The Kubernetes platform operated by CERN IT has supported scientific computing, online services and accelerator controls since 2016. It enables fully automated deployment and management of clusters that integrate natively with CERN storage systems (CVMFS, EOS, AFS, Ceph), authentication (SSO, Kerberos) and networking. Today the service spans more than 600 clusters across CERN’s two main datacenters and the air-gapped environments of the Technical Network (TN), serving campus services, experiment workflows and critical accelerator applications.

This contribution reviews the evolution of the service, lessons learned from long-term production usage, and its adaptation to ongoing changes in Kubernetes and its ecosystem. We highlight operational improvements driven by increasing user diversity, scaling requirements and the need to reduce technical debt. As more sites in WLCG and the wider HEP community transition to a similar stack, these lessons should be broadly applicable.

We then present the next-generation architecture, centred on Cluster API to unify and automate cluster provisioning and lifecycle management. In addition, a fully GitOps-based approach using Argo CD automates the management and upgrade of cluster add-ons, improving reproducibility, maintainability and service scalability across heterogeneous environments.
