Description
CERN IT has offered a Kubernetes service since 2016, expanding over time to incorporate multiple other technologies from the cloud native ecosystem. Currently, the service runs over 500 clusters and thousands of nodes, serving use cases from different sectors of the organization.
In 2021, the ATS sector expressed interest in a similar setup for its container orchestration effort. A collaboration was started with an initial proof of concept running the CERN IT service inside the control room datacenter, covering use cases from multiple teams in the sector. Following this successful initiative, which ran over a year, a second phase was launched to bring the service to production.
In this paper, we describe the existing CERN IT service and the major changes and improvements that were required to serve accelerator control use cases. We highlight the changes required to run in an isolated, air-gapped network environment, as well as the additional integrations with identity, storage and datacenter infrastructure. Finally, we detail results from an extensive evaluation of failure scenarios, carried out to ensure the service complies with the expected service levels, as well as plans for extending the existing infrastructure to new use cases.