9–13 Jul 2018
Sofia, Bulgaria
Europe/Sofia timezone

Three weeks long hackathon - LHCb's Puppet 3.5 to Puppet 4.9 migration.

10 Jul 2018, 16:00
1h
Sofia, Bulgaria

National Culture Palace, Boulevard "Bulgaria", 1463 NDK, Sofia, Bulgaria
Poster Track 8 – Networks and facilities Posters

Speaker

Hristo Umaru Mohamed (CERN)

Description

Up until September 2017, LHCb Online was running on a non-redundant Puppet 3.5 Master/Server architecture. As a result, we had problems with outages, both planned and unplanned, as well as scalability issues (How do you run 3000 nodes at the same time? How do you even run 100 without bringing down the Puppet Master?). On top of that, Puppet 5.0 had been released, so we were now running two versions behind!
As Puppet 4.9 was the de facto standard, something had to be done right away, so a quick, self-inflicted, three-week-long nonstop hackathon had to happen. This talk will cover the pitfalls, mistakes and architecture decisions we made when migrating our entire Puppet codebase, rewritten nearly from scratch, to a more modular one that addresses existing exceptions and anticipates future ones, all while our entire infrastructure was running in physics production and without causing a single outage. We will cover the mistakes we made in our Puppet 3 installation and how we fixed them in order to lower catalogue compile time and reduce our overall codebase by around 50%.
We will also cover how we set up a quickly scalable Puppet core infrastructure (Masters, CAs, Foreman, etc.).

Primary author

Co-authors

Tommaso Colombo (CERN)
Loic Brarda (CERN)
Rainer Schwemmer (CERN)
Niko Neufeld (CERN)
Francesco Sborzacchi (INFN e Laboratori Nazionali di Frascati (IT))