AI How to: Additional software and AI Repositories

Europe/Zurich
513/1-024 (CERN)

Attendance:
IT-PES: Alex Lossent, Vitor Gomes, Nacho Barrentos, Manuel Guijarro, Nils Hoimyr
IT-CS: Veronique Lefebure
IT-DI: Denise Heagerty
IT-DSS: Jan Iven
IT-CIS: Pedro Ferreira, Marek Domaracky
IT-OIS: Jan Van Eldik, Tim Bell, Thomas Oulevey, Andreas Wagner
ATLAS: Vladimir Glafirov, Sergey Baranov, Alexey Buzykaev
CMS: Jorge Molina, Diego Da Silva Gomes
LHCb: Joel Closier, Loic Brarda
LCD: Andre Suiler, Stephane Poss


Questions:

Is it possible to lock a package version in a manifest?
=> Yes, for instance using the package resource in Puppet. But this must be done carefully: the package also needs to be excluded from the repo distro-sync process, so that Puppet and distro-sync don't try to install conflicting versions.
Best practices for this kind of scenario have yet to be fully investigated and documented. It might be possible to use yum only (and never specify a package version in Puppet).
Note that using ensure => 'latest' in Puppet should be banned eventually, cf. https://its.cern.ch/jira/browse/AI-1413
Some additional thoughts about locking a pkg version: https://its.cern.ch/jira/browse/AI-2274
 
(JC) how do we know the available snapshot dates?
=> basically we have a snapshot for each and every day.
(TO) note that we currently keep only 2 months of snapshots. This retention period will need to be adjusted as necessary.
 
(JC) what happens when a snapshot expires?
=> Not clearly known, but presumably yum will fail and nothing will change on the machine…
We can probably expect that snapshots will be kept forever (at least those in use), given that a snapshot uses few resources (hardlinks).
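
To illustrate why a hardlink-based snapshot is cheap, here is a minimal Python sketch. It does not describe the actual AI snapshot tooling: the function name, directory layout, repo name and RPM file name are all hypothetical. It only shows that a snapshot made of hardlinks shares the package data with the live repo instead of copying it.

```python
import os
import tempfile


def snapshot_repo(repo_dir, snapshot_dir):
    """Create a snapshot of repo_dir in snapshot_dir using hardlinks.

    Every file in the snapshot is a hardlink to the original file,
    so no package data is copied.
    """
    for root, _dirs, files in os.walk(repo_dir):
        rel = os.path.relpath(root, repo_dir)
        target_root = os.path.normpath(os.path.join(snapshot_dir, rel))
        os.makedirs(target_root, exist_ok=True)
        for name in files:
            os.link(os.path.join(root, name), os.path.join(target_root, name))


# Demo in a temporary directory (all paths and file names are hypothetical).
base = tempfile.mkdtemp()
repo = os.path.join(base, "slc6-updates")
os.makedirs(repo)
with open(os.path.join(repo, "example-1.0-1.x86_64.rpm"), "wb") as f:
    f.write(b"fake rpm payload")

snap = os.path.join(base, "snapshots", "20130415")
snapshot_repo(repo, snap)

# Both directory entries point at the same inode: the data exists only once.
original = os.stat(os.path.join(repo, "example-1.0-1.x86_64.rpm"))
linked = os.stat(os.path.join(snap, "example-1.0-1.x86_64.rpm"))
same_inode = original.st_ino == linked.st_ino
```

With this layout, an RPM removed from the live repo remains readable through any snapshot that still links to it, which is consistent with keeping the snapshots in use for a long time.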
 
(SB) we need to keep RPMs (and used snapshots) forever like in Quattor. E.g. we need to be able to install new machines with an old snapshot.
=> (TO) not a problem. 
 
(JI) can we disable the slc6 distro-sync upgrade process?
=> (NB) technically yes, but it is expected that all nodes use this system.
VG: Need a way to flag machines that should not be updated… 
AL: agreed. E.g. consider an EMI upgrade. A number of actions must be taken before and after the RPM upgrade, so the RPM upgrade must take place in a controlled way.
 
(VG) We need "dynamic"/private environments.
=> AL: will be available. Already used internally in AI.
JI: "dynamic" environments will live independently from the main branch, will have to be rebased regularly.
AL: yes. At the same time, dynamic environments are intended for developing new features and testing changes, and are then merged into the main branch, so they should not live long enough for this to become a problem.

Repository replication: must the source be a repo itself? What would the synchronization frequency with the source be? Do these replicated repos have daily snapshots as well?
=> (TO) both replicating a yum repo and creating the repo metadata for a list of RPMs are feasible. Sync could be done once an hour (typically).
(TO) snapshots may also be generated for extra repos.
 
(ATLAS) do we need to update RPMs in both Quattor/swrep and AI?
=> yes, while the 2 systems live concurrently. No mechanism is planned to sync repos between Quattor and AI/Puppet.
 
(VG) we really need an option to "disable extra packages" like in Quattor.
=> (AL) see notes from previous meeting. This is the cost to pay for not having to specify all dependencies for a package. Technically one could do something like this: retrieve the list of packages managed by Puppet from PuppetDB, then build the dependency graph and see if any extra packages are present. But there are no plans to provide this as a central service.
ADDENDUM 15-Apr-2013: We can tell Puppet to remove any RPM that is not specified, but then we get errors unless all the dependencies of the RPMs we want are explicitly specified in Puppet (like in Quattor). The noop parameter can be used to test what would happen... See also AI-598
resources { 'package': purge => true, noop => true }
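
The manual check mentioned in the answer (take the packages managed by Puppet, follow their dependencies, and see what else is installed) could look roughly like the Python sketch below. It is only an illustration: the function names and the sample data are hypothetical, and the sketch assumes the managed list (e.g. from PuppetDB), the installed list and the dependency map have already been retrieved by some other means.

```python
def managed_closure(managed, depends_on):
    """Return the managed packages plus everything they depend on, transitively."""
    seen = set()
    stack = list(managed)
    while stack:
        pkg = stack.pop()
        if pkg in seen:
            continue
        seen.add(pkg)
        stack.extend(depends_on.get(pkg, ()))
    return seen


def extra_packages(installed, managed, depends_on):
    """Installed packages that are neither managed nor a dependency of a managed one."""
    return sorted(set(installed) - managed_closure(managed, depends_on))


# Hypothetical data: in practice the managed list would come from PuppetDB,
# and the installed list and dependencies from the RPM database on the node.
managed = ["httpd"]
depends_on = {"httpd": ["apr", "apr-util"], "apr-util": ["apr"]}
installed = ["httpd", "apr", "apr-util", "telnet"]

extras = extra_packages(installed, managed, depends_on)
print(extras)  # ['telnet']
```

Unlike purging unspecified packages, this approach only reports the extra packages, so dependencies that were pulled in automatically are not flagged.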
 
(JC) I was recently spammed by Foreman about Puppet reports.
=> this was a mistake. These reports can give machine owners information about how successful recent Puppet runs were.
NB: it is expected that you can subscribe to those reports if they are useful for you. The information can also be found at all times on the Foreman website.

Next meeting: in order to move forward, ATLAS mentioned that they need to be able to define their own HW definitions (flavours), define partitions, and also need final "shared" (not private) tenants. The next VOC WG meeting will cover these topics.
Agenda:
    • 14:00-14:05 Introduction
    • 14:05-14:20 System update for AI
    • 14:20-14:40 Deploy additional software
    • 14:40-15:00 Demo and questions