19-23 May 2014
Europe/Paris timezone

Experiences with ATLAS and LHCb jobs in Vac virtual machines

23 May 2014, 11:10
Auditorium Marcel Vivargent (LAPP)



Grid, Cloud & Virtualisation Grids, clouds, virtualisation


Andrew McNab (University of Manchester (GB))


We present experiences with running ATLAS and LHCb production jobs in virtual machines at Manchester and other sites in the UK using Vac. Vac is a self-contained VM management system in which individual hypervisor hosts act as VM factories to provide VMs contextualized for experiments, and it offers an alternative to conventional CE/batch systems and cloud interfaces to resources. In the Vacuum model implemented by Vac, VMs appear spontaneously at sites, with contextualizations supplied by the sites from templates provided by the experiments. This system takes advantage of the pilot job frameworks for managing jobs and cvmfs for managing software distribution, which together make these contextualizations extremely simple in practice. Vac is implemented as a daemon, vacd, which runs on each hypervisor host. Each daemon uses a peer-to-peer UDP protocol to gather information from other Vac daemons at the site about which mix of experiment VMs is already running, and acts autonomously to decide which VMs to start according to a policy given in its configuration file. The UDP protocol is also used to avoid starting VMs for experiments which have no work available, by detecting when a VM has been started recently and has stopped immediately because the pilot framework client could find no work. Vac has been running LHCb production jobs since 2013; in 2014 a suitable ATLAS VM contextualization was developed and has since been used to run ATLAS production work as well. We present some preliminary comparisons of the efficiency of running LHCb and ATLAS jobs on batch worker nodes and in virtual machines using the same hardware.
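
The decision each daemon makes can be illustrated with a minimal Python sketch. This is not Vac's actual code: the record layout, the function names, the share-based policy and the two time thresholds are all assumptions made for illustration, and the UDP exchange itself is omitted (the records are assumed to have been gathered from peer daemons already). The sketch picks the experiment furthest below its target share, and skips any experiment for which a VM recently started and stopped almost immediately.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class VMRecord:
    """One VM as reported by a daemon at the site (hypothetical layout)."""
    vmtype: str                # experiment the VM was contextualized for
    started: float             # start time, in seconds
    stopped: Optional[float]   # stop time, or None if still running


def running_counts(records: List[VMRecord]) -> Dict[str, int]:
    """Count the VMs still running, per experiment."""
    counts: Dict[str, int] = {}
    for r in records:
        if r.stopped is None:
            counts[r.vmtype] = counts.get(r.vmtype, 0) + 1
    return counts


def no_work_types(records: List[VMRecord], now: float,
                  short_run_seconds: float = 600.0,
                  backoff_seconds: float = 3600.0) -> Set[str]:
    """Experiments for which a VM recently stopped almost immediately.

    Both thresholds are assumed values, not Vac defaults.
    """
    skip: Set[str] = set()
    for r in records:
        if r.stopped is None:
            continue
        short_lived = (r.stopped - r.started) < short_run_seconds
        recent = (now - r.stopped) < backoff_seconds
        if short_lived and recent:
            skip.add(r.vmtype)
    return skip


def choose_vmtype(target_shares: Dict[str, float],
                  records: List[VMRecord], now: float) -> Optional[str]:
    """Pick the experiment furthest below its target share, or None."""
    counts = running_counts(records)
    skip = no_work_types(records, now)
    candidates = [t for t, share in target_shares.items()
                  if share > 0 and t not in skip]
    if not candidates:
        return None
    # The lowest ratio of running VMs to target share is the most under-served.
    return min(candidates, key=lambda t: counts.get(t, 0) / target_shares[t])
```

With equal shares for two experiments, two running LHCb VMs and one running ATLAS VM, `choose_vmtype` returns the ATLAS type; if an ATLAS VM has just stopped within seconds of starting, ATLAS is skipped and the LHCb type is chosen instead.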

Primary author

Andrew McNab (University of Manchester (GB))
