Speaker
Andrew David Lahiff
(STFC - Science & Technology Facilities Council (GB))
Description
Even with the growing interest in cloud computing, grid-based submission to traditional batch systems is still the primary way for the experiments to run jobs at WLCG sites. Integrating a batch system with virtualised worker nodes on a cloud potentially offers sites many benefits. At RAL we have recently investigated making opportunistic use of a private StratusLab cloud when it has unused resources and there are idle jobs in the batch system. This is greatly simplified by our migration of the batch system to HTCondor, which is currently in progress. Here we describe the work that has been done so far, present preliminary results, and discuss some of the issues raised by the testing, including virtualisation overheads, fairshares, virtual machine lifetimes, and monitoring requirements for dynamic environments.
Author
Andrew David Lahiff
(STFC - Science & Technology Facilities Council (GB))
Co-authors
Ian Collier
(UK Tier1 Centre)
Orlin Alexandrov
(STFC - Science & Technology Facilities Council (GB))