Implementation of the vacuum model using HTCondor

13 Apr 2015, 18:00
15m
C210 (C210)

Oral presentation, Track 7: Clouds and virtualization, Track 7 Session

Speaker

Andrew David Lahiff (STFC - Rutherford Appleton Lab. (GB))

Description

The recently introduced vacuum model offers an alternative to the two traditional ways in which virtual organisations (VOs) run computing tasks at sites: submitting jobs through grid middleware, or creating virtual machines (VMs) through cloud APIs. In the vacuum model, VMs are created and contextualized by the site itself and start the appropriate pilot job framework, which fetches real jobs. This allows sites to avoid the effort required for running grid middleware or a cloud. Here we present an implementation of the vacuum model based entirely on HTCondor, making use of HTCondor's ability to manage VMs. Job hooks are used extensively: to prepare fast local storage for use in the VMs, to carry out contextualization, and to update job ClassAds with the status of the VMs and their payloads. VMs for each supported VO are created at regular intervals. If there is no work or there are fatal errors, no additional VMs are created. If, on the other hand, real work is running, further VMs can be created. Because the HTCondor negotiator decides whether or not to run the VMs, fairshares are naturally respected. Normal grid or locally submitted jobs can run on the same resources and share the same physical worker nodes, which also serve as hypervisors for running VMs.
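
The VM creation rule described above can be illustrated with a minimal Python sketch. It is not the implementation presented in this contribution and uses no HTCondor API. The names `VMOutcome` and `vms_to_create`, the parameters `probe` and `extra`, and the choice to look only at the most recent conclusive outcome are all assumptions made for illustration.

```python
from enum import Enum


class VMOutcome(Enum):
    """Hypothetical outcome categories for a VO's recent VMs."""
    STARTING = "starting"          # too early to tell what the VM will do
    RUNNING_WORK = "running_work"  # the pilot fetched a real job
    NO_WORK = "no_work"            # the pilot found nothing to run
    FATAL_ERROR = "fatal_error"    # the VM or its pilot failed


def vms_to_create(outcomes, probe=1, extra=4):
    """Return how many VMs to create for one VO in this cycle.

    `outcomes` lists the VO's recent VM outcomes, oldest first.
    `probe` VMs are always created at each regular interval; `extra`
    VMs are added only if the latest conclusive outcome was real work.
    Both values are illustrative assumptions, not taken from the source.
    """
    # Ignore VMs that have not yet reported anything conclusive.
    conclusive = [o for o in outcomes if o is not VMOutcome.STARTING]
    if conclusive and conclusive[-1] is VMOutcome.RUNNING_WORK:
        return probe + extra
    # No work, a fatal error, or no information yet: no additional VMs.
    return probe
```

For example, `vms_to_create([VMOutcome.NO_WORK, VMOutcome.RUNNING_WORK])` returns 5 with the default parameters, while `vms_to_create([VMOutcome.RUNNING_WORK, VMOutcome.FATAL_ERROR])` returns 1. In the setup described above, whether the requested VMs actually start would still be decided by the HTCondor negotiator.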

Primary author

Andrew David Lahiff (STFC - Rutherford Appleton Lab. (GB))
