ARC Control Tower: A flexible generic distributed job management framework

14 Apr 2015, 14:30
15m
B250 (B250)

oral presentation

Track 4: Middleware, software development and tools, experiment frameworks, tools for distributed computing

Track 4 Session

Speaker

Jon Kerr Nilsen (University of Oslo (NO))

Description

While current grid middleware is quite advanced in connecting jobs to resources, its client tools are generally minimal, and features for managing large sets of jobs are left to the user to implement. The ARC Control Tower (aCT) is a flexible job management framework that can run on anything from a single user’s laptop to a multi-server distributed setup. aCT was originally designed to enable ATLAS jobs to be submitted to the ARC CE. With the recent redesign of aCT, in which the ATLAS-specific elements are clearly separated from the ARC job management parts, the control tower can now be reused as a generic distributed job manager for other communities. This paper explains in detail how aCT works as a job management framework, goes through the steps needed to create a simple job manager using aCT, and shows that it can easily manage thousands of jobs.
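
To make concrete the kind of bookkeeping that grid client tools typically leave to the user, the following is a minimal, generic sketch of tracking the state of a large set of jobs. It is not aCT's API or design: the class names, state names, transition table, and retry policy are all hypothetical and chosen only for illustration, and no real submission to a computing element takes place.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    # Hypothetical job states, not taken from aCT.
    TO_SUBMIT = "tosubmit"
    SUBMITTED = "submitted"
    RUNNING = "running"
    FINISHED = "finished"
    FAILED = "failed"


# Allowed state transitions (an assumption made for this sketch).
TRANSITIONS = {
    State.TO_SUBMIT: {State.SUBMITTED, State.FAILED},
    State.SUBMITTED: {State.RUNNING, State.FAILED},
    State.RUNNING: {State.FINISHED, State.FAILED},
    State.FINISHED: set(),
    State.FAILED: {State.TO_SUBMIT},
}


@dataclass
class Job:
    job_id: int
    description: str
    state: State = State.TO_SUBMIT
    attempts: int = 0


class JobManager:
    """In-memory job table; a real manager would persist this state."""

    def __init__(self, max_attempts=3):
        self.jobs = {}
        self.max_attempts = max_attempts

    def add(self, description):
        # Register a new job and return its identifier.
        job_id = len(self.jobs) + 1
        self.jobs[job_id] = Job(job_id, description)
        return job_id

    def advance(self, job_id, new_state):
        # Move a job to a new state, rejecting transitions not in the table.
        job = self.jobs[job_id]
        if new_state not in TRANSITIONS[job.state]:
            raise ValueError(
                f"illegal transition {job.state.name} -> {new_state.name}"
            )
        if new_state is State.SUBMITTED:
            job.attempts += 1
        job.state = new_state

    def resubmit_failed(self):
        # Queue failed jobs again unless they have used up their attempts.
        retried = []
        for job in self.jobs.values():
            if job.state is State.FAILED and job.attempts < self.max_attempts:
                job.state = State.TO_SUBMIT
                retried.append(job.job_id)
        return retried

    def summary(self):
        # Count jobs per state name.
        return Counter(job.state.name for job in self.jobs.values())
```

With this sketch, a few thousand jobs can be registered in a loop with `add`, moved through their states with `advance`, and inspected with `summary`. An in-memory dictionary keeps the example short; a manager meant to run for days would need to store the job table persistently.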

Primary authors

Andrej Filipcic (Jozef Stefan Institute (SI)) David Cameron (University of Oslo (NO)) Jon Kerr Nilsen (University of Oslo (NO))
