Job schedulers in high energy physics require accurate information about the predicted resource consumption of a job in order to assign it to the most suitable available resources. For example, job schedulers evaluate the runtime, the number of requested cores, and the required memory and disk space. Users therefore specify this information when submitting their jobs and workflows. Yet the information provided by users cannot be considered entirely accurate. There are several reasons for this, including the heterogeneity of resources, regular changes to the underlying workflows, external dependencies, or even a lack of knowledge. This inaccuracy can result in inefficient utilisation of assigned resources, either by blocking resources that remain unused or by exceeding the reserved resources.
With the increasing demand for integrating opportunistic resources to extend the available WLCG computing resources, the accuracy of predicted resource consumption is of particular importance. Only an accurate prediction of resource consumption enables a proper selection, allocation, and integration of resources that minimises the overall costs. We therefore propose to refine the resource consumption indicated by end-users with predictions, in order to improve the utilisation of allocated opportunistic resources.
In this contribution, we present our results and the impact of our prediction for both end-user workflows and production workflows, including pilot jobs. Our work focuses on CPU and memory consumption but presents a generic approach that is ready to be applied to other resources, such as GPUs, in the future.