Speaker
Dr
Jiri Chudoba
(Institute of Physics, Prague)
Description
Many computing farms manage their local batch system with PBSPro, its free
version OpenPBS, or the Torque and Maui products. These packages are delivered
with graphical tools for a status overview, but they do not provide summary or
detailed reports from the accounting log files. This poster describes PLAT, a set
of tools we use for an overview of resource consumption over the last few hours
and days. The tools can be run regularly to monitor finished jobs. They can send
an alarm when a given condition occurs: typically, a large number of very short
jobs on a single worker node, which is a sign of a misconfigured node. No database
is needed to run these tools. This simplifies the installation and use of PLAT,
but limits it to statistics over tens of thousands of jobs.
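
The short-job alarm described above can be illustrated with a minimal sketch. It is not PLAT's actual code. It assumes the usual PBS/Torque accounting record layout (`timestamp;record type;job id;key=value ...`, with `start`, `end` and `exec_host` fields in the `E` records written when a job ends). The function names and both thresholds are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Assumed thresholds, for illustration only.
SHORT_JOB_SECONDS = 60   # a job shorter than this counts as "very short"
MAX_SHORT_JOBS = 20      # short jobs per node that trigger an alarm


def parse_end_record(line):
    """Return (node, duration_seconds) for an 'E' (job end) record, else None."""
    parts = line.rstrip("\n").split(";", 3)
    if len(parts) != 4 or parts[1] != "E":
        return None
    # The message part is a space-separated list of key=value pairs.
    # Values containing spaces are not handled by this simple split.
    fields = dict(
        item.split("=", 1) for item in parts[3].split() if "=" in item
    )
    try:
        duration = int(fields["end"]) - int(fields["start"])
        # exec_host looks like "wn01/0+wn01/1"; keep the first host name.
        node = fields["exec_host"].split("+")[0].split("/")[0]
    except (KeyError, ValueError):
        return None
    return node, duration


def suspicious_nodes(lines, short=SHORT_JOB_SECONDS, limit=MAX_SHORT_JOBS):
    """Map each node with at least `limit` short jobs to its short-job count."""
    counts = Counter()
    for line in lines:
        record = parse_end_record(line)
        if record is not None and record[1] < short:
            counts[record[0]] += 1
    return {node: n for node, n in counts.items() if n >= limit}
```

Because the counts are kept in memory and rebuilt on every run, no database is involved. A periodic job could pass the lines of a recent accounting file to `suspicious_nodes` and send a notification whenever the result is non-empty.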
Primary author
Dr
Jiri Chudoba
(Institute of Physics, Prague)