Speaker
Dr
Federico Versari
(INFN)
Description
CNAF Tier-1, composed of almost 1000 worker nodes and nearly 40000 cores, completed its migration to HTCondor more than one year ago. After adapting the existing monitoring tools (built with Sensu, Influx and Grafana) to work with the new batch system, we started an effort to collect a richer, more “condor-oriented” set of metrics that provide better insight into the pool status.
Moreover, we developed a similar tool that collects bare-metal information, giving sysadmins a global view of hardware (IPMI) events across the farm.
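
As a rough illustration of what a “condor-oriented” metric might look like, the sketch below aggregates slot counts and cores per slot state and formats them as InfluxDB line protocol, ready to be shipped to a time-series backend. It is not the tool described in this contribution. The function name `slots_to_line_protocol`, the measurement name `condor_slots`, and the sample data are hypothetical. The `State` and `Cpus` keys mirror standard HTCondor slot ClassAd attributes, and the input is assumed to be a list of plain dictionaries already fetched from the collector.

```python
from collections import defaultdict


def slots_to_line_protocol(slot_ads, timestamp_ns, measurement="condor_slots"):
    """Aggregate slot ads by State and render InfluxDB line protocol lines.

    slot_ads: iterable of dicts with (at least) "State" and "Cpus" keys.
    timestamp_ns: integer timestamp in nanoseconds since the epoch.
    measurement: hypothetical measurement name, chosen for illustration.
    """
    totals = defaultdict(lambda: {"slots": 0, "cpus": 0})
    for ad in slot_ads:
        # Slots missing a State are grouped under "Unknown".
        state = ad.get("State", "Unknown")
        totals[state]["slots"] += 1
        totals[state]["cpus"] += int(ad.get("Cpus", 0))

    lines = []
    for state in sorted(totals):
        counts = totals[state]
        # Line protocol: measurement,tag=value field=value,... timestamp
        # Integer field values carry a trailing "i".
        lines.append(
            f"{measurement},state={state} "
            f"slots={counts['slots']}i,cpus={counts['cpus']}i {timestamp_ns}"
        )
    return lines


# Hypothetical sample data standing in for slot ads from the pool.
sample_ads = [
    {"Name": "slot1@wn-001", "State": "Claimed", "Cpus": 8},
    {"Name": "slot2@wn-001", "State": "Claimed", "Cpus": 4},
    {"Name": "slot1@wn-002", "State": "Unclaimed", "Cpus": 16},
]

for line in slots_to_line_protocol(sample_ads, 1700000000000000000):
    print(line)
```

Tagging each point with the slot state keeps the measurement easy to group and stack in a Grafana panel. A real deployment would likely add further tags (for example, the worker node or accounting group), depending on which views of the pool are needed.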
Desired slot length | 15 |
---|---|
Speaker release | Yes |
Primary author
Dr
Federico Versari
(INFN)