Oct 25 – 29, 2021
Europe/Zurich timezone

Hardware and batch monitoring at CNAF

Oct 26, 2021, 7:25 PM
25m
Online workshop

Computing & Batch Services

Speaker

Dr Federico Versari (INFN)

Description

The CNAF Tier-1, comprising almost 1000 worker nodes and nearly 40000 cores, completed its migration to HTCondor more than a year ago. After adapting the existing monitoring tools (built with Sensu, Influx and Grafana) to work with the new batch system, we started an effort to collect a richer, more HTCondor-oriented set of metrics that provide better insight into the pool status.
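
As a rough illustration of what an HTCondor-oriented metric might look like, the sketch below counts slots by their State attribute and renders the result as a single record in InfluxDB line protocol. This is not the tool described in the talk: the function name, the measurement name "htcondor_slots", the "pool" tag and the sample slot data are all hypothetical, and a real collector would obtain the slot attributes from the pool itself rather than from a hard-coded list.

```python
from collections import Counter


def slot_state_line(slots, pool, timestamp_ns):
    """Render per-state slot counts as one InfluxDB line-protocol record.

    `slots` is a list of dicts carrying a "State" key, mirroring the State
    attribute of HTCondor machine ClassAds. The measurement and tag names
    are illustrative assumptions, not taken from the talk.
    """
    if not slots:
        # A line-protocol record needs at least one field.
        raise ValueError("no slots to report")
    counts = Counter(slot.get("State", "Unknown") for slot in slots)
    # Sort the states so the output is deterministic; the trailing "i"
    # marks each value as an integer in line protocol.
    fields = ",".join(
        f"{state.lower()}={n}i" for state, n in sorted(counts.items())
    )
    return f"htcondor_slots,pool={pool} {fields} {timestamp_ns}"


# Hypothetical sample data standing in for a pool query result.
sample_slots = [
    {"Name": "slot1@wn001", "State": "Claimed"},
    {"Name": "slot2@wn001", "State": "Claimed"},
    {"Name": "slot1@wn002", "State": "Unclaimed"},
]

print(slot_state_line(sample_slots, "example", 1635269100000000000))
# prints: htcondor_slots,pool=example claimed=2i,unclaimed=1i 1635269100000000000
```

A record in this shape can be written to a time-series database and plotted per state in a dashboard; keeping one field per state makes it straightforward to chart claimed against unclaimed slots over time.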

We also developed a similar tool that collects bare-metal information, giving sysadmins a global view of hardware (IPMI) events across the farm.

Desired slot length 15
Speaker release Yes
