Donato De Girolamo (INFN), Stefano Zani
A monitoring and alert system is fundamental to the management and operation of the network in a large data center such as an LHC Tier-1. The network of the INFN Tier-1 at CNAF is a multi-vendor environment: several tools have been adopted for its management and monitoring, and different sensors have been developed. In this paper, after an overview of the aspects to be monitored and the tools used to monitor them (e.g. MRTG, Nagios, Arpwatch, NetFlow, Syslog), we describe "NetBoard", a monitoring toolkit developed at the INFN Tier-1. NetBoard, designed for a multi-vendor network, can install and auto-configure all the tools needed to monitor it, using a network device discovery mechanism, a configuration file, or a wizard. In this way, different types of sensors and Nagios checks can be activated according to each equipment vendor's specifications. Moreover, when a new device is connected to the LAN, NetBoard can detect where it is plugged in. Finally, the NetBoard web interface shows the overall status of the entire network "at a glance": local and geographical link utilization (including the LHCOPN and LHCONE), the health status of network devices (with active alerts), and flow analysis.
Donato De Girolamo (INFN)