Oct 16 – 20, 2017
KEK
Asia/Tokyo timezone

Integrated Monitoring results at IHEP

Oct 19, 2017, 3:15 PM
25m
KEK

1-1 Oho, Tsukuba, Ibaraki 305-0801 Japan 36°09'01.0"N 140°04'28.1"E 36.150290, 140.074485
Basic IT Services

Speaker

Mr Qingbao Hu (IHEP)

Description

Various cluster monitoring tools have been adapted or developed at IHEP, each showing the health status of one device or one aspect of the IHEP computing platform separately. For example, Ganglia shows machine load, Nagios monitors service status, and the Job-monitor tool developed by IHEP tracks figures such as the job success rate. However, the monitoring data from these tools are independent of one another and are not easy to analyze in relation to each other. Integrating and analyzing the monitoring data from multiple sources can provide more valuable information, such as health trends and potential errors.

Integrated Monitoring Tools are now deployed at IHEP; they collect Ganglia, Nagios, syslog and other monitoring metrics. Several cluster monitoring projects based on these Integrated Monitoring Tools have been put into use at IHEP.
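
As a rough illustration of why combining sources helps, the sketch below normalizes records from several tools into one schema and groups them by host and time window, so that a high load and a low job success rate on the same node can be seen together. It is not the IHEP implementation: the `Metric` record, the `correlate` and `suspicious` helpers, the thresholds and all sample values are hypothetical and chosen only for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    """One normalized record; this schema is an assumption for illustration."""
    source: str      # originating tool, e.g. "ganglia", "nagios", "job-monitor"
    host: str
    timestamp: int   # seconds since the epoch
    name: str
    value: float


def correlate(metrics, window=60):
    """Group metrics from all sources by (host, start of time window)."""
    buckets = defaultdict(dict)
    for m in metrics:
        key = (m.host, m.timestamp - m.timestamp % window)
        buckets[key][f"{m.source}.{m.name}"] = m.value
    return dict(buckets)


def suspicious(buckets, load_limit=8.0, min_success=0.9):
    """Return keys where high load coincides with a low job success rate."""
    flagged = []
    for key, values in sorted(buckets.items()):
        load = values.get("ganglia.load_one")
        success = values.get("job-monitor.success_rate")
        if load is None or success is None:
            continue  # cannot correlate without both sources
        if load > load_limit and success < min_success:
            flagged.append(key)
    return flagged


# Made-up sample records, not real IHEP data.
sample = [
    Metric("ganglia", "node01", 1200, "load_one", 12.5),
    Metric("job-monitor", "node01", 1215, "success_rate", 0.62),
    Metric("nagios", "node01", 1230, "service_ok", 1.0),
    Metric("ganglia", "node02", 1205, "load_one", 1.2),
    Metric("job-monitor", "node02", 1220, "success_rate", 0.99),
]

buckets = correlate(sample)
flagged = suspicious(buckets)
print(flagged)
```

Neither signal alone identifies a problem here; only the grouped view ties the load reported by one tool to the job failures reported by another on the same host.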

Desired length 20 minutes
