9-13 July 2018
Sofia, Bulgaria
Europe/Sofia timezone

Trident: A three pronged approach to analysing node utilisation

10 Jul 2018, 16:00
1h

National Culture Palace, Boulevard "Bulgaria", 1463 NDK, Sofia, Bulgaria
Poster Track 8 – Networks and facilities Posters

Speaker

Servesh Muralidharan (CERN)

Description

We describe the development of a tool (Trident) that takes a three-pronged approach to analysing node utilisation while aiming to be user-friendly. The three areas of focus are data IO, CPU core and memory.

Compute applications running on a batch system node stress different parts of the node over time. It is usual to look at metrics such as CPU load average and memory consumed. However, these metrics often do not provide enough information to form a detailed picture of how the system is performing, and in most cases performance problems cannot be detected from them.

Monitoring and collecting further performance metrics in near real time is intended to give a better understanding of compute demands and of which changes can improve utilisation. We are investigating methodologies at the CERN Tier-0 that allow the collection of metrics such as memory bandwidth, detailed CPU core utilisation and active processor cycles. This is done with minimal overhead and without instrumenting the user code. When combined with modern analytics, the metrics can provide information relevant to users, developers and site administrators. The raw metrics are often difficult to interpret, hence the development of a tool that allows the target communities to both collect and interpret resource utilisation data more easily.
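
As a rough illustration of collecting a per-core metric without instrumenting user code, the sketch below derives core utilisation from two snapshots of the Linux /proc/stat counters. This is not Trident's implementation: the abstract does not say how Trident gathers its data, and the function names here (parse_proc_stat, core_utilisation, read_proc_stat) are hypothetical. Metrics such as memory bandwidth and active processor cycles generally need hardware performance counters, which this sketch does not cover.

```python
def parse_proc_stat(text):
    """Return {core_name: (busy_jiffies, total_jiffies)} from /proc/stat text.

    Hypothetical helper for illustration; only per-core lines (cpu0, cpu1, ...)
    are kept, the aggregate "cpu" line is skipped.
    """
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        # Columns: user nice system idle iowait irq softirq steal.
        # The trailing guest columns are already included in user/nice.
        values = [int(v) for v in fields[1:9]]
        idle = values[3] + (values[4] if len(values) > 4 else 0)  # idle + iowait
        total = sum(values)
        counters[fields[0]] = (total - idle, total)
    return counters


def core_utilisation(before, after):
    """Fraction of time each core was busy between two parsed snapshots."""
    utilisation = {}
    for core, (busy_after, total_after) in after.items():
        if core not in before:
            continue
        busy_before, total_before = before[core]
        elapsed = total_after - total_before
        utilisation[core] = (busy_after - busy_before) / elapsed if elapsed > 0 else 0.0
    return utilisation


def read_proc_stat(path="/proc/stat"):
    """Read and parse the live counters (Linux only)."""
    with open(path) as handle:
        return parse_proc_stat(handle.read())
```

Calling read_proc_stat() twice, a second or so apart, and passing both results to core_utilisation() gives one value between 0 and 1 per core. The sampling process only reads kernel counters, so the monitored application is left untouched.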
