9–13 Jul 2018
Sofia, Bulgaria
Europe/Sofia timezone

MONIT: Monitoring the CERN Data Centres and the WLCG Infrastructure

10 Jul 2018, 14:30
15m
Hall 10 (National Palace of Culture)


Presentation, Track 8 – Networks and facilities

Speaker

Alberto Aimar (CERN)

Description

The new unified monitoring (MONIT) for the CERN Data Centres and for the WLCG Infrastructure is now based on established open source technologies for collection, streaming and storage of monitoring data. The previous solutions, based on in-house development and commercial software, are being replaced with widely recognized technologies such as Collectd, Flume, Kafka, ElasticSearch, InfluxDB, Grafana and others. The monitoring infrastructure, fully based on CERN cloud resources, covers the whole workflow of the monitoring data: from collecting and validating metrics and logs to making them available for dashboards, reports and alarms.

The deployment in production of this new DC and WLCG monitoring is well under way and this contribution provides a summary of the progress, hurdles met and lessons learned in using these open source technologies. It also focuses on the choices made to achieve the required levels of stability, scalability and performance of the MONIT monitoring service.

Primary authors

Alberto Aimar (CERN)
Pedro Andrade (CERN)
Dr Edward Karavakis (CERN)
Luca Magnoni (CERN)
Asier Aguado Corman (Universidad de Oviedo (ES))
Javier Delgado Fernandez (CERN)
Borja Garrido Bear (Universidad de Oviedo (ES))
Dominik Marek Kulikowski (Wroclaw University of Science and Technology (PL))
