14–15 Mar 2024
CERN
Europe/Zurich timezone
There is a live webcast for this event.

Long Term Monitoring with Prometheus + Thanos

15 Mar 2024, 14:30
15m
31/3-004 - IT Amphitheatre (CERN)

Speaker

Roberto Valverde Cameselle (CERN)

Description

The Storage and Data Management Group at CERN manages 20 EOS instances, corresponding to almost 1000 servers and 100,000 disks. A good monitoring and alerting system is crucial not only for day-to-day operations but also as a tool for recording how our services evolve over time. This talk presents an overview of the monitoring tools in use, with a particular focus on long-term metric preservation.
