AI Monitoring
Attendance
IT-CF Luis Pigueiras, Massimo Paladin, Miguel Santos, Pedro Andrade, Ivan Fedorko (speaker)
IT-CIS Marek Domaracky
IT-CS Véronique Lefebure
IT-DB Georgios Kaklamanos
IT-DI Denise Heagerty
IT-DSS Alex Iribarren, Jan Iven
IT-OIS Tim Bell, Zilli Stefano
IT-PES Gavin Mccance, Ioannis Agtzidis, Manuel Guijarro, Steve Traylen, Vítor Gouveia
PH-ATLAS Alexey BuzyKaev, Sergey Baranov, Yuri Smirnov
PH-LBC Loic Brarda
PH-LCD Andre Sailer
PH-UCM Ivan Glushkov
Questions
Q - Are you going to provide common tools to check the status of the node?
A - There will be tools: lemon cli, Roger, etc.
Q - Why isn't Roger in the architecture?
A - Roger is a SNOW consumer and doesn't provide any notifications.
Q - What are the plans for the dashboard?
A - The dashboard is a secondary tool that provides an overall view of what happened over the last few days.
Q - How are metrics integrated with FEs? Can we define a metric to reach the service manager?
A - Each metric has its own responsible and an associated FE, with the exception of the hardware metrics: at the moment we don't know where they are supposed to go. It is possible to define metrics per box. The Puppet variables define the responsible.
Q - Who is the metric manager?
A - The metric manager is the person responsible for the egroup.
Q - There are some situations where the desired FE target for many Lemon exceptions in SNOW should be the FE that owns the box (e.g. swap full). What to do?
A - The tickets get redirected to the application owner rather than the operators.
Q - What is the status of the migration of the alarms, and how many alarms are defined in Quattor?
A - Around 500 alarms are defined. Some of them are legacy, and not all of them are going to be migrated.
Q - Where is the definition of the alarms and exceptions?
A - The notification message contains a link where we can check the meaning of the alarm/exception.
Future meetings
The next meeting will be in two weeks and is expected to be about the new development workflow for Configuration Management.