Prometheus is a leading open source monitoring and alerting tool. It uses a pull model: it pulls metrics from monitored entities rather than receiving them as a push. This can be a major headache, even before security comes into play, because reaching your monitored entities often requires network gymnastics. On top of that, the same system metrics sometimes need to be consumed twice (for example, you want to graph them and, at the same time, feed them to your fancy Apache Spark machine learning job).
Luckily, InfluxDB, Prometheus's main market competitor, arrives on the scene to help with its main companion product: Telegraf.
Telegraf is a flexible, low-profile, easily extensible metrics collector with a big community.
In this talk we will offer our insight into implementing a "push"-model monitoring system, based on Telegraf, Kafka and Prometheus, in which metrics can be consumed multiple times. We will look at some pitfalls we met during the implementation, the scaling issues we ran into, and how we overcame them.
We will also cover monitoring our monitoring system.