
Monit - OpenShift Logging Stack Review

Europe/Zurich
31/2-028 (CERN)

Description

The goal of the meeting is to clarify the best path forward for replacing the current OpenShift logging stack, which is based on Fluentd. Several options are available:

  • custom deployment
  • MONIT chart + customizations
  • fluent operator

An important goal is for OKD to be consistent with the Kubernetes clusters used by MONIT and Magnum, so as to minimize duplicated effort.

Zoom Meeting ID
61166131797
Host
Guillermo Facundo Colunga
    • 10:00 - 12:00
      General Discussion

      1. Current Setup Overview (OpenShift)

      • Architecture Structure:
        • OpenShift logging is currently structured into two main layers: data collection and data aggregation. Both layers are deployed on each cluster, and the data aggregation layer sends logs to the IT Monitoring infrastructure via HTTP.
        • The system is Fluentd-based. A proof of concept (PoC) has already been conducted with Fluent Bit as a potential replacement in the data collection layer.
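
The two-layer flow described above can be sketched roughly as follows. This is only an illustrative model in Python, not the actual Fluentd or Fluent Bit configuration: the function names, the record fields (`cluster`, `node`, `producer`) and the endpoint URL are all assumptions made for the example, since the notes do not give them. The request is built but never sent.

```python
import json
from urllib.request import Request

# Hypothetical endpoint: the real IT Monitoring URL is not given in these notes.
MONIT_ENDPOINT = "http://monit.example.invalid/logs"


def collect(raw_lines, cluster, node):
    """Collection layer: turn raw log lines into records tagged with their origin."""
    return [
        {"cluster": cluster, "node": node, "message": line.rstrip("\n")}
        for line in raw_lines
        if line.strip()  # skip empty lines
    ]


def aggregate(batches, producer):
    """Aggregation layer: merge per-node batches and stamp a producer name."""
    return [dict(record, producer=producer) for batch in batches for record in batch]


def build_http_request(records, endpoint=MONIT_ENDPOINT):
    """Build (but do not send) the HTTP POST that would ship the records."""
    body = json.dumps(records).encode("utf-8")
    return Request(
        endpoint,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Two nodes of one (hypothetical) cluster feed a single aggregator.
node_a = collect(["pod started\n", "\n"], cluster="okd-test", node="node-a")
node_b = collect(["pod stopped\n"], cluster="okd-test", node="node-b")
request = build_http_request(aggregate([node_a, node_b], producer="openshift"))
print(request.get_method(), request.full_url, len(json.loads(request.data)))
```

The point of the split is that the collection side can be swapped (for instance from Fluentd to Fluent Bit) without touching the aggregation side, as long as the records handed over keep the same shape.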

      2. Future Goals

      • Objectives for Architecture:
        • Address some issues that have been observed with Fluentd data collection.
        • Maintain the stability and configuration consistency of the aggregation layer, reusing as much of the existing setup as possible.
        • Transition toward a more standardised architecture, aiming for shared knowledge and compatibility across the infrastructure.
      • Configuration Approach:
        • Custom configuration will be addressed later, following architectural standardisation. We will try to identify shared configurations after the initial setup.
      • Note: Metrics collection through the monitoring Helm chart is not planned as part of this setup.

      3. Discussions

      • For the aggregation layer, moving towards Fluent Bit looks reasonable, as it is what the community is using. Different approaches to achieve this were discussed:
        • Custom chart: Not a very attractive option, as it would require extra work reinventing the wheel.
        • Monit Agent + Customisations: For the moment, the monitoring chart uses neither the upstream Fluent Bit chart nor the Fluent Operator, for the sake of simplicity. However, this can be evaluated in more depth and, if it allows more flexibility, it could be the way to go.
        • Fluent Operator vs separate upstream Fluentd and Fluent Bit charts: The monitoring team cannot provide much information here, as they have not tested these options. Alex indicates that he has, and that he prefers using the two separate charts rather than the operator. Moreover, he found that he could not replicate the current Fluentd aggregator config in the Fluent Operator.
      • It would be very valuable to use the same versions of the same technologies on Puppet machines, Kubernetes clusters and OpenShift clusters. This would allow more shared knowledge and better debugging capability in case of issues.

      4. Action Items

      • Monitoring Team Tasks:
        • Explore integrating the existing Fluent Bit chart into the monitoring Helm chart to enable log collection within OpenShift. Success in this step will allow the OpenShift team to test their Fluent Bit configuration.
        • Contact the Magnum team to align on possible requirements and the current setup.