MONIT/CMS Meeting

Name: MONIT/CMS Meeting
Start: 2020-06-15T14:00:00+02:00
End: 2020-06-15T15:00:00+02:00
Location: No location set

Monday 15 Jun 2020, 14:00 → 15:00 Europe/Zurich

- 1
  MONIT Status and Plans
  
  Speaker: Pedro Andrade (CERN)
  
  2020-06-15_MONIT-CMS.pdf
- 2
  CMS Status and Plans
  
  Speakers: Danilo Piparo (CERN), Federica Legger (Universita e INFN Torino (IT)), Valentin Y Kuznetsov (Cornell University (US))
  CMS feedback
  
  stability of the infrastructure, especially during downtimes of CERN services, when monitoring information is most valuable
  
  how to separate it and run it in high-availability mode
  
  since we rely on ES/InfluxDB, we need tutorials on their query languages
  
  as the infrastructure and the number of dashboards grow, we need an easy tool to find relevant information, similar to Google search
  
  it may require data annotation, indexing, etc.
  
  we need to be periodically informed about the R&D efforts and directions MONIT is planning, so that we can contribute to the discussion on these subjects, e.g. via an internal Jira (ticketing system) that we can consult
  
  ability to specify the severity level of tasks/tickets
  
  CMS adaptation to MONIT
  
  overall, we are starting to move more aggressively to the MONIT infrastructure
  
  usage of ES/Kibana/Grafana is growing among different CMS groups
  
  usage of HDFS is mostly up to experts
  
  HDFS workflows are hard to use/write/execute for an average user, so an additional layer may be desirable; Job Monitoring ES+Spark is a good example
  
  we are starting to see growth in usage of the Monit CLI
  
  CMS Plans
  
  expand usage of ES as a primary data store
  
  start validating our data schemas during injection
  
  migrate all CLI tools to Go to avoid dependencies, env setup, etc.
  
  migrate highly populated (high-cardinality) dashboards to an ES+Spark aggregated index
  
  intelligent alert notification system
  
  explore Rumble [1] as a layer on top of HDFS to query the data
  
  [1] https://indico.cern.ch/event/908539/contributions/3822566/attachments/2030916/3398965/2020_05_Rumble.pdf
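
As an illustration of the schema-validation plan item above, a minimal sketch of checking a document against a schema before it is injected could look like the following. The schema, field names, and the `validate` helper are hypothetical and not taken from any CMS or MONIT code; a real setup would more likely use an established schema language.

```python
from typing import Any, Dict, List

# Hypothetical schema for illustration only: field name -> expected Python type.
JOB_SCHEMA: Dict[str, type] = {
    "site": str,
    "status": str,
    "cpu_time_s": float,
    "timestamp": int,
}


def validate(doc: Dict[str, Any], schema: Dict[str, type]) -> List[str]:
    """Return a list of problems; an empty list means the document passes."""
    errors: List[str] = []
    # Every field named in the schema must be present with the expected type.
    for field, expected in schema.items():
        if field not in doc:
            errors.append(f"missing field: {field}")
        elif not isinstance(doc[field], expected):
            got = type(doc[field]).__name__
            errors.append(f"{field}: expected {expected.__name__}, got {got}")
    # Fields not named in the schema are reported rather than silently accepted.
    for field in doc:
        if field not in schema:
            errors.append(f"unexpected field: {field}")
    return errors
```

A producer could call `validate(doc, JOB_SCHEMA)` just before sending and reject or log any document for which the returned list is not empty.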
