HEPiX Batch Monitoring Working Group


Since last meeting: people filled in their monitoring stack description on TWiki, thank you!  https://twiki.cern.ch/twiki/bin/view/HEPIX/BatchsysMonitoring


Jose Caballero set up GitHub: https://github.com/HEPiX-batchmonitoring/

⇒@all: send your GitHub username to Jose so that he can add you





Interests for future? What to get from this meeting?

  • Accounting: what to report back to the experiments

  • TWiki page for collecting ideas for future discussion


Who is using ElasticSearch for logging?

  • CCIN2P3 (Fabien)

  • BNL (Jose)

  • CERN (Jarka/Luis) -- ES for parsed log storage in a subset of use cases, as part of MONIT/Timber; not much for HTCondor logs


Document on a TWiki page the logging patterns used for digesting batch-system logs


Can HTCondor logs be normalized (e.g. for parsing by ElasticSearch)?

  • Not much done on this in recent months

  • Example goal: Be able to follow a job from submission, to execution, to completion

  • Could be a part of gangliad -> metricd work?
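
To make the "follow a job" goal concrete, here is a minimal sketch of what a normalized event could look like. It is not an agreed format: it assumes the classic HTCondor user-log header layout ("000 (cluster.proc.subproc) MM/DD HH:MM:SS message") and only handles event codes 000 (submitted), 001 (executing) and 005 (terminated). The names EVENT_RE, LIFECYCLE, normalize, job_timelines and SAMPLE_LOG, the field names in the output, and the sample log itself are all made up for illustration.

```python
import json
import re
from collections import defaultdict

# Assumed header layout of a classic HTCondor user-log event, e.g.
#   "000 (123.000.000) 07/10 12:00:00 Job submitted from host: <...>"
# Sites that log ISO timestamps would need a different date pattern.
EVENT_RE = re.compile(
    r"^(?P<code>\d{3}) \((?P<cluster>\d+)\.(?P<proc>\d+)\.\d+\) "
    r"(?P<date>\d{2}/\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) (?P<message>.*)$"
)

# Only the three lifecycle steps named in the example goal.
LIFECYCLE = {"000": "submitted", "001": "executing", "005": "terminated"}


def normalize(log_text):
    """Yield one flat dict per lifecycle event header found in log_text."""
    for line in log_text.splitlines():
        match = EVENT_RE.match(line)
        if match is None or match["code"] not in LIFECYCLE:
            continue  # detail lines, "..." separators, other event types
        yield {
            "job_id": f"{int(match['cluster'])}.{int(match['proc'])}",
            "state": LIFECYCLE[match["code"]],
            # The classic format carries no year; kept as-is here.
            "timestamp": f"{match['date']} {match['time']}",
            "message": match["message"],
        }


def job_timelines(events):
    """Group states by job_id, preserving the order they were logged in."""
    timelines = defaultdict(list)
    for event in events:
        timelines[event["job_id"]].append(event["state"])
    return dict(timelines)


# Illustrative sample only, not taken from a real schedd.
SAMPLE_LOG = """\
000 (123.000.000) 07/10 12:00:00 Job submitted from host: <192.0.2.10:9618>
...
001 (123.000.000) 07/10 12:01:30 Job executing on host: <192.0.2.20:9618>
...
005 (123.000.000) 07/10 12:45:02 Job terminated.
\t(1) Normal termination (return value 0)
...
"""

if __name__ == "__main__":
    for doc in normalize(SAMPLE_LOG):
        print(json.dumps(doc))  # one JSON document per line
```

Each yielded dict is flat and keyed by job_id, so the three steps of one job can be selected with a single query once the documents are indexed. Eviction, hold and other event types are deliberately left out of this sketch.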

Agenda:

  • 09:00-09:20 Focus down on Goals / Activities (20m)

  • 09:20-09:45 Monitoring batch logs (ELK / Others) (25m)