21–25 May 2012
New York City, NY, USA
US/Eastern timezone

Message Correlation Analysis Tool for NOvA

22 May 2012, 17:00
25m
Room 804/805 (Kimmel Center)

Parallel Online Computing (track 1) Online Computing

Speaker

Qiming Lu (Fermi National Accelerator Laboratory)

Description

A complex running system, such as the NOvA online data acquisition (DAQ) system, consists of a large number of distributed but closely interacting components. This paper describes a generic real-time correlation analysis and event identification engine named Message Analyzer. Its purpose is to capture run-time abnormalities and recognize system failures based on log messages from the participating components. The initial design of the analysis engine is driven by the DAQ of the NOvA experiment. The Message Analyzer performs filtering and pattern recognition on the log messages and reacts to system failures identified by the associated triggering rules. The tool helps the system maintain a healthy running state and minimize data corruption. This paper also describes a domain-specific language that allows the recognition patterns and correlation rules to be specified in a clear and flexible way. In addition, the engine provides a plugin mechanism for users to implement specialized patterns or rules in general-purpose languages such as C++.
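
The filter, pattern, and triggering-rule flow described above can be illustrated with a small sketch. The abstract gives neither the Message Analyzer's rule syntax nor its plugin interface, so everything here is an assumption made for illustration: the sketch is written in Python rather than the tool's domain-specific language or C++, and the names `LogMessage`, `CorrelationRule`, `min_sources`, and `window_s` are hypothetical. The rule shown fires when messages matching a pattern arrive from several distinct components within a short time window, and it assumes messages are fed in time order.

```python
import re
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, List, Pattern


@dataclass
class LogMessage:
    """Hypothetical log record; the real message format is not given in the abstract."""
    timestamp: float  # seconds
    source: str       # name of the component that emitted the message
    severity: str     # e.g. "INFO", "WARNING", "ERROR"
    text: str


@dataclass
class CorrelationRule:
    """Fires when matching messages come from enough distinct sources within a window."""
    name: str
    pattern: Pattern[str]
    min_sources: int    # distinct components required to trigger
    window_s: float     # length of the correlation window, in seconds
    action: Callable[[str, List[LogMessage]], None]
    _hits: Deque[LogMessage] = field(default_factory=deque, init=False, repr=False)

    def feed(self, msg: LogMessage) -> bool:
        """Process one message; return True if the rule triggered."""
        # Filtering: ignore low-severity messages.
        if msg.severity not in ("WARNING", "ERROR"):
            return False
        # Pattern recognition: keep only messages whose text matches.
        if self.pattern.search(msg.text) is None:
            return False
        self._hits.append(msg)
        # Drop matches that have fallen out of the time window.
        while msg.timestamp - self._hits[0].timestamp > self.window_s:
            self._hits.popleft()
        # Correlation: count the distinct components still in the window.
        sources = {m.source for m in self._hits}
        if len(sources) >= self.min_sources:
            self.action(self.name, list(self._hits))  # react to the failure
            self._hits.clear()
            return True
        return False


if __name__ == "__main__":
    # Made-up component names and messages, for demonstration only.
    rule = CorrelationRule(
        name="multi-node-timeout",
        pattern=re.compile(r"timeout", re.IGNORECASE),
        min_sources=2,
        window_s=5.0,
        action=lambda name, msgs: print(name, sorted({m.source for m in msgs})),
    )
    for m in [
        LogMessage(0.0, "node-01", "ERROR", "read timeout"),
        LogMessage(1.5, "node-02", "WARNING", "Timeout waiting for data"),
    ]:
        rule.feed(m)
```

Keeping the reaction as a callable separate from the matching logic mirrors the split the abstract draws between recognition patterns and triggering rules. A real engine would also need to handle out-of-order messages and many rules running concurrently, which this sketch does not attempt.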

Author

Qiming Lu (Fermi National Accelerator Laboratory)

Co-authors

James Kowalkowski (Fermi National Accelerator Laboratory (FNAL))
Kurt Biery (CMS/Fermilab)
