Speaker
P. Conde MUINO
(CERN)
Description
During the running of any experiment, a central monitoring system that
detects problems as soon as they appear plays an essential role. In a large
experiment such as ATLAS, the online data acquisition system is
distributed across the nodes of large farms, each node running several
processes that analyse a fraction of the events. In this architecture, a
central process is needed to collect the monitoring data from the
different nodes, produce full-statistics histograms and analyse them.
In this paper we present the design of such a system, called the "gatherer". It
can collect any monitoring object, such as histograms, from the farm nodes and
from any process in the DAQ, trigger and reconstruction chain. It also sums the
statistics, if required, and runs user-defined algorithms
to analyse the monitoring data. The results are sent to a centralized display, which
shows the information online, and to the archiving system; alarms are raised in case
of problems.
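
The gathering step described above can be illustrated with a minimal sketch. It is not the gatherer's actual code: the `Histogram` class, the functions `gather` and `analyse`, and the example check `empty_bin_check` are all hypothetical names chosen for illustration, and histograms are assumed to be simple lists of bin contents with identical binning on every node.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Histogram:
    """Hypothetical fixed-binning histogram published by one farm process."""
    name: str
    bins: List[float]


def gather(published: List[Histogram]) -> Dict[str, Histogram]:
    """Sum, bin by bin, all published histograms that share a name."""
    summed: Dict[str, Histogram] = {}
    for hist in published:
        if hist.name not in summed:
            # First copy seen: start the running total from a copy of its bins.
            summed[hist.name] = Histogram(hist.name, list(hist.bins))
            continue
        total = summed[hist.name]
        if len(total.bins) != len(hist.bins):
            raise ValueError(f"binning mismatch for {hist.name!r}")
        total.bins = [a + b for a, b in zip(total.bins, hist.bins)]
    return summed


# A user-defined algorithm returns an alarm message, or None if all is well.
Check = Callable[[Histogram], Optional[str]]


def empty_bin_check(hist: Histogram) -> Optional[str]:
    """Example user algorithm (an assumption): flag any empty bin."""
    if any(b == 0 for b in hist.bins):
        return f"{hist.name}: empty bin found"
    return None


def analyse(summed: Dict[str, Histogram], checks: List[Check]) -> List[str]:
    """Run every user check on every summed histogram; collect the alarms."""
    alarms: List[str] = []
    for hist in summed.values():
        for check in checks:
            message = check(hist)
            if message is not None:
                alarms.append(message)
    return alarms
```

With two nodes each publishing a histogram named `cluster_energy`, `gather` would return one summed histogram, and `analyse` would return the list of alarm messages to forward to the display and the archive.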
The innovation of our approach is that it abstracts the
underlying communication protocols: the gatherer can talk to different
processes using different protocols at the same time and is therefore
highly flexible. The software is easily adaptable to any trigger-DAQ system.
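
One way to picture this protocol abstraction is a common interface with one implementation per protocol. The sketch below is an assumption made for illustration, not the gatherer's design: `Source`, `DictSource`, `TextLineSource` and `collect` are hypothetical names, and the two "protocols" are in-memory stand-ins rather than real network transports.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterable, List


class Source(ABC):
    """Hypothetical protocol-neutral view of one monitored process."""

    @abstractmethod
    def fetch(self) -> Dict[str, List[float]]:
        """Return published histograms as a name -> bin contents mapping."""


class DictSource(Source):
    """Stand-in for a protocol whose data already arrive as a mapping."""

    def __init__(self, payload: Dict[str, List[float]]) -> None:
        self._payload = payload

    def fetch(self) -> Dict[str, List[float]]:
        return {name: list(bins) for name, bins in self._payload.items()}


class TextLineSource(Source):
    """Stand-in for a second protocol sending 'name:v1,v2,...' text lines."""

    def __init__(self, lines: List[str]) -> None:
        self._lines = lines

    def fetch(self) -> Dict[str, List[float]]:
        result: Dict[str, List[float]] = {}
        for line in self._lines:
            name, _, values = line.partition(":")
            result[name.strip()] = [float(v) for v in values.split(",")]
        return result


def collect(sources: Iterable[Source]) -> Dict[str, List[float]]:
    """Poll every source through the common interface and sum bin by bin."""
    total: Dict[str, List[float]] = {}
    for source in sources:
        for name, bins in source.fetch().items():
            if name not in total:
                total[name] = list(bins)
            elif len(total[name]) != len(bins):
                raise ValueError(f"binning mismatch for {name!r}")
            else:
                total[name] = [a + b for a, b in zip(total[name], bins)]
    return total
```

Because `collect` only calls `fetch`, supporting another protocol in this sketch would mean adding one more `Source` subclass, with no change to the summing code.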
The first prototype of the gathering system has been implemented for ATLAS and will
run during this year's combined test beam.
An evaluation of this first prototype will also be presented.