Speaker
Andrei Kazarov
(Petersburg Nuclear Physics Institute (PNPI))
Description
In order to meet the requirements of ATLAS data taking, the ATLAS Trigger-DAQ system
is composed of O(1000) applications running on more than 2000 networked computers.
At this scale, software and hardware failures are frequent. To minimize
system downtime, the Trigger-DAQ control system must include advanced verification
and diagnostic facilities. The operator should use the tests and expertise of the TDAQ
and detector developers to diagnose and recover from errors, automatically where
possible. The TDAQ control system is built as a distributed tree of controllers,
where the behavior of each controller is defined in a rule-based language that allows
easy customization. The control system also includes a verification framework that allows
users to develop and configure tests of varying complexity for any component in the
system. It can be used as a stand-alone test facility for a small
detector installation, as part of the general TDAQ initialization procedure, and for
diagnosing problems that may occur at run time. The system is currently
being used in TDAQ commissioning at the ATLAS pit and by subdetectors for stand-alone
verification of the hardware before it is finally installed. The paper describes the
architecture and implementation of the TDAQ control system, with emphasis on the new
features developed for the verification framework at the request of users during
its use in a real environment. Results from scalability tests performed in
2005 are also presented.
Primary authors
Dr
Alina Corso-Radu
(European Organization for Nuclear Research (CERN))
Andrei Kazarov
(Petersburg Nuclear Physics Institute (PNPI))
Dr
Marc Dobson
(CERN)