Speaker
Mrs Doris Burckhart (CERN)
Description
The ATLAS Data Acquisition (DAQ) and High Level Trigger (HLT) software system will
initially comprise 2000 PC nodes that take part in the control, event readout,
second-level trigger and event filter operations. This large number of PCs will not
be purchased until just before data taking in 2007. The large CERN IT lxbatch facility provided
the opportunity to run online functionality tests in July 2005 over a period of 5
weeks on a farm whose size was increased stepwise from 100 up to 700 dual-processor PC nodes. The
interplay of the control and monitoring software with the event readout, event
building and trigger software was exercised for the first time as an integrated
system on this large scale. Also new was running algorithms in the online environment
for the trigger selection and in the event filter processing tasks on a larger scale.
A mechanism was developed to package the offline software together with the
DAQ/HLT software and to distribute it efficiently to this large PC cluster via
peer-to-peer software. The findings obtained during the tests led to many immediate
improvements in the software. Trend analysis made it possible to identify critical areas.
Successfully running an online system on a cluster of 700 nodes was found to be
especially sensitive to the reliability of the farm as well as of the DAQ/HLT system
itself, and future development will concentrate on fault tolerance and stability.
Primary author
Mrs Doris Burckhart (CERN)