Integration of Intelligence and Redundancy Elements into the FPGA-Based DAQ of the COMPASS Experiment

13 Sept 2017, 12:20
25m
Thimann I lecture hall (UCSC)

Oral: Systems, Planning, Installation, Commissioning and Running Experience

Speaker

Dominik Steffen (Technische Universitaet Muenchen (DE))

Description

Using FPGA technology for event building tasks in high-energy physics experiments reduces costs and increases the reliability of DAQ systems. In 2014, the COMPASS experiment at CERN’s SPS commissioned a novel, intelligent, FPGA-based DAQ (iFDAQ) in which event building is entirely performed by FPGA cards. The highly scalable system is designed to cope with an on-spill data rate of 1.5 GB/s and a sustained data rate of 500 MB/s. Its intelligent and highly reliable hardware event builder is able to detect and handle front-end errors and automatically take corrective action. The contribution will give an overview of system details, performance, and running experience.
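
For orientation, the relation between the quoted on-spill and sustained rates follows from the SPS duty cycle. A minimal sketch, assuming for illustration a roughly 10 s spill within a 30 s supercycle (the actual cycle structure varies):

```cpp
#include <iostream>

int main() {
    // Illustrative numbers only: the real SPS supercycle structure varies.
    const double spill_length_s = 10.0;    // assumed on-spill duration
    const double cycle_length_s = 30.0;    // assumed full SPS supercycle
    const double on_spill_rate_gbs = 1.5;  // GB/s during the spill

    // Buffering lets the DAQ spread the on-spill data over the whole cycle.
    const double duty_factor = spill_length_s / cycle_length_s;
    const double sustained_rate_gbs = on_spill_rate_gbs * duty_factor;

    std::cout << "Sustained output rate: " << sustained_rate_gbs * 1000.0
              << " MB/s\n"; // ~500 MB/s with the assumed cycle
    return 0;
}
```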

Summary

Driven by the need for a highly scalable and high-performance computing architecture for data acquisition, the COMPASS experiment at CERN’s Super Proton Synchrotron (SPS) developed a new Data Acquisition System (DAQ) from scratch, using a novel approach to the event building network. The new system and its event builder exploit the application-optimized computation technology of Field Programmable Gate Arrays (FPGAs). In contrast to traditional event builders, which are based on distributed online computers interconnected via a Gigabit Ethernet network, the event building task is executed solely in hardware. Recent developments in FPGA technology, such as increased I/O bandwidth (> 3 Gbps) and support for high-performance SDRAMs even on low-cost chips, have made FPGAs suitable for event building. Reduced costs, higher reliability, and increased compactness are the main arguments for moving from traditional to FPGA-based event builders in the future.

COMPASS commissioned its intelligent, FPGA-based DAQ (iFDAQ) in 2014, when a reduced spectrometer required only a reduced event builder. During the following years, the system was extended and new features were added. In 2017, the system will be deployed at full scale and will be able to cope with the expected on-spill data rate of 1.5 GB/s.

By buffering data at different levels, the iFDAQ exploits the spill structure of the SPS beam and averages the on-spill data rate over the whole SPS duty cycle to a sustained rate of 500 MB/s. The system uses a hybrid FPGA-software approach: the event building task is performed entirely by FPGAs, whereas the software is responsible for system control, user interfaces, configuration, and monitoring. The hardware event builder consists of multipurpose, custom-designed FPGA cards equipped with 4 GB of DDR3 memory and 16 high-speed links. The event builder receives data from the front-end electronics via approximately 100 optical serial interfaces. It buffers and multiplexes the data, combines event fragments into complete events, and finally distributes them to eight readout computers via FPGA PCIe cards. All hardware nodes are synchronized by the Trigger Control System (TCS). Monitoring and control of the hardware nodes are possible via a dedicated Ethernet network using the IPbus protocol. Using three independent interfaces for slow control (IPbus), synchronization (TCS), and data flow (SLINK) increases the robustness of the system.
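
As a software analogue of the hardware event building described above (the real logic runs in FPGA firmware), the following sketch combines event fragments from several input links into complete events; the names and the fragment format are hypothetical:

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <optional>
#include <vector>

// Hypothetical fragment as delivered by one front-end link.
struct Fragment {
    uint32_t event_number;          // assigned via the Trigger Control System
    uint16_t source_id;             // which input link sent it
    std::vector<uint8_t> payload;
};

// Minimal builder: buffer fragments per event number and emit a complete
// event once every expected source has contributed.
class EventBuilder {
public:
    explicit EventBuilder(std::size_t n_sources) : n_sources_(n_sources) {}

    // Returns the assembled event when the last fragment arrives.
    std::optional<std::vector<Fragment>> add(Fragment f) {
        auto& frags = pending_[f.event_number];
        frags.push_back(std::move(f));
        if (frags.size() == n_sources_) {
            std::vector<Fragment> complete = std::move(frags);
            pending_.erase(complete.front().event_number);
            return complete;
        }
        return std::nullopt;
    }

private:
    std::size_t n_sources_;
    std::map<uint32_t, std::vector<Fragment>> pending_;
};
```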
Built-in intelligence allows the system to react dynamically to faulty front-end modules. To ensure system stability and data integrity, excessively high data rates are throttled, and wrongly formatted or missing data is replaced by empty but correctly formatted frames. Moreover, the steady data stream enables the system to continuously check the status of front-end modules and automatically point the user to problematic equipment.
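
The error-handling idea can be pictured in software terms (the iFDAQ implements it in firmware): a malformed frame is replaced by an empty but correctly formatted placeholder so that event building can continue and the fault is flagged for monitoring. The frame layout below is invented for illustration:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Invented frame layout for illustration only.
struct Frame {
    uint32_t event_number;
    uint16_t source_id;
    bool     error_flag;           // set when the payload had to be discarded
    std::vector<uint8_t> payload;
};

// Loose plausibility check standing in for the firmware's format checks.
bool looks_valid(const Frame& f, std::size_t max_payload_bytes) {
    return !f.payload.empty() && f.payload.size() <= max_payload_bytes;
}

// Replace a bad frame by an empty, correctly formatted one so the event
// stays complete and the problem is visible to the monitoring.
Frame sanitize(const Frame& f, std::size_t max_payload_bytes) {
    if (looks_valid(f, max_payload_bytes)) return f;
    Frame placeholder{};
    placeholder.event_number = f.event_number;
    placeholder.source_id = f.source_id;
    placeholder.error_flag = true;  // points the user to the faulty module
    return placeholder;
}
```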
From 2017 on, all point-to-point high-speed links between the front-end electronics, the hardware event builder, and the readout computers are routed through a fully programmable crosspoint switch. This allows the user to remotely customize the network topology and hence simplifies both compensation for hardware failures and load balancing. In a second step, the intelligent hardware will recognize load imbalance and malfunctioning hardware nodes on its own and automatically take appropriate action.
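
The effect of the programmable crosspoint switch can be modeled as a remotely writable routing table from input ports (front-end links) to output ports (event-builder inputs). The sketch below is purely illustrative and does not reflect the actual switch interface:

```cpp
#include <cstdint>
#include <map>
#include <stdexcept>

// Illustrative model of a crosspoint switch: each input port is mapped to
// exactly one output port, and the mapping can be rewritten remotely.
class CrosspointSwitch {
public:
    void connect(uint8_t input_port, uint8_t output_port) {
        routing_[input_port] = output_port;
    }

    // Re-route all links of a failed output (e.g. a broken event-builder
    // card) to a spare output, compensating for the hardware failure.
    void reroute_output(uint8_t failed_output, uint8_t spare_output) {
        for (auto& entry : routing_) {
            if (entry.second == failed_output) entry.second = spare_output;
        }
    }

    uint8_t output_for(uint8_t input_port) const {
        auto it = routing_.find(input_port);
        if (it == routing_.end()) throw std::out_of_range("unconnected input");
        return it->second;
    }

private:
    std::map<uint8_t, uint8_t> routing_;
};
```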

Primary author

Dominik Steffen (Technische Universitaet Muenchen (DE))

Co-authors

Dmytro Levit (Technische Universitaet Muenchen (DE))
Igor Konorov (Technische Universitaet Muenchen (DE))
Josef Novy (Czech Technical University (CZ))
Martin Bodlak (Charles University (CZ))
Miroslav Virius (Czech Technical University (CZ))
Ondrej Subrt (Czech Technical University (CZ))
Stefan Huber (Technische Universitaet Muenchen (DE))
Vladimir Frolov (Joint Institute for Nuclear Research (RU))
Vladimir Jary (Czech Technical University (CZ))
