10–14 Oct 2016
San Francisco Marriott Marquis
America/Los_Angeles timezone

Online computing architecture for the CBM experiment at FAIR

11 Oct 2016, 15:00
15m
Sierra A (San Francisco Marriott Marquis)

Oral Track 1: Online Computing

Description

The Compressed Baryonic Matter (CBM) experiment is currently under construction at the upcoming FAIR accelerator facility in Darmstadt, Germany. Searching for rare probes, the experiment requires complex online event selection criteria at a high event rate.

To achieve this, all event selection is performed in a large online processing farm of several hundred nodes, the "First-level Event Selector" (FLES). This compute farm will consist primarily of standard PC components including GPGPUs and many-core architectures. The data rate at the input to this compute farm is expected to exceed 1 TByte/s of time-stamped signal messages from the detectors. The distributed input interface will be realized using custom FPGA-based PCIe add-on cards, which preprocess and index the incoming data streams.

At event rates of up to 10 MHz, data from several events overlap in time, so there is no a priori assignment of data messages to events. Instead, event recognition is performed in combination with track reconstruction. A new container data format makes the data of a specific time interval self-contained, so that data segments can be distributed across the farm and processed independently. This allows the event reconstruction and analysis code to be optimized without additional networking overhead and aids parallel computation in the online analysis task chain.
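As an illustration of the idea of self-contained time intervals, the following minimal Python sketch groups time-stamped messages into fixed-length segments. The function name, the (timestamp, payload) message layout, and in particular the overlap region copied into the preceding segment are assumptions made for this example; the abstract does not specify how the container format handles interval boundaries.

```python
from collections import defaultdict


def split_into_time_slices(messages, slice_ns, overlap_ns):
    """Group (timestamp_ns, payload) messages into time slices.

    Slice k covers [k * slice_ns, (k + 1) * slice_ns + overlap_ns).
    Messages that fall into the first overlap_ns of slice k are also
    copied to the end of slice k - 1 (hypothetical boundary handling),
    so each slice can be processed without looking at its neighbours.
    """
    if not 0 <= overlap_ns < slice_ns:
        raise ValueError("overlap_ns must satisfy 0 <= overlap_ns < slice_ns")

    slices = defaultdict(list)
    # Sort by timestamp only, so payloads never need to be comparable.
    for ts, payload in sorted(messages, key=lambda m: m[0]):
        k = ts // slice_ns
        slices[k].append((ts, payload))
        # Duplicate boundary data into the previous slice's overlap tail.
        if k > 0 and ts - k * slice_ns < overlap_ns:
            slices[k - 1].append((ts, payload))
    return dict(slices)


example = [(5, "a"), (95, "b"), (102, "c"), (150, "d"), (205, "e")]
result = split_into_time_slices(example, slice_ns=100, overlap_ns=10)
for index in sorted(result):
    print(index, result[index])
```

Because each segment carries the boundary data it needs, a worker can run event recognition on it without fetching messages from another node, at the cost of storing the overlap twice.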

Time slice building, the continuous process of collecting the data of a time interval simultaneously from all detectors, places a high load on the network and requires careful scheduling and management. Using InfiniBand FDR hardware, this process has been demonstrated at rates of up to 6 GByte/s per node in a prototype system.
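The bookkeeping behind time slice building can be sketched in a few lines of Python. This is a simplified, single-process model under stated assumptions: the function name, the (input_id, slice_index, data) tuple layout, and the round-robin choice of destination node are hypothetical, and the sketch does not model the InfiniBand transfers or the scheduling used in the actual prototype.

```python
def build_time_slices(contributions, n_inputs, n_compute):
    """Assemble complete time slices from per-input contributions.

    contributions: iterable of (input_id, slice_index, data) tuples in
    arbitrary arrival order. A slice is emitted as soon as every one of
    the n_inputs input nodes has delivered its part. The destination is
    chosen round-robin (an assumption made for this sketch).
    """
    pending = {}  # slice_index -> {input_id: data}
    for input_id, slice_index, data in contributions:
        parts = pending.setdefault(slice_index, {})
        parts[input_id] = data
        if len(parts) == n_inputs:
            del pending[slice_index]
            compute_node = slice_index % n_compute
            ordered = [parts[i] for i in range(n_inputs)]
            yield compute_node, slice_index, ordered


arrivals = [(0, 0, "a0"), (1, 1, "b1"), (1, 0, "b0"), (0, 1, "a1")]
built = list(build_time_slices(arrivals, n_inputs=2, n_compute=2))
for node, index, parts in built:
    print(f"slice {index} -> compute node {node}: {parts}")
```

Every input node sends to every compute node over time, which is why the traffic pattern is demanding on the network and benefits from careful scheduling.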

The design of the event selector system is optimized for modern computer architectures. This includes minimizing copy operations of data in memory, using DMA/RDMA wherever possible, reducing data interdependencies, and employing large memory buffers to limit the critical network transaction rate. A fault-tolerant control system will ensure high availability of the event selector.
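The effect of large buffers on the network transaction rate follows from simple arithmetic: at a fixed data rate, the number of transfers per second is inversely proportional to the transfer size. The short Python sketch below uses the 6 GByte/s per-node figure quoted above; the transfer sizes are hypothetical values chosen only for illustration.

```python
def transactions_per_second(data_rate_bytes_per_s, transfer_size_bytes):
    """Return how many transfers per second are needed to sustain a data rate."""
    if transfer_size_bytes <= 0:
        raise ValueError("transfer_size_bytes must be positive")
    return data_rate_bytes_per_s / transfer_size_bytes


NODE_RATE = 6e9  # bytes per second, per-node rate from the prototype

# Hypothetical transfer sizes: 4 kByte, 1 MByte, 100 MByte.
for size in (4e3, 1e6, 1e8):
    rate = transactions_per_second(NODE_RATE, size)
    print(f"{size:>12.0f} bytes per transfer: {rate:>12.0f} transfers/s")
```

Moving from kilobyte-sized to megabyte-sized transfers reduces the transaction rate by three orders of magnitude at the same throughput, which is the motivation for buffering data in large blocks.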

This presentation will give an overview of the online event selection architecture of the upcoming CBM experiment and discuss the premises and benefits of the design. The presented material includes the latest results from performance studies on different prototype systems.

Primary Keyword (Mandatory): Trigger
Secondary Keyword (Optional): High performance computing
Tertiary Keyword (Optional): DAQ

Primary author

Jan de Cuveland (Johann-Wolfgang-Goethe Univ. (DE))

Co-author

Volker Lindenstruth (Johann-Wolfgang-Goethe Univ. (DE))
