Prototype Design of Timing and Fast Control in the CBM experiment

21 Sept 2021, 11:40
16m
Oral / Systems, Planning, Installation, Commissioning and Running Experience

Speaker

Vladimir Sidorenko (Karlsruhe Institute of Technology)

Description

The Compressed Baryonic Matter (CBM) experiment is designed to handle interaction rates of up to 10 MHz, generating up to 1 TB/s of raw data. With free-streaming data acquisition in the experiment and beam intensity fluctuations, occasional data bursts are expected to exceed the bandwidth capabilities of the Data Acquisition (DAQ) system. In order to preserve event data, the DAQ bandwidth must be throttled in an organized way with minimal information loss. The Timing and Fast Control (TFC) system provides a latency-optimized datapath for throttling commands and distributes a system clock together with a global timestamp.

Summary (500 words)

The Timing and Fast Control (TFC) system is being developed for the Compressed Baryonic Matter (CBM) experiment to orchestrate the data acquisition process and protect the Data Acquisition (DAQ) network from congestion. The TFC system has a dual purpose in the experiment: it distributes the experiment-wide system clock along with a 64-bit global timestamp, and it provides the datapath for DAQ bandwidth throttling. The need for a throttling mechanism arises from the expected beam intensity fluctuations that will occasionally drive data generation beyond the processing and buffering capabilities of the DAQ system.
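As a plain illustration of such a throttling decision, the sketch below models a buffer watermark check with hysteresis that asserts a throttle command when occupancy crosses a high threshold and releases it once buffers drain; the class, thresholds, and Python form are illustrative assumptions, not the TFC implementation.

    # Hypothetical watermark-based throttling sketch; not the actual TFC logic.
    # Throttling is asserted above a high watermark and released below a low
    # watermark, so the decision does not oscillate around a single threshold.
    class ThrottleMonitor:
        def __init__(self, high: float, low: float):
            assert low < high
            self.high, self.low = high, low
            self.throttled = False

        def update(self, occupancy: float) -> bool:
            """Return True while a throttle command should be asserted."""
            if not self.throttled and occupancy >= self.high:
                self.throttled = True   # burst detected: assert throttle
            elif self.throttled and occupancy <= self.low:
                self.throttled = False  # buffers drained: release throttle
            return self.throttled

    monitor = ThrottleMonitor(high=0.9, low=0.6)
    print([monitor.update(occ) for occ in (0.5, 0.95, 0.8, 0.55)])
    # [False, True, True, False]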
Experimental data in CBM is generated in radiation-hard Front-End Boards (FEB) and transmitted over e-links to GBTx ASICs. These ASICs, in turn, forward the collected data to Common Readout Interface (CRI) boards over optical GBT links at 4.8 Gb/s. Up to 200 CRI boards constitute the entry stage of the First-Level Event Selector (FLES) network, where event reconstruction takes place.
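A back-of-envelope check puts these numbers in context; the number of GBT links per CRI board below is a hypothetical parameter chosen for illustration, not a figure from this work.

    # Back-of-envelope aggregate link bandwidth with one assumed parameter.
    GBT_LINK_GBPS = 4.8   # line rate of one optical GBT link (from the text)
    NUM_CRI = 200         # maximum number of CRI boards (from the text)
    LINKS_PER_CRI = 24    # hypothetical links per CRI, illustration only

    aggregate_tbytes = GBT_LINK_GBPS * LINKS_PER_CRI * NUM_CRI / 8e3
    print(f"raw link capacity: {aggregate_tbytes:.1f} TB/s")  # 2.9 TB/s
    # Raw line rate must exceed the ~1 TB/s of expected data, since it also
    # carries protocol overhead and must absorb bursts.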
The number of CRI boards in the DAQ system defines the scalability requirement for the TFC system, which must serve up to 200 endpoints. In order to ensure scalability and a centralized architecture, the TFC network has a hierarchical topology with Master, Submaster and Endpoint nodes. The nodes in the TFC network are connected with 4.8 Gb/s bidirectional optical links, and the data flow is organized in short messages. Whereas the Master and Submaster nodes are based on dedicated FPGA boards, each Endpoint is an FPGA core integrated into the CRI firmware.
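To illustrate how a two-level hierarchy covers 200 endpoints, the sketch below computes the number of Submasters needed for a given link fan-out; the fan-out value is an assumed parameter, not a TFC specification.

    import math

    # Hypothetical two-level TFC tree: Master -> Submasters -> Endpoints.
    NUM_ENDPOINTS = 200   # scalability requirement from the text
    FANOUT = 16           # assumed downstream links per Master/Submaster

    submasters = math.ceil(NUM_ENDPOINTS / FANOUT)
    assert submasters <= FANOUT, "the Master must drive all Submasters"
    print(f"{submasters} Submasters serve {NUM_ENDPOINTS} Endpoints; "
          f"every Endpoint is two hops from the Master")
    # 13 Submasters serve 200 Endpoints; every Endpoint is two hops from the Master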
In the current prototype design of the TFC system, the 40 MHz system clock and the global timestamp are propagated from the Master to the Endpoints in a flooding fashion: when received on the “upstream” link, both the clock signal and the timestamp are reused locally by each Submaster and forwarded over each “downstream” link. Since the 40 MHz clock is extracted from the higher transport frequency, dedicated logic in each node eliminates the phase uncertainty introduced by clock division.
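The phase issue can be seen with a toy model: dividing an assumed 240 MHz transport clock by six yields a 40 MHz clock with six equally valid phases, so two nodes powering up independently may disagree until a shared marker resets their dividers. The frequencies and the marker mechanism below are illustrative assumptions, not the TFC design.

    DIV = 6  # assumed ratio, e.g. 240 MHz transport clock / 6 = 40 MHz

    def divided_edges(start_phase: int, n_fast_cycles: int):
        """Rising edges of the divided clock, numbered in fast-clock cycles;
        start_phase is the unknown power-up state of the divider."""
        return [c for c in range(n_fast_cycles) if c % DIV == start_phase % DIV]

    print(divided_edges(0, 24))  # [0, 6, 12, 18]
    print(divided_edges(4, 24))  # [4, 10, 16, 22] -> shifted by 4 fast cycles

    # Deterministic alignment: restart the divider on a marker received over
    # the upstream link, so every node divides from the same fast-clock edge.
    marker = 7
    print([c for c in range(24) if c >= marker and (c - marker) % DIV == 0])
    # [7, 13, 19] -> identical phase in every node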
Synchronization and fast control traffic will share the network while having different latency requirements, which drives the need for traffic prioritization. In TFC, this is handled by request-based access to the optical links, with request arbitration done by a priority encoder.
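A priority encoder grants the lowest-indexed asserted request, so placing the most latency-critical traffic at index 0 guarantees that it always wins arbitration. Below is a minimal behavioral model; the request ordering is an assumption consistent with the tighter latency budget of synchronization traffic, not a documented TFC assignment.

    from typing import Optional, Sequence

    def priority_encode(requests: Sequence[bool]) -> Optional[int]:
        """Grant the lowest-indexed (highest-priority) asserted request."""
        for i, req in enumerate(requests):
            if req:
                return i
        return None

    # Assumed request order: [sync, throttle command, slow control]
    print(priority_encode([False, True, True]))   # 1 -> throttle beats slow control
    print(priority_encode([True, True, False]))   # 0 -> sync always wins
    print(priority_encode([False, False, False])) # None -> link stays idle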
In this paper, a prototype of the TFC system will be presented. Its latency and synchronization properties are studied on the target BNL-712 FPGA platform and on a Xilinx UltraScale+ evaluation board.

Author

Vladimir Sidorenko (Karlsruhe Institute of Technology)

Co-authors

Steffen Baehr (Karlsruhe Institute of Technology)
Walter Mueller (Unknown)
Prof. Jürgen Becker (Karlsruhe Institute of Technology)
