Summary
With their high number of channels, high event rates and studies of
rare events, the experiments at the LHC place new demands on
data acquisition systems. A solution for a high-performance scalable
trigger processing farm based on commodity computing nodes
interconnected with a ring-based network implementing a 2-D torus
topology is proposed. The system is capable of processing small data
sets of less than 200 bytes at an input rate of more than 1 MHz.
Key system features are high transaction rate, low maximum latency
and scalability. Data transport is implemented completely in
hardware, thus excluding any CPU overhead.
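As an illustration of the 2-D torus topology mentioned above, the following minimal sketch computes a node's four neighbours and the hop count between two nodes. The (x, y) addressing, the 6 x 5 grid used in the usage note, and the function names are assumptions made for this example; the prototype's actual addressing and routing are not described here. Because the network is ring-based, the sketch also covers the case of unidirectional rings.

```python
def torus_neighbours(x, y, width, height):
    """Return the four neighbours of node (x, y), with wrap-around at the edges."""
    return [
        ((x + 1) % width, y),
        ((x - 1) % width, y),
        (x, (y + 1) % height),
        (x, (y - 1) % height),
    ]


def ring_distance(a, b, size, bidirectional=True):
    """Hops from position a to position b on a ring with `size` nodes."""
    forward = (b - a) % size
    if bidirectional:
        # Traffic may travel either way round the ring (an assumption).
        return min(forward, size - forward)
    # Unidirectional ring: traffic only moves "forward".
    return forward


def torus_hops(src, dst, width, height, bidirectional=True):
    """Hop count on a 2-D torus: one ring traversal per dimension."""
    return (ring_distance(src[0], dst[0], width, bidirectional)
            + ring_distance(src[1], dst[1], height, bidirectional))
```

On a hypothetical 6 x 5 grid, `torus_hops((0, 0), (5, 4), 6, 5)` gives 2 with bidirectional links and 9 with unidirectional rings, which shows how strongly the ring direction affects path length.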
The transfer of a single event's data consists of multiple remote memory
write transactions executed by distributed feeding nodes against one
and the same remote computing node.
Such a many-to-one system is prone to congestion at the receiver:
simultaneous reception of data from multiple sources at the
destination causes the receiver’s input queues to overflow and
generates retry traffic, which degrades system performance
considerably. In a 30-node system prototype with a basic flow-control
mechanism using static destination allocation, the maximum input data
rate was measured to be limited to about 1.3 MHz. A more sophisticated
data flow-control system has been realized, which led to a significant
performance increase.
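To make the many-to-one congestion concrete, the sketch below models one plausible form of static destination allocation: round-robin by event number. The rule itself, the function names, and the number of feeding nodes are assumptions for illustration; the text does not specify how the static allocation was done. The point is that every feeding node targets the same destination for a given event, and that events whose numbers differ by a multiple of the node count land on the same node if the earlier one has not yet been drained.

```python
from collections import Counter


def static_destination(event_id, n_nodes):
    """Round-robin static allocation (assumed rule, not taken from the source)."""
    return event_id % n_nodes


def writes_per_destination(event_ids, n_feeders, n_nodes):
    """Count outstanding remote writes per computing node.

    Each feeding node issues one write per event, all aimed at the
    same destination, so an event contributes n_feeders writes.
    """
    load = Counter()
    for event_id in event_ids:
        load[static_destination(event_id, n_nodes)] += n_feeders
    return load
```

With 30 computing nodes and 4 hypothetical feeding nodes, events 0 and 30 both map to node 0, so `writes_per_destination([0, 1, 30], 4, 30)` reports 8 outstanding writes on node 0 and 4 on node 1. A static rule cannot react to this kind of pile-up.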
The flow-control system relies mainly on two devices: a centralized
scheduling unit, which dynamically allocates free computing nodes so
that no congestion occurs in the system, and a custom scalable
point-to-point serial network that connects the scheduler to the
feeding nodes and is used to initiate the transfer of data to the
right destination at the right time.
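The dynamic allocation performed by the scheduling unit can be sketched in software as a queue of free computing nodes. This is only a behavioural model under simplifying assumptions: the real scheduler is a hardware design, and the class name, the one-event-per-node policy, and the explicit `release` call are hypothetical choices made for this example.

```python
from collections import deque


class Scheduler:
    """Toy model of a centralized scheduler that hands out free nodes."""

    def __init__(self, node_ids):
        self.free = deque(node_ids)   # nodes ready to receive an event
        self.busy = {}                # node id -> event id being received

    def allocate(self, event_id):
        """Return a free destination node, or None if all nodes are busy."""
        if not self.free:
            return None               # the event has to wait; nothing is sent
        node = self.free.popleft()
        self.busy[node] = event_id
        return node

    def release(self, node):
        """Mark a node as free again once it has finished its event."""
        del self.busy[node]
        self.free.append(node)


# Usage: three computing nodes, four events arriving back to back.
sched = Scheduler(range(3))
destinations = [sched.allocate(event_id) for event_id in range(4)]
sched.release(1)                      # node 1 finishes first
retry_destination = sched.allocate(3)
```

In this model the fourth event gets no destination until a node is released, so no node ever receives data for two events at once. Holding an event back at the source, rather than letting it collide at the receiver, is the idea behind congestion-free allocation.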
Because of the high input data rate (an event period on the order of a
microsecond), the scheduler has been designed completely in hardware.
Considering the requirements for performance and scalability, and the
small number of units needed, the scheduler has been implemented in a
large FPGA. Furthermore, the device has been realized as a PCI
expansion board so that it can be easily integrated into the system.
The same custom-made multipurpose PCI
card has been used to implement the DMA engines needed in the
feeding nodes. The card has found a number of other applications
throughout the system prototyping and test phases.
The communication channel between the scheduler and the transmitting
nodes is implemented using LVDS signaling over standard STP cables,
as this solution has been found most suitable for fast, cheap, simple
and reliable box-to-box communication.
The maximum input data rate in a flow-controlled system prototype
has been measured to be 2.14 MHz, and the number of retries in a
transmitting node transferring 128-byte packets at the maximum input
data rate has been measured to be zero.
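The two measured rates can be put side by side with a short calculation. The sketch below uses only the figures quoted in the text (about 1.3 MHz with static allocation, 2.14 MHz with flow control); the helper name and variable names are illustrative.

```python
def event_period_ns(rate_mhz):
    """Average time between events, in nanoseconds, for a rate in MHz."""
    return 1e3 / rate_mhz


static_rate_mhz = 1.3             # basic flow control, static allocation (approximate)
flow_controlled_rate_mhz = 2.14   # with the centralized scheduler

# Relative improvement in maximum input rate.
speedup = flow_controlled_rate_mhz / static_rate_mhz

# Time available per event at each rate.
static_period_ns = event_period_ns(static_rate_mhz)
flow_controlled_period_ns = event_period_ns(flow_controlled_rate_mhz)
```

This gives a rate improvement of roughly a factor of 1.65, and an average event period that drops from about 769 ns to about 467 ns. At the higher rate, one destination has to be assigned roughly every 467 ns on average, which is consistent with the choice of a hardware scheduler.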
Measurements in an intentionally congested system have shown
intensive retry traffic and odd behaviour. It has been demonstrated
that the maximum data rate in a properly flow-controlled, heavily
loaded system is limited by the transmitter’s maximum data throughput
rather than by congested receivers.