Assessment of the ALICE O2 readout servers

Filippo Costa (CERN)


The ALICE experiment at the CERN LHC (Large Hadron Collider) is undertaking a major upgrade during LHC Long Shutdown 2 in 2019-2020. After the upgrade, the raw data input from the detector will increase a hundredfold, up to 3.4 TB/s. To cope with such a large throughput, a new Online-Offline computing system, called O2, will be deployed.

The FLP (First Layer Processor) servers are the readout nodes hosting the CRU (Common Readout Unit) cards, which transfer the data from the detector links to the computer memory. The data then flows through a chain of software components until it is shipped over the network to the processing nodes.

To select a suitable platform for the FLP, it is essential that the hardware and the software are tested together. Each candidate server is therefore equipped with multiple readout cards (CRUs), one InfiniBand 100G Host Channel Adapter, and the O2 readout software suite. A series of tests is then run to ensure that the readout system is stable and fulfils the data throughput requirement of 42 Gb/s (the highest output data rate of an FLP equipped with 3 CRUs).
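The arithmetic behind this requirement can be sketched in a few lines of Python. This is only an illustration and not part of the O2 readout software: the function names are hypothetical, decimal units are assumed (1 Gb = 10^9 bits), and the per-CRU figure assumes the load is shared evenly between the 3 CRUs, which the text does not state.

```python
# Illustrative sketch only; names are hypothetical, not O2 readout APIs.

REQUIRED_GBPS = 42.0   # from the text: highest FLP output rate with 3 CRUs
LINK_GBPS = 100.0      # nominal rate of the InfiniBand 100G adapter
N_CRUS = 3             # number of CRUs in the reference configuration


def throughput_gbps(bytes_transferred: int, seconds: float) -> float:
    """Average throughput in Gb/s (assumption: 1 Gb = 1e9 bits)."""
    if seconds <= 0:
        raise ValueError("measurement interval must be positive")
    return bytes_transferred * 8 / seconds / 1e9


def meets_requirement(bytes_transferred: int, seconds: float,
                      required_gbps: float = REQUIRED_GBPS) -> bool:
    """True if the measured average rate reaches the required rate."""
    return throughput_gbps(bytes_transferred, seconds) >= required_gbps


# Derived figures (the even split across CRUs is an assumption).
per_cru_gbps = REQUIRED_GBPS / N_CRUS       # average rate per CRU
required_gbytes = REQUIRED_GBPS / 8         # same requirement in GB/s
link_fraction = REQUIRED_GBPS / LINK_GBPS   # share of the nominal link rate

if __name__ == "__main__":
    print(f"per CRU:       {per_cru_gbps:.2f} Gb/s")
    print(f"requirement:   {required_gbytes:.2f} GB/s")
    print(f"link fraction: {link_fraction:.0%}")
```

Under these assumptions the requirement corresponds to 5.25 GB/s written to memory and sent over the network, an average of 14 Gb/s per CRU, and 42% of the nominal rate of the 100G adapter.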

This paper presents the software and firmware features developed to evaluate and validate different candidates for the FLP servers. In particular, we describe the data flow from the data-generating CRU firmware up to the network card, where the buffers are sent over the network using RDMA. We also discuss the testing procedure and the results collected on different servers.

Primary authors

Filippo Costa (CERN); Sylvain Chapeland (CERN); Kostas Alexopoulos (Ministere des affaires etrangeres et europeennes (FR)); Ulrich Fuchs (CERN)

