Let's get our hands dirty: a comprehensive evaluation of DAQDB, key-value store for petascale hot storage.

Nov 5, 2019, 4:45 PM
Mr Grzegorz Jereczek (Intel Corporation)


Data acquisition (DAQ) systems are a key component for successful data taking in any experiment. The DAQ is a complex distributed computing system that coordinates all operations, from the selection of interesting events to their delivery to storage elements.
For the High Luminosity upgrade of the Large Hadron Collider (HL-LHC), the experiments at CERN need to meet challenging requirements to record data with a much higher occupancy in the detectors. The DAQ system will have to receive and deliver data at a significantly increased trigger rate (one million events per second) and capacity (terabytes of data per second).
An effective way to meet these requirements is to decouple real-time data acquisition from event selection. Data fragments can be temporarily stored in a large distributed key-value store. Fragments belonging to the same event can then be queried on demand by the data selection processes.
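The decoupling described above can be illustrated with a minimal sketch. It is a toy, single-process, in-memory stand-in and does not reflect the DAQDB API: the class `FragmentStore`, its methods `put`, `get_event`, and `remove`, and the composite key of event ID plus source ID are all hypothetical names and assumptions chosen for illustration only.

```python
from collections import defaultdict
from typing import Dict, Optional


class FragmentStore:
    """Toy in-memory stand-in for a distributed key-value store.

    Hypothetical illustration only; this is NOT the DAQDB API.
    """

    def __init__(self) -> None:
        # event_id -> {source_id -> raw fragment bytes}
        self._events: Dict[int, Dict[int, bytes]] = defaultdict(dict)

    def put(self, event_id: int, source_id: int, payload: bytes) -> None:
        """Readout side: store one fragment as soon as it arrives."""
        self._events[event_id][source_id] = payload

    def get_event(self, event_id: int) -> Optional[Dict[int, bytes]]:
        """Selection side: fetch all fragments of one event on demand."""
        fragments = self._events.get(event_id)
        return dict(fragments) if fragments else None

    def remove(self, event_id: int) -> None:
        """Free the buffer space once the event is accepted or rejected."""
        self._events.pop(event_id, None)


store = FragmentStore()

# Readout side: fragments are written at acquisition speed,
# independently of any selection process.
for event_id in range(3):
    for source_id in range(2):
        store.put(event_id, source_id, f"ev{event_id}-src{source_id}".encode())

# Selection side: a process later asks for the fragments of one event.
fragments = store.get_event(1)
```

In this sketch the writers never wait for the readers, which is the point of the decoupling: the store absorbs the difference between the acquisition rate and the selection rate until each event is either kept or removed.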
Implementing such a model relies on a proper combination of emerging technologies, such as persistent memory, NVMe SSDs, scalable networking, and data structures, as well as high-performance, scalable software.
In this paper, we present DAQDB, an open-source implementation of this previously presented design, together with an extensive evaluation of the approach, from single-node to distributed performance. We complement the study with an in-depth comparison with state-of-the-art solutions and a description of the challenges faced and the lessons learned.

Danilo Cicalese (CERN); Fabrice Le Goff (CERN); Giovanna Lehmann Miotto (CERN); Mr Grzegorz Jereczek (Intel Corporation); Jakub Radtke (Intel); Mr Jakub Schmiegel (Intel); Jeremy Robert Love (Argonne National Laboratory (US)); Maciej Maciejewski; Ms Malgorzata Szychowska (Intel); Remi Mommsen (Fermi National Accelerator Lab. (US))

