NaNet: a Reconfigurable PCIe Network Interface Card Architecture for Real-time Distributed Heterogeneous Stream Processing in the NA62 Low Level Trigger.

18 Sept 2018, 14:25
25m
CAR 1.09 (aula)

Oral Trigger Trigger

Speaker

Paolo Cretaro (INFN - National Institute for Nuclear Physics)

Description

The NA62 experiment at the CERN SPS aims to measure the branching ratio of the very rare kaon decay K+ -> pi+ nu nubar.
NaNet is the reconfigurable design of an FPGA-based PCI Express Network Interface Card with processing, RDMA and GPUDirect capabilities and support for multiple link technologies.
NaNet has been employed to implement a real-time distributed processing pipeline in the low level trigger of the experiment, operating on the data streams produced by the RICH detector with an orchestrated combination of heterogeneous computing devices (CPUs, FPGAs and GPUs).
Recent results collected during NA62 runs are presented and discussed.

Summary

The NA62 experiment at the CERN SPS aims to measure the branching ratio of the very rare kaon decay K+ -> pi+ nu nubar.
A centralized level 0 hardware trigger system (L0TP) processes, in real time, streams of primitives coming from the detector readout boards in order to reject the considerable background.
Our approach aims at improving the L0TP performance by distributing this processing over the whole chain, starting from the earliest stages (i.e. the readout boards), and operating on the data streams with an orchestrated combination of heterogeneous computing devices (CPUs, FPGAs and GPUs).
The enabling element of this real-time distributed stream computing architecture is NaNet, an FPGA-based PCI Express Network Interface Card with processing, RDMA and GPUDirect capabilities, supporting multiple link technologies (1/10/40GbE and custom ones).
We have demonstrated the effectiveness of our design by retrofitting the RICH detector to compute the Cherenkov ring parameters within 350 µs: the FPGA receives the data and coalesces events split across 4 data streams, while the GPU is in charge of the ring reconstruction, executing a parallel algorithm based on geometric considerations (Histogram).
Driven by the fact that FPGA architectures show a significant ramp-up in their computing resources, along with the recent availability of frameworks for high-level synthesis, we evaluated the Intel FPGA SDK for OpenCL environment to assess the feasibility of ring reconstruction directly on the FPGA hosted on the NaNet card.
The Histogram algorithm has been ported to this framework and optimized to make the most of the pipelined architecture.
The results obtained are discussed and compared with the GPU implementation.
Recent results collected during NA62 runs, a detailed description of the latest developments in the NaNet architecture, and an outlook on future project developments are also presented and discussed.
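
The summary describes the Histogram ring reconstruction only as "a parallel algorithm based on geometric considerations". As a rough illustration, the sketch below assumes one common form of such a method: candidate ring centres are taken on a regular grid, the distances of all hits from each candidate are histogrammed, and the most populated distance bin gives the ring radius. The function name `find_ring`, its parameters, and the grid and bin choices are hypothetical and are not taken from the NaNet implementation; on a GPU or FPGA each candidate centre would be processed in parallel, whereas here the loop is sequential for clarity.

```python
import math
from collections import Counter


def find_ring(hits, grid_min, grid_max, grid_step, bin_width):
    """Return (votes, cx, cy, radius) for the best single-ring candidate.

    hits      -- list of (x, y) hit coordinates
    grid_*    -- extent and spacing of the square grid of candidate centres
    bin_width -- width of the distance histogram bins (assumed parameter)
    """
    best = (0, None, None, None)
    if not hits:
        return best
    steps = int(round((grid_max - grid_min) / grid_step)) + 1
    for i in range(steps):
        cx = grid_min + i * grid_step
        for j in range(steps):
            cy = grid_min + j * grid_step
            # Histogram of hit distances from this candidate centre.
            bins = Counter(
                int(math.hypot(x - cx, y - cy) / bin_width) for x, y in hits
            )
            bin_index, votes = bins.most_common(1)[0]
            # Keep the candidate whose fullest bin holds the most hits;
            # the radius estimate is the centre of that bin.
            if votes > best[0]:
                best = (votes, cx, cy, (bin_index + 0.5) * bin_width)
    return best
```

For example, calling `find_ring(hits, -100.0, 100.0, 10.0, 5.0)` on hits lying on a single circle whose centre coincides with a grid point should select that grid point, with a radius accurate to within half a bin width. A real trigger implementation would additionally need a vote threshold and multi-ring handling, which this sketch omits.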

Primary authors

Paolo Cretaro (INFN - National Institute for Nuclear Physics)
Alessandro Lonardo (Sapienza Universita e INFN, Roma I (IT))
Andrea Biagioni (INFN)
Luca Pontisso (Sapienza Universita e INFN, Roma I (IT))
Gianluca Lamanna (INFN e Laboratori Nazionali di Frascati (IT))
Roberto Piandani (INFN Sezione di Pisa, Universita' e Scuola Normale Superiore, P)
Michele Martinelli (INFN)
Piero Vicini (INFN Rome Section)
Mrs Francesca Lo Cicero (Istituto Nazionale di Fisica Nucleare)
Dario Soldi (Universita e INFN Torino (IT))

Co-authors

Pier Stanislao Paolucci (Istituto Nazionale di Fisica Nucleare)
Elena Pastorelli (Istituto Nazionale di Fisica Nucleare)
Marco Sozzi (INFN Sezione di Pisa, Universita' e Scuola Normale Superiore, P)
