Summary
The NA62 particle physics experiment at the CERN SPS aims to measure the ultra-rare kaon decay $K^{+} \rightarrow \pi^{+} \nu\overline{\nu}$ as a highly sensitive test of the Standard Model and a search for New Physics.
A multi-level trigger is designed to cope with the high event rate required by the experiment.
The lowest level (L0) trigger represents an essential element because it must handle an input event rate of the order of 10 MHz and apply a rejection factor of 10, with a maximum latency of 1 ms.
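To make these L0 figures concrete, the short Python sketch below works out what they imply. It takes the approximate values quoted above (10 MHz input, rejection factor 10, 1 ms maximum latency) as exact for the sake of the arithmetic; the function and field names are illustrative and not part of the NA62 software.

```python
def l0_budget(input_rate_hz, rejection_factor, max_latency_us):
    """Derive simple throughput figures from the L0 trigger requirements.

    All arguments are integers; the values used below are the approximate
    ones quoted in the text, treated as exact for illustration only.
    """
    # Rate of events passed on to the next trigger level.
    output_rate_hz = input_rate_hz // rejection_factor
    # Mean spacing between consecutive input events, in nanoseconds.
    mean_spacing_ns = 1_000_000_000 // input_rate_hz
    # Events that arrive while a single decision may still be pending,
    # i.e. the buffering depth implied by the maximum latency.
    events_in_flight = input_rate_hz * max_latency_us // 1_000_000
    return {
        "output_rate_hz": output_rate_hz,
        "mean_spacing_ns": mean_spacing_ns,
        "events_in_flight": events_in_flight,
    }


budget = l0_budget(input_rate_hz=10_000_000, rejection_factor=10, max_latency_us=1000)
for name, value in budget.items():
    print(f"{name}: {value}")
```

Under these assumptions, events arrive on average every 100 ns, roughly 10,000 of them can arrive within one latency window, and about 1 MHz is passed on to the next level. This is why both sustained throughput and a bounded, stable latency matter at this level.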
In the standard implementation of the L0 trigger, the trigger primitives contributing to the final trigger decision are computed on FPGA devices and are mostly based on event multiplicity and topology.
The approach presented here aims at exploiting the parallel computing power of a commercial GPU (Graphics Processing Unit) to perform real-time software-based computations in the L0 trigger for the NA62 experiment.
Using a GPU at this level would allow more refined, physics-related trigger primitives to be built, such as the energy or direction of the final-state particles in the detectors, leading to a net improvement in trigger conditions and data handling.
GPU architectures have been designed to optimize computing throughput with no particular attention to their usage in real-time contexts, such as the one we are considering here.
While execution times are rather stable on these architectures, data transfer tasks must also be taken into account: assessing the real-time features of the whole system requires a careful characterization of all subsystems along the data stream path, from the detectors to the GPU memories.
We identified the standard network subsystem as the main source of fluctuations in the total system latency. To address this problem, we designed and implemented two generations of FPGA-based Network Interface Cards, NaNet-1 and NaNet-10, supporting 1 GbE and 10 GbE I/O channels, respectively.
To achieve a low and stable communication latency, the NaNet design combines support for GPUDirect, i.e. direct data transport between the GPU memory and the external I/O channels, with a network protocol offloading module implemented in the FPGA logic.
A GPU-based L0 trigger using NaNet is currently integrated in the experimental setup of the RICH Cherenkov detector of the NA62 experiment in order to reconstruct the ring-shaped hit patterns, and results obtained with this system will be reported and discussed.
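As an illustration of what reconstructing a ring-shaped hit pattern involves, the following Python sketch fits a single circle to a set of hit coordinates with an algebraic least-squares fit. This is an assumption chosen for clarity: the abstract does not state which ring-finding algorithm runs on the GPU, and the function name `fit_ring` and the hit coordinates are hypothetical. In a GPU implementation, many such fits would be evaluated in parallel, one per event or per ring candidate.

```python
import math


def fit_ring(hits):
    """Algebraic least-squares circle fit (illustrative, not the NA62 code).

    hits: list of (x, y) hit positions. Returns (center_x, center_y, radius).
    """
    n = len(hits)
    if n < 3:
        raise ValueError("at least three hits are needed to define a ring")

    # Work in coordinates centred on the mean hit position.
    x_mean = sum(x for x, _ in hits) / n
    y_mean = sum(y for _, y in hits) / n

    suu = svv = suv = suuu = svvv = suvv = svuu = 0.0
    for x, y in hits:
        u, v = x - x_mean, y - y_mean
        suu += u * u
        svv += v * v
        suv += u * v
        suuu += u * u * u
        svvv += v * v * v
        suvv += u * v * v
        svuu += v * u * u

    # Solve the 2x2 normal equations for the centre (Cramer's rule).
    det = suu * svv - suv * suv
    if abs(det) < 1e-12:
        raise ValueError("hits are (nearly) collinear; no unique ring")
    rhs_u = 0.5 * (suuu + suvv)
    rhs_v = 0.5 * (svvv + svuu)
    uc = (rhs_u * svv - rhs_v * suv) / det
    vc = (rhs_v * suu - rhs_u * suv) / det

    radius = math.sqrt(uc * uc + vc * vc + (suu + svv) / n)
    return x_mean + uc, y_mean + vc, radius


# Hypothetical hits lying on a ring of radius 11 centred at (3, -2).
angles = (0.1, 0.9, 2.0, 3.3, 4.4, 5.5)
hits = [(3 + 11 * math.cos(t), -2 + 11 * math.sin(t)) for t in angles]
center_x, center_y, radius = fit_ring(hits)
print(f"center=({center_x:.3f}, {center_y:.3f}) radius={radius:.3f}")
```

A non-iterative fit of this kind has a fixed operation count per ring, which suits a latency-bounded trigger. It does not handle noise hits or multiple overlapping rings, which a real trigger algorithm would need to address.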
This work is part of a broader project, GAP (GPU Application Project), concerning the use of GPUs for advanced scientific computation in real-time applications.