Speaker
Description
Our work aims at improving the performance of the NA62 low-level trigger by implementing a real-time stream processing architecture based on an orchestrated combination of heterogeneous computing devices (CPUs, FPGAs and GPUs).
To enable this we devised NaNet, an FPGA-based PCI Express Network Interface Card with processing and GPUDirect capabilities, which supports multiple link technologies (1/10/40GbE and custom ones).
We have demonstrated the effectiveness of the method by retrofitting the RICH detector to generate refined physics-related primitives.
Results obtained during the first months of the 2017 run are presented and discussed, along with a description of the latest developments in the NaNet architecture.
Summary
Over the last few years, several works have demonstrated the effectiveness of integrating GPU-based systems in the high-level triggers of different experiments.
On the other hand, the use of GPUs in low-level trigger systems, which are characterized by stringent real-time constraints such as a tight time budget and high throughput, poses several challenges.
In the NA62 experiment at CERN, streams of raw data primitives produced in the detectors are transmitted to a centralized processing system that is in charge of generating the low-level trigger signal within a 1 ms time budget.
Our approach aims at improving the low-level trigger performance by distributing this processing over the whole chain, starting from the earliest possible stages, i.e. the detectors, and by operating in real time on the data streams with an orchestrated combination of heterogeneous computing devices (CPUs, FPGAs and GPUs).
To enable such a distributed real-time computing architecture we devised NaNet, an FPGA-based PCI Express Network Interface Card with processing and GPUDirect capabilities, which supports multiple link technologies (1/10/40GbE and custom ones).
We have demonstrated the effectiveness of the method by harnessing the computing power of latest-generation NVIDIA Pascal GPUs and of the FPGA hosted by NaNet to build, in real time, refined physics-related primitives for the RICH detector, as knowledge of the Cerenkov ring parameters allows more stringent data selection conditions to be built at the low trigger level (L0). The refined RICH primitives can also be profitably employed in the software trigger levels.
In the standard configuration, the online PC farm, which processes the events in two steps (L1 and L2) to decide whether they are interesting enough for permanent storage, does not build rings from the RICH raw data, due to limitations in computing power. We are working to send this information to the PC farm as well for the required events, at a rate under 100 kHz and with a latency small enough to fulfil all the requests before the start of the next burst (of the order of 5 s).
We believe that this will make a non-negligible contribution to the total rejection and that the physics potential of the experiment will benefit from it. Results obtained during the first months of the 2017 NA62 run are presented and discussed, along with a detailed description of the latest developments in the NaNet architecture.
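
As an illustration of what extracting ring parameters from RICH hits involves, the sketch below fits a single circle (centre and radius) to a set of hit coordinates using an algebraic least-squares (Kasa) fit. This is an assumption chosen for clarity: the abstract does not state which ring-reconstruction algorithm runs on the GPU, and the function names (`fit_ring`, `solve3`, `det3`) and the hit format (a list of `(x, y)` pairs) are hypothetical.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(m, rhs):
    """Solve the 3x3 linear system m * p = rhs with Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("degenerate hit pattern (e.g. collinear hits)")
    solution = []
    for col in range(3):
        mc = [row[:] for row in m]          # copy, then swap in rhs column
        for i in range(3):
            mc[i][col] = rhs[i]
        solution.append(det3(mc) / d)
    return solution


def fit_ring(hits):
    """Algebraic (Kasa) circle fit; illustrative only, not the NA62 code.

    Minimises sum (x^2 + y^2 + D*x + E*y + F)^2 over the hits and
    returns (centre_x, centre_y, radius).
    """
    if len(hits) < 3:
        raise ValueError("need at least three hits")
    sxx = sxy = syy = sx = sy = 0.0
    bx = by = b1 = 0.0
    for x, y in hits:
        z = x * x + y * y
        sxx += x * x
        sxy += x * y
        syy += y * y
        sx += x
        sy += y
        bx -= x * z
        by -= y * z
        b1 -= z
    # Normal equations of the linear least-squares problem in (D, E, F).
    m = [[sxx, sxy, sx],
         [sxy, syy, sy],
         [sx, sy, float(len(hits))]]
    d, e, f = solve3(m, [bx, by, b1])
    cx, cy = -d / 2.0, -e / 2.0
    radius = math.sqrt(cx * cx + cy * cy - f)
    return cx, cy, radius
```

An algebraic fit of this kind needs only sums over the hits followed by a small linear solve, which is one reason such non-iterative methods are attractive under a tight latency budget; a real trigger implementation would also have to handle noise hits and events with several rings.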