26–30 Sept 2016
Karlsruhe Institute of Technology (KIT)
Europe/Zurich timezone

ProtoPRM: An FPGA-Based High Performance Associative Memory Pattern Recognition Mezzanine

29 Sept 2016, 12:00
25m
Redtenbacher Lecture Hall (Building 10.91)

Oral, Trigger

Speaker

Jamieson Olsen (Fermi National Accelerator Lab. (US))

Description

Pattern recognition associative memory (PRAM) devices are parallel processing engines used to tackle the complex combinatorics of track-finding algorithms, particularly for silicon-based tracking triggers. In this talk we present our latest PRAM-based pattern recognition mezzanine card design, which supports both ASIC- and FPGA-based PRAMs, and describe how the PRAM interface and FPGA firmware modules work together to implement a high-performance, fully pipelined, low-latency track-finding engine. This work is part of the overall program of generic R&D on Level-1 silicon-based tracking triggers for the high-luminosity LHC.

Summary

PRAM development has mostly been confined to ASICs, where design cycles are often lengthy and expensive. Field Programmable Gate Arrays (FPGAs), however, allow quick iterations and low-cost design cycles, making them an ideal hardware platform for designing and evaluating new PRAM features before committing them to silicon. At the functional level, our ASIC- and FPGA-based PRAM designs match closely. Input data is divided into six detector layers, and the coarse information from the hits in these layers is written to the PRAM device, which is programmed to look for matching patterns. The addresses of the matched patterns are then read out of the PRAM sequentially. Both the ASIC and FPGA PRAM designs are fully pipelined and allow concurrent read-in (of the current event) and read-out (of the previous event). Because the two operations overlap, this two-stage pipelined architecture offers important performance benefits.
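The matching step described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the ASIC or FPGA design: the class name `PramModel`, the `(layer, ssid)` hit format, and the `min_layers` threshold are hypothetical, and the hardware compares all stored patterns in parallel, whereas this model uses a lookup table to reach the same result.

```python
from typing import Dict, List, Sequence, Set, Tuple

NUM_LAYERS = 6  # six detector layers, as stated in the abstract

Pattern = Tuple[int, ...]   # one coarse hit value per layer (hypothetical format)
Hit = Tuple[int, int]       # (layer index, coarse hit value), also hypothetical


class PramModel:
    """Toy associative-memory model; illustrative only."""

    def __init__(self, patterns: Sequence[Pattern]) -> None:
        for p in patterns:
            if len(p) != NUM_LAYERS:
                raise ValueError("each pattern needs one value per layer")
        self.patterns = list(patterns)
        # Lookup table: (layer, value) -> addresses of patterns holding that value.
        # Hardware would instead compare every pattern at once.
        self.index: Dict[Hit, List[int]] = {}
        for addr, p in enumerate(self.patterns):
            for layer, value in enumerate(p):
                self.index.setdefault((layer, value), []).append(addr)

    def match(self, hits: Sequence[Hit], min_layers: int = NUM_LAYERS) -> List[int]:
        """Return addresses of patterns matched on at least `min_layers` layers.

        `min_layers` is an assumption of this sketch; the default requires
        all six layers to match. Addresses come back in ascending order,
        mimicking a sequential readout.
        """
        matched: Dict[int, Set[int]] = {}
        for layer, value in hits:
            for addr in self.index.get((layer, value), ()):
                matched.setdefault(addr, set()).add(layer)
        return sorted(a for a, layers in matched.items() if len(layers) >= min_layers)
```

For example, with patterns `(1, 2, 3, 4, 5, 6)` and `(1, 2, 3, 4, 5, 7)` stored at addresses 0 and 1, an event with hits `1..6` on layers 0 to 5 matches only address 0, while adding a hit `(5, 7)` matches both.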

It is in the implementation details, however, that the differences between the ASIC and FPGA PRAM designs become apparent. In our experience, logic blocks that have been optimized for fine-grain ASIC architectures do not map efficiently onto coarse-grain FPGA logic cells without significant redesign. In particular, the pattern storage elements and backend sorting logic were completely redesigned to fit efficiently into Kintex UltraScale FPGAs while still retaining cycle-accurate emulation of our PRAM ASIC design.

Our prototype Pattern Recognition Mezzanine (protoPRM) board is a high-performance track-finding engine implemented using two Kintex UltraScale FPGAs. On this board the slave FPGA emulates the PRAM, as described above. The master FPGA formats and stores the input data, which is then used in conjunction with the PRAM output to find and fit tracks. First, incoming detector layer hits (called stubs) are remapped from a local to a global coordinate system. From these global stubs, coarse-resolution super-strips (SSIDs) are generated and sent over a local bus (8 MGT lanes at up to 16.3 Gbps each) to the slave FPGA or PRAM, which outputs found patterns (called roads). While the PRAM is processing SSIDs, the full-resolution stubs are stored in a database called the data organizer. The data organizer is a new design which has been optimized for UltraScale BlockRAMs and completely eliminates intermediate address pointer structures. The result is a high-performance, low-latency, fully pipelined data structure which presents a FIFO-like write interface and reads out like RAM. Stubs recalled from the data organizer are thus organized in terms of “hits of interest”, or roads, and are then sent to the downstream stage for track fitting.
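The data flow through the master FPGA can be sketched in software as follows. This is a behavioural illustration under assumptions, not the firmware: the stub format `(layer, global strip number)`, the super-strip width `SS_WIDTH`, and the names `to_ssid` and `DataOrganizer` are all hypothetical, stubs are taken to be already in global coordinates, and a Python dictionary stands in for the BlockRAM-based storage.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

SS_WIDTH = 32  # hypothetical super-strip width, in strips

Stub = Tuple[int, int]  # (layer, global strip number); hypothetical format


def to_ssid(stub: Stub) -> int:
    """Reduce a full-resolution stub to its coarse super-strip ID."""
    _layer, strip = stub
    return strip // SS_WIDTH


class DataOrganizer:
    """Toy model: stream-style writes, random-access reads by (layer, SSID)."""

    def __init__(self) -> None:
        self._bins: Dict[Tuple[int, int], List[Stub]] = defaultdict(list)

    def write(self, stub: Stub) -> None:
        """FIFO-like write: stubs are pushed in arrival order, no address given."""
        layer, _strip = stub
        self._bins[(layer, to_ssid(stub))].append(stub)

    def read(self, layer: int, ssid: int) -> List[Stub]:
        """RAM-like read: fetch every stored stub for one (layer, SSID) address."""
        return list(self._bins.get((layer, ssid), []))

    def stubs_for_road(self, road: Sequence[int]) -> List[List[Stub]]:
        """Given a road (one SSID per layer), return the stubs on each layer."""
        return [self.read(layer, ssid) for layer, ssid in enumerate(road)]
```

In this model, writing stubs `(0, 5)`, `(0, 40)`, `(1, 33)` and `(0, 7)` and then recalling the road `(0, 1)` returns the two layer-0 stubs in super-strip 0 and the single layer-1 stub in super-strip 1, which is the grouping a downstream track fitter would consume.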

Firmware modules throughout the protoPRM have been significantly redesigned to reduce latency and support the pipelined PRAM interface. The protoPRM board will be used in our L1 tracking trigger demonstration system not only to provide the needed “proof of principle” demonstration, but also to pin down the PRAM interface performance specifications which will guide future PRAM ASIC designs.

Primary authors

Jamieson Olsen (Fermi National Accelerator Lab. (US)) Jinyuan Wu (Fermi National Accelerator Lab. (US)) Tiehui Ted Liu (Fermi National Accelerator Lab. (US)) Zhen Hu (Fermi National Accelerator Lab. (US)) Zijun Xu (Peking University (CN))
