Speaker
Andrew Harmon Reis
(Southern Methodist University (US))
Description
Convolutional neural networks (CNNs) are used in applications ranging from self-driving cars to particle physics. To greatly reduce inference latency, CNNs and other deep learning architectures can be deployed on Field Programmable Gate Arrays (FPGAs). The open-source package HLS4ML is used to perform model conversion and RTL synthesis. The work presented here describes methods for further optimizing the generated Verilog/VHDL, yielding additional latency reductions and smaller hardware resource requirements.
Author
Andrew Harmon Reis
(Southern Methodist University (US))