November 30, 2020 to December 3, 2020
Southern Methodist University
America/Chicago timezone

Large and compressed Convolutional Neural Networks on FPGAs with hls4ml

Nov 30, 2020, 2:58 PM
Southern Methodist University



Vladimir Loncar (CERN)


We present ultra-low-latency Deep Neural Networks with large convolutional layers on FPGAs using the hls4ml library. Taking benchmark models trained on public datasets, we discuss various options to reduce the model size and, consequently, the FPGA resource consumption: pruning, quantization to fixed precision, and extreme quantization down to binary or ternary precision. We demonstrate how inference latencies of O(10) microseconds can be obtained while high accuracy is maintained.
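
As a rough illustration of the three compression options named in the abstract, the following pure-Python sketch applies magnitude pruning, fixed-point quantization, and ternary quantization to a small list of weights. It is not the hls4ml implementation: the function names, the rounding and saturation behaviour, and the ternary threshold are all assumptions chosen for clarity.

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of the weights."""
    n_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def quantize_fixed(x, total_bits, int_bits):
    """Map x to a signed fixed-point value with `total_bits` bits in total,
    of which `int_bits` are integer bits (sign included).

    Assumption: round to nearest (Python's round) and saturate on overflow.
    Real HLS fixed-point types may truncate or wrap instead.
    """
    frac_bits = total_bits - int_bits
    scale = 2 ** frac_bits
    lowest = -(2 ** (total_bits - 1))
    highest = 2 ** (total_bits - 1) - 1
    q = max(lowest, min(highest, round(x * scale)))
    return q / scale


def ternarize(weights, threshold):
    """Map each weight to -1, 0, or +1 (hypothetical fixed threshold)."""
    return [0 if abs(w) <= threshold else (1 if w > 0 else -1) for w in weights]


weights = [0.5, -0.1, 0.05, -0.8]
pruned = prune_by_magnitude(weights, sparsity=0.5)
fixed = [quantize_fixed(w, total_bits=8, int_bits=2) for w in weights]
ternary = ternarize(weights, threshold=0.2)
```

Each step shrinks what the FPGA has to store or compute: pruned weights can be skipped entirely, fewer bits per weight mean narrower multipliers, and ternary weights reduce multiplication to sign selection.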
