20–23 May 2025
CERN
Europe/Zurich timezone
Part of the talk schedule has been published. The timetable is still **preliminary**; times are subject to change.

Distributed Arithmetic for Real-time Neural Networks on FPGAs

21 May 2025, 11:20
30m
500/1-001 - Main Auditorium (CERN)

Algorithm implementation in HDL and HLS

Speaker

Chang Sun (California Institute of Technology (US))

Description

Neural networks with latency requirements on the order of $\mu$s, like those used at the CERN Large Hadron Collider, are typically deployed on FPGAs fully unrolled. A bottleneck for the deployment of such neural networks is area utilization, which is directly related to the constant matrix-vector multiplications (CMVMs) performed in the networks. In this work, we implement an algorithm that optimizes the area consumption of such neural networks on FPGAs by performing the CMVMs with distributed arithmetic (DA), and we integrate it with the hls4ml library, a FOSS library for running real-time neural network inference on FPGAs. The optimized resource usage and latency are compared with those of the original hls4ml implementation on different networks. The results show that the proposed optimization can reduce on-chip resource usage by up to half on realistic quantized neural networks, while reducing latency by up to 40%, all while maintaining bit-accurate output values.
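The core move in DA-based CMVM is replacing each constant multiplier with shifts and adds, which map to LUT fabric rather than DSP blocks when the design is fully unrolled. A minimal illustrative sketch of this idea in Python follows; it uses a canonical-signed-digit (CSD) decomposition and is a hypothetical model for intuition, not the actual da4ml/hls4ml implementation (which additionally shares common subexpressions across matrix rows):

```python
# Sketch: compute y = C @ x for a constant integer matrix C using only
# shifts and adds. Each constant is decomposed into canonical signed
# digits (CSD), so every nonzero digit corresponds to one adder in hardware.
# Illustrative only -- not the da4ml/hls4ml code.

def to_csd(c: int):
    """Decompose integer c into (sign, shift) terms with c == sum(sign * 2**shift)."""
    if c < 0:
        return [(-s, k) for s, k in to_csd(-c)]
    terms, k = [], 0
    while c:
        if c & 1:
            if (c & 3) == 3:          # trailing ...11: emit -1 and carry, keeping digits sparse
                terms.append((-1, k))
                c += 1
            else:
                terms.append((1, k))
                c -= 1
        c >>= 1
        k += 1
    return terms

def cmvm(C, x):
    """Constant matrix-vector product built from shifts and adds only."""
    y = []
    for row in C:
        acc = 0
        for c, xj in zip(row, x):
            for sign, shift in to_csd(c):
                acc += sign * (xj << shift)   # one adder per nonzero CSD digit
        y.append(acc)
    return y
```

For example, `cmvm([[3, -5], [7, 2]], [4, 3])` returns `[-3, 34]`, matching the ordinary matrix product. The hardware cost is governed by the total number of nonzero CSD digits, which DA-style optimizers reduce further by reusing shared subexpressions across output rows.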

Talk's Q&A: During the talk
Talk duration: 20'+10'
Will you be able to present in person? Yes

Author

Chang Sun (California Institute of Technology (US))

Presentation materials

There are no materials yet.