25–28 Sept 2023
Imperial College London
Europe/London timezone

Efficient sparse matrix multiplication in hls4ml

Not scheduled
5m
Blackett Laboratory, Lecture Theatre 1 (Imperial College London)
Lightning Talk, Contributed Talks

Speaker

Duc Minh Hoang (MIT)

Description

Pruning improves the hardware efficiency of neural networks by zeroing out low-magnitude weights. To take full advantage of pruning, efficient implementations of sparse matrix multiplication are required. The current hls4ml implementations of sparse matrix multiplication rely either on the built-in zero-suppression operations of the high-level synthesis tools or on a coordinate-list representation, both of which face scalability issues as model size and reuse factor grow. These implementations, particularly the coordinate-list representation, are limited by the large fan-out they require within an FPGA or ASIC to remain fully flexible. We introduce a new implementation that preserves coordinate information but, by routing through a crossbar, avoids the large dedicated logic that fan-out otherwise demands. We present results for FPGA implementations, scanning model sparsity and initiation interval for multiple benchmark models from the MLPerf Inference Benchmark for anomaly detection and image classification.
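For context, the minimal sketch below illustrates a coordinate-list (COO) sparse matrix-vector product, the representation the abstract refers to. The names CooEntry and coo_matvec are hypothetical, and this is a plain C++ simplification rather than hls4ml's actual HLS kernel.

#include <cstddef>

// Hypothetical coordinate-list (COO) entry: each stored nonzero
// carries explicit output (row) and input (col) indices.
struct CooEntry {
    unsigned row;   // output index
    unsigned col;   // input index
    float    value; // nonzero weight
};

// Computes y[row] += value * x[col] for every stored nonzero. In
// hardware, each entry feeds a multiplier, and the accumulation into
// y is where the fan-out pressure arises: any entry may target any
// output, so a fully flexible parallel implementation needs routing
// from every multiplier to every accumulator.
void coo_matvec(const CooEntry* entries, std::size_t nnz,
                const float* x, float* y, std::size_t n_out) {
    for (std::size_t i = 0; i < n_out; ++i) y[i] = 0.0f;
    for (std::size_t i = 0; i < nnz; ++i) {
        y[entries[i].row] += entries[i].value * x[entries[i].col];
    }
}

The crossbar approach described above replaces this all-to-all routing with a shared switching fabric, so coordinate information is preserved without dedicating fan-out logic to every multiplier-accumulator pair.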

Author

Duc Minh Hoang (MIT)

Co-author

Philip Coleman Harris (Massachusetts Inst. of Technology (US))

Presentation materials

There are no materials yet.