Description
Deploying Machine Learning (ML) models on Field-Programmable Gate Arrays (FPGAs) is becoming increasingly popular across various domains as a low-latency and low-power solution that helps manage the large data rates generated by continuously improving detectors. However, developing ML models for FPGA deployment is often hindered by the time-consuming synthesis procedure required to evaluate resource usage and latency. In particular, synthesis can fail outright depending on the ML architecture, especially when the required resources exceed the capacity of the target FPGA, making the development process slow and repetitive.
To accelerate this development, we introduce rule4ml, an open-source tool designed to predict the resource utilization and inference latency of Neural Networks (NNs) before their synthesis and implementation on FPGAs. We leverage hls4ml, a framework that translates NNs into high-level synthesis (HLS) code, to synthesize a diverse dataset of NN architectures and train resource utilization and inference latency predictors. Whereas hls4ml requires full synthesis to obtain resource and latency insights, our method uses trained regression models for immediate pre-synthesis predictions. The prediction models estimate key FPGA metrics, including the usage of Block RAM (BRAM), Digital Signal Processors (DSP), Flip-Flops (FF), and Look-Up Tables (LUT), as well as the inference clock cycles. Evaluation on both synthetic and benchmark NN architectures demonstrates high prediction accuracy, with R² scores between 0.8 and 0.98 and sMAPE values ranging from 10% to 30%.
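To make the underlying idea concrete, the sketch below shows a toy version of the approach described above: fit a regression model mapping simple NN architecture features to a predicted FPGA resource count, then score it with the same metrics reported here (R² and sMAPE). This is an illustrative sketch only, not rule4ml's actual API or feature set; the feature choices (parameter count, layer count) and all numbers are hypothetical toy data, and a plain least-squares linear model stands in for the trained predictors.

```python
# Illustrative sketch (NOT rule4ml's actual API): predict an FPGA resource
# count (e.g. DSP usage) from simple NN architecture features with a
# least-squares linear model, then evaluate with R^2 and sMAPE.
# All features and numbers below are hypothetical toy data.

def fit_linear(X, y):
    """Ordinary least squares via normal equations (pure Python, bias term included)."""
    X = [[1.0] + row for row in X]  # prepend bias column
    n, d = len(X), len(X[0])
    # Build X^T X and X^T y
    xtx = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(d)] for a in range(d)]
    xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(d)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(d):
        piv = max(range(col, d), key=lambda r: abs(xtx[r][col]))
        xtx[col], xtx[piv] = xtx[piv], xtx[col]
        xty[col], xty[piv] = xty[piv], xty[col]
        for r in range(d):
            if r != col:
                f = xtx[r][col] / xtx[col][col]
                xtx[r] = [xtx[r][k] - f * xtx[col][k] for k in range(d)]
                xty[r] -= f * xty[col]
    return [xty[a] / xtx[a][a] for a in range(d)]

def predict(w, row):
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], row))

def r2(y, yhat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, yhat))
    ss_tot = sum((a - mean) ** 2 for a in y)
    return 1 - ss_res / ss_tot

def smape(y, yhat):
    """Symmetric mean absolute percentage error, in percent."""
    return 100 / len(y) * sum(abs(a - b) / ((abs(a) + abs(b)) / 2)
                              for a, b in zip(y, yhat))

# Toy dataset: (total parameters, number of layers) -> DSP usage (hypothetical)
X = [[64, 2], [256, 3], [1024, 4], [4096, 5], [512, 3], [2048, 4]]
y = [10.0, 35.0, 120.0, 460.0, 62.0, 230.0]

w = fit_linear(X, y)
preds = [predict(w, row) for row in X]
print(f"R^2 = {r2(y, preds):.3f}, sMAPE = {smape(y, preds):.1f}%")
```

In rule4ml the same pattern is scaled up: the training set comes from many hls4ml-synthesized architectures, the features describe the full network and its HLS configuration, and separate regressors are trained per target metric (BRAM, DSP, FF, LUT, and clock cycles).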
This presentation will introduce rule4ml and showcase how the tool enables immediate assessment of the feasibility and performance of NNs on FPGAs. We will also cover dataset generation and the training and validation of the regression models, presenting the predictive performance and current limitations of our approach as well as potential future improvements. By providing these insights, we aim to show how rule4ml can significantly streamline the deployment of ML models in real-time applications, ultimately reducing development time and enhancing productivity in workflows that rely on NN-to-FPGA frameworks such as hls4ml.