15–18 Oct 2024
Purdue University
America/Indiana/Indianapolis timezone

wa-hls4ml: A benchmark and dataset for ML accelerator resource estimation

16 Oct 2024, 15:40
5m
Steward Center 306 (Third floor) (Purdue University)

128 Memorial Mall Dr, West Lafayette, IN 47907
Lightning 5 min talk + poster Lightning talks

Speaker

Ben Hawks (Fermi National Accelerator Lab)

Description

As machine learning (ML) increasingly serves as a tool for addressing real-time challenges in scientific applications, advanced tooling has significantly reduced the time required to iterate on designs. Despite these advancements in areas that once posed major obstacles, new challenges have emerged: processes that were not previously considered bottlenecks, such as model synthesis, are now becoming limiting factors in rapid design iteration. To reduce these emerging constraints, multiple efforts have been launched toward designing an ML-based surrogate model for resource estimation of synthesized accelerator architectures, which would shorten iteration time when designing a solution within a given set of hardware constraints. This approach shows considerable potential, but the effort is still early and would benefit from coordination and standardization to support future work as it emerges. In this work, we introduce wa-hls4ml, an ML accelerator resource estimation benchmark and a corresponding dataset of more than 100,000 synthesized dense neural networks. In addition to the resource utilization data provided in our dataset, we also offer the generated artifacts and logs for many of the synthesized neural networks, with the intention of supporting future research in ML-based code generation. The benchmark evaluates performance against multiple common ML model architectures, primarily originating from scientific domains. The selected models are implemented through hls4ml on Xilinx FPGAs. We measure the performance of a given estimator through multiple metrics, including the $R^2$ score and SMAPE on regression tasks, as well as FLOPS and inference time to further characterize the estimator under test.
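For readers unfamiliar with the regression metrics mentioned above, the following is a minimal sketch of how the $R^2$ score and SMAPE are commonly computed. This is an illustration of the standard definitions, not code from the wa-hls4ml benchmark itself; the exact SMAPE variant used by the benchmark (e.g. the scaling factor, or handling of zero denominators) is an assumption here.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - (residual SS / total SS)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def smape(y_true, y_pred):
    """Symmetric mean absolute percentage error, in percent (0 is perfect).

    Uses the common 0-200% formulation: mean of 2|p - t| / (|t| + |p|).
    Terms where both true and predicted values are zero contribute 0.
    """
    total = 0.0
    for t, p in zip(y_true, y_pred):
        denom = abs(t) + abs(p)
        if denom > 0:
            total += 2.0 * abs(p - t) / denom
    return 100.0 * total / len(y_true)
```

For example, a perfect resource estimate yields `r2_score == 1.0` and `smape == 0.0`, while an estimator that always predicts the mean of the targets yields `r2_score == 0.0`.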

Primary authors

Audrey Corbeil Therrien (Universite de Sherbrooke (CA))
Ben Hawks (Fermi National Accelerator Lab)
Dennis Plotnikov (Johns Hopkins University (US))
Giuseppe Di Guglielmo (Fermilab)
Hamza Ezzaoui Rahali (University of Sherbrooke)
Javier Mauricio Duarte (Univ. of California San Diego (US))
John Graham (UCSD)
Dr Karla Tame-Narvaez (Fermilab National Accelerator Laboratory)
Mohammad Mehdi Rahimifar (University of Sherbrooke)
Nhan Tran (Fermi National Accelerator Lab. (US))
Vladimir Loncar (Massachusetts Inst. of Technology (US))

Presentation materials