hls4ml @ Snowmass CSS 2022: Tutorial

329 Sieg Building (SIG) (University of Washington, Seattle)

1851 NE Grant Ln, Seattle, WA 98195
Elham E Khoda (University of Washington (US)), Daniel Diaz (Univ. of California San Diego (US)), Melissa Kathryn Quinnan (Univ. of California Santa Barbara (US)), Raghav Kansal (Univ. of California San Diego (US)), Javier Mauricio Duarte (Univ. of California San Diego (US))

With the rise of edge computing, real-time inference of deep neural networks (DNNs) on custom hardware has become increasingly relevant. Smartphone companies are incorporating Artificial Intelligence (AI) chips into their designs for on-device inference to improve user experience and tighten data security. Meanwhile, the autonomous vehicle industry is turning to application-specific integrated circuits (ASICs) to keep latency low.


While the typically acceptable latency for real-time inference in applications like those above is O(1) ms, other applications require sub-microsecond inference. For instance, machine learning (ML) algorithms for high-frequency trading run on field-programmable gate arrays (FPGAs), which are reconfigurable devices, to make decisions within nanoseconds. At the extreme end of the inference spectrum, in terms of both low latency (as in high-frequency trading) and limited area (as in smartphone applications), is the processing of data from proton-proton collisions at the Large Hadron Collider (LHC) at CERN. Here, latencies of O(1) microsecond are required and resources are strictly limited.


In this tutorial, you will become familiar with the hls4ml library. This library converts pre-trained machine learning models into FPGA firmware, targeting extremely low-latency inference to stay within the strict constraints imposed by the CERN particle detectors. You will learn techniques for model compression, including how to reduce the footprint of your model using state-of-the-art techniques such as quantization. Finally, you will learn how to synthesize your model for implementation on the chip. Familiarity with machine learning using Python and Keras is beneficial for participating in this tutorial but not required.
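To give a feel for what quantization means in this context, the following is a minimal sketch in plain Python of rounding a weight onto a signed fixed-point grid with saturation. It is only an illustration and not the hls4ml API: the function name quantize_fixed and the default widths are assumptions made for this example. The two parameters follow the same convention as the ap_fixed<W,I> notation used in high-level synthesis, where W is the total bit width and I is the number of integer bits including the sign bit.

```python
def quantize_fixed(value, total_bits=8, int_bits=3):
    """Round `value` onto a signed fixed-point grid and saturate at the range limits.

    total_bits: total width in bits (hypothetical default, chosen for illustration).
    int_bits:   integer bits, including the sign bit.
    """
    frac_bits = total_bits - int_bits
    scale = 2 ** frac_bits                    # number of grid steps per unit
    lowest = -(2 ** (total_bits - 1))         # most negative integer code
    highest = 2 ** (total_bits - 1) - 1       # most positive integer code

    code = round(value * scale)               # nearest code (ties go to the even code)
    code = max(lowest, min(highest, code))    # saturate instead of wrapping around
    return code / scale


# Example weights: two representable (after rounding) and two out of range.
weights = [0.3, -1.27, 10.0, -10.0]
quantized = [quantize_fixed(w) for w in weights]
print(quantized)
```

With 8 bits in total and 3 integer bits, the grid spacing is 1/32 and the representable range is -4.0 to 3.96875, so the two large values saturate at the range limits. Fewer bits mean a smaller hardware footprint at the cost of a coarser grid, which is the trade-off that quantization techniques aim to manage.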


This tutorial will take place in a hybrid format, in Room 329 of the Sieg Building, with an alternative remote connection via Zoom. The in-person venue is limited to 30 participants, and spots will be allocated on a first-come, first-served basis at registration. Once the 30 spots are filled, further registrants will still have the option to connect and participate remotely.

Zoom connection information:

Join Zoom Meeting

Meeting ID: 667 5799 1719
Passcode: 07812977

hls4ml tutorial at Snowmass CSS 2022
    • 1
      hls4ml tutorial
      Speakers: Daniel Diaz (Univ. of California San Diego (US)), Elham E Khoda (University of Washington (US)), Javier Mauricio Duarte (Univ. of California San Diego (US)), Melissa Kathryn Quinnan (Univ. of California Santa Barbara (US)), Raghav Kansal (Univ. of California San Diego (US))