30 November 2020 to 3 December 2020
Southern Methodist University
America/Chicago timezone

A oneAPI backend for hls4ml to speed up neural network inference on CPUs

30 Nov 2020, 15:30
6m
Southern Methodist University

Talk

Speaker

Vladimir Loncar (CERN)

Description

A recent effort to explore neural network inference on FPGAs using high-level synthesis (HLS), focusing on low-latency applications in the trigger subsystems of the LHC experiments, resulted in a framework called hls4ml. Deep learning models converted to HLS with hls4ml can also be executed on CPUs, but with subpar performance. We present an extension of hls4ml that uses the new Intel oneAPI toolkit to convert deep learning models into high-performance Data Parallel C++ optimized for Intel x86 CPUs. We show that inference time on Intel CPUs improves by hundreds of times over the previous HLS-based implementation, and by several times over unmodified Keras/TensorFlow.
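To make the workload concrete, the sketch below (an illustration, not hls4ml's actual generated code) shows the kind of computation such a CPU backend accelerates: a small multilayer-perceptron forward pass, whose per-layer dense matrix products are what a Data Parallel C++ backend would vectorize and parallelize across x86 cores.

```python
import numpy as np

def relu(x):
    # ReLU activation, a common choice between dense layers
    return np.maximum(x, 0.0)

def mlp_forward(x, layers):
    """Run x through a list of (weights, bias) dense layers,
    applying ReLU after every layer except the last."""
    for i, (w, b) in enumerate(layers):
        x = x @ w + b  # dense matrix product: the hot loop a backend optimizes
        if i < len(layers) - 1:
            x = relu(x)
    return x

# Tiny network with fixed random weights so the run is reproducible.
rng = np.random.default_rng(0)
layers = [
    (rng.standard_normal((16, 32)), np.zeros(32)),
    (rng.standard_normal((32, 5)), np.zeros(5)),
]
x = rng.standard_normal((1, 16))
y = mlp_forward(x, layers)
print(y.shape)  # (1, 5)
```

In hls4ml itself, a trained Keras model is typically handed to the converter (e.g. `hls4ml.converters.convert_from_keras_model`) together with a configuration selecting the target backend; the oneAPI extension described here swaps the FPGA-oriented HLS output for DPC++ tuned to the CPU.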
