Oct 19 – 23, 2020
Europe/Zurich timezone

GPU and FPGA as a Service for Machine Learning Inference Accelerations

Oct 23, 2020, 3:15 PM
Lightning talk 6 ML infrastructure: Hardware and software for Machine Learning Workshop


Yu Lou (University of Washington (US))


Data rates are expected to surge after the planned upgrades for the high-luminosity Large Hadron Collider (LHC) and accelerator-based neutrino experiments. Since there is not enough storage to save all of the data, billions of events must be processed and filtered in real time. Machine learning algorithms are becoming increasingly prevalent in the particle reconstruction pipeline, and specially designed hardware can significantly reduce machine learning inference time compared to CPUs. We therefore propose a heterogeneous computing framework called the Services for Optimized Network Inference on Coprocessors (SONIC) to accelerate machine learning inference with various coprocessors. With a unified interface, the framework conveniently provides GPU as a service, using either the Nvidia Triton framework or the Microsoft Brainwave service as the backend. It also features the first open-source FPGA-as-a-service toolkit, using either our hls4ml framework or the Xilinx ML Suite as the backend. We demonstrate that our method speeds up one classification problem and two regression problems in the LHC experiments and ProtoDUNE-SP. By providing coprocessors as a service, our work may assist various other computing workflows across science.
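
The "unified interface" idea described in the abstract can be illustrated with a minimal Python sketch. This is not the SONIC API: the names `InferenceBackend`, `FakeGpuBackend`, `FakeFpgaBackend`, and `InferenceClient` are hypothetical, and the backends are in-process stand-ins that sum their inputs instead of calling a remote GPU or FPGA server. The sketch only shows how client code can stay unchanged while the coprocessor backend is swapped.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional, Sequence


class InferenceBackend(ABC):
    """Hypothetical interface that every coprocessor service implements."""

    @abstractmethod
    def infer(self, batch: Sequence[Sequence[float]]) -> List[float]:
        """Return one prediction per input in the batch."""


class FakeGpuBackend(InferenceBackend):
    """Stand-in for a remote GPU server (assumption: toy 'model' = sum)."""

    def infer(self, batch: Sequence[Sequence[float]]) -> List[float]:
        return [float(sum(features)) for features in batch]


class FakeFpgaBackend(InferenceBackend):
    """Stand-in for a remote FPGA server running the same toy 'model'."""

    def infer(self, batch: Sequence[Sequence[float]]) -> List[float]:
        return [float(sum(features)) for features in batch]


class InferenceClient:
    """Client-side entry point: one call signature, any registered backend."""

    def __init__(self, backends: Dict[str, InferenceBackend], default: str):
        if default not in backends:
            raise KeyError(f"unknown default backend: {default}")
        self._backends = backends
        self._default = default

    def predict(
        self,
        batch: Sequence[Sequence[float]],
        backend: Optional[str] = None,
    ) -> List[float]:
        # Look up the requested backend, falling back to the default one.
        name = backend if backend is not None else self._default
        return self._backends[name].infer(batch)


client = InferenceClient(
    {"gpu": FakeGpuBackend(), "fpga": FakeFpgaBackend()},
    default="gpu",
)
events = [[1.0, 2.0], [3.0]]
gpu_result = client.predict(events)
fpga_result = client.predict(events, backend="fpga")
```

In a real as-a-service deployment the `infer` call would be a network request to a server that owns the coprocessor, so the event-processing code never needs to link against device-specific libraries.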

Primary authors

Yu Lou (University of Washington (US))
Javier Mauricio Duarte (Univ. of California San Diego (US))
Jeffrey Krupa
Kelvin Lin (University of Washington (US))
Kevin Pedro (Fermi National Accelerator Lab. (US))
Dr Kyle Knoepfel (Fermi National Accelerator Laboratory)
Maria Acosta Flechas (Fermi National Accelerator Lab. (US))
Matthew Trahms (UW ACME Lab)
Mia Liu
Michael Wang (Fermi National Accelerator Lab. (US))
Natchanon Suaysom (University of Washington (US))
Nhan Viet Tran (Fermi National Accelerator Lab. (US))
Philip Harris (Unknown)
Scott Hauck (University of Washington)
Shih-Chieh Hsu (University of Washington Seattle (US))
Ta-Wei Ho (National Tsing Hua University (TW))
Thomas Klijnsma (Fermi National Accelerator Lab. (US))
Tingjun Yang (Fermi National Accelerator Lab. (US))
Benjamin Hawks (Fermi National Accelerator Laboratory)
Dr Burt Holzman (Fermi National Accelerator Lab. (US))
Dylan Sheldon Rankin (Massachusetts Inst. of Technology (US))
Jack Dinsmore
