1–5 Sept 2025
ETH Zurich
Europe/Zurich timezone

End-to-End Neural Network Compression and Deployment for Hardware Acceleration Using PQuant and hls4ml

2 Sept 2025, 13:20
20m
ETH Zurich

HIT E 51, Siemens Auditorium, ETH Zurich, Hönggerberg campus, 8093 Zurich, Switzerland
Standard Talk | Contributed talks

Speaker

Roope Oskari Niemi

Description

As the demand for efficient machine learning on resource-limited devices grows, model compression techniques such as pruning and quantization have become increasingly vital. Yet these methods are typically developed in isolation, and while some libraries attempt to offer unified interfaces for compression, they often lack support for deployment tools such as hls4ml. To bridge this gap, we developed PQuant, a Python library designed to streamline the training and compression of machine learning models. PQuant offers a unified interface for applying a range of pruning and quantization techniques, catering to users with minimal background in compression while still providing detailed configuration options for advanced use. Notably, it features built-in compatibility with hls4ml, enabling seamless deployment of compressed models on FPGA-based accelerators. This makes PQuant a versatile resource both for researchers exploring compression strategies and for developers targeting efficient implementations on edge devices or custom hardware platforms. We will present the PQuant library and the performance of several compression algorithms implemented with it, and demonstrate the conversion flow of a neural network model from an uncompressed state to optimized firmware for an FPGA.
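As background for the techniques named above, the following is a minimal NumPy sketch of two staples that a compression library like PQuant unifies: magnitude-based unstructured pruning and signed fixed-point quantization (the number format hls4ml maps onto FPGA `ap_fixed` arithmetic). The function names and signatures here are illustrative only and are not PQuant's actual API.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Illustrative unstructured pruning: zero out the smallest-magnitude
    fraction `sparsity` of the weights (not PQuant's actual API)."""
    k = int(sparsity * weights.size)
    if k == 0:
        return weights.copy()
    # Threshold at the k-th smallest absolute value.
    threshold = np.sort(np.abs(weights), axis=None)[k - 1]
    return np.where(np.abs(weights) <= threshold, 0.0, weights)

def quantize_fixed_point(weights, total_bits=8, int_bits=1):
    """Illustrative post-training quantization to a signed fixed-point grid,
    the kind of representation used for FPGA arithmetic."""
    frac_bits = total_bits - int_bits
    scale = 2.0 ** frac_bits
    # Representable range of a signed fixed-point number (sign bit included
    # in int_bits): [-2^(int_bits-1), 2^(int_bits-1) - 2^-frac_bits].
    lo = -(2.0 ** (int_bits - 1))
    hi = 2.0 ** (int_bits - 1) - 1.0 / scale
    return np.clip(np.round(weights * scale) / scale, lo, hi)

w = np.array([0.1, -0.02, 0.5, -0.7, 0.03, 0.9])
pruned = magnitude_prune(w, sparsity=0.5)      # half the weights become zero
quantized = quantize_fixed_point(pruned)       # rounded to an 8-bit grid
```

Pruning produces the sparsity that hardware backends can exploit by skipping multiplications, while quantization fixes the bit width of the remaining weights; in practice both are typically applied (or learned) during training rather than post hoc, which is the workflow the talk describes.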

Co-authors

Chang Sun (California Institute of Technology (US))
Anastasiia Petrovych (CERN)
Enrico Lupi (CERN, INFN Padova (IT))
Dimitrios Danopoulos (CERN)
Arghya Ranjan Das (Purdue University (US))
Sebastian Dittmeier (Ruprecht-Karls-Universitaet Heidelberg (DE))
Michael Kagan (SLAC National Accelerator Laboratory (US))
Miaoyuan Liu (Purdue University (US))
Vladimir Loncar (CERN)