NGT Tutorials: PQuantML

Europe/Zurich
40/S2-B01 - Salle Bohr (CERN)
Description

This tutorial introduces PQuantML, a practical framework for pruning and quantization-aware training (QAT) of deep neural networks. It walks users through the core concepts behind model compression—why pruning redundant weights and training with low-precision arithmetic can dramatically reduce model size, latency, and energy consumption without sacrificing accuracy. The tutorial explains how PQuantML integrates seamlessly into a typical training pipeline, highlighting its modular design and clear APIs that make it easy to experiment with different compression strategies across common neural network architectures.
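
PQuantML's exact API is not reproduced in this description, so the sketch below illustrates the underlying ideas with plain PyTorch instead: magnitude-based unstructured pruning on one layer, structured (channel-level) pruning on another, and a simple sparsity report. The model architecture and the 50%/25% pruning fractions are arbitrary choices for illustration, not values from the tutorial.

    # Concept sketch with stock PyTorch, not PQuantML's own API.
    import torch
    import torch.nn as nn
    import torch.nn.utils.prune as prune

    model = nn.Sequential(
        nn.Linear(784, 256),
        nn.ReLU(),
        nn.Linear(256, 10),
    )

    # Unstructured pruning: zero the 50% of weights with smallest L1 magnitude.
    prune.l1_unstructured(model[0], name="weight", amount=0.5)

    # Structured pruning: remove 25% of output channels (rows) by L2 norm.
    prune.ln_structured(model[2], name="weight", amount=0.25, n=2, dim=0)

    # Monitor sparsity: fraction of exactly-zero entries after masking.
    for i, layer in enumerate(model):
        if hasattr(layer, "weight"):
            w = layer.weight
            print(f"layer {i}: sparsity = {(w == 0).float().mean().item():.2%}")

    # Bake the masks into the weight tensors so the pruning is permanent.
    prune.remove(model[0], "weight")
    prune.remove(model[2], "weight")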

Through step-by-step examples, the tutorial demonstrates how to configure structured and unstructured pruning, enable quantization-aware training, and fine-tune compressed models to recover performance. Users learn how to monitor sparsity, accuracy, and computational efficiency throughout training, and how to export optimized models for deployment on resource-constrained hardware. By the end, readers will have a solid understanding of how to use PQuantML to build efficient, production-ready neural networks while maintaining strong predictive performance.
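
In the same illustrative spirit (stock PyTorch eager-mode QAT rather than PQuantML's own calls), the sketch below inserts fake-quantization observers into a small network, fine-tunes with them active so the weights adapt to low-precision rounding, and converts the result to a true int8 model. The TinyNet architecture, the random stand-in data, and the fbgemm backend are placeholder assumptions.

    # Concept sketch with PyTorch eager-mode QAT, not PQuantML's own API.
    import torch
    import torch.nn as nn
    import torch.ao.quantization as tq

    class TinyNet(nn.Module):
        def __init__(self):
            super().__init__()
            self.quant = tq.QuantStub()      # tensors become int8 from here
            self.fc1 = nn.Linear(784, 64)
            self.relu = nn.ReLU()
            self.fc2 = nn.Linear(64, 10)
            self.dequant = tq.DeQuantStub()  # tensors return to float here

        def forward(self, x):
            x = self.quant(x)
            x = self.relu(self.fc1(x))
            x = self.fc2(x)
            return self.dequant(x)

    model = TinyNet()
    model.qconfig = tq.get_default_qat_qconfig("fbgemm")
    model.train()
    tq.prepare_qat(model, inplace=True)  # inserts fake-quant modules

    # Fine-tune with fake quantization active; random tensors stand in
    # for a real data loader.
    opt = torch.optim.SGD(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(10):
        x = torch.randn(32, 784)
        y = torch.randint(0, 10, (32,))
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()

    # Convert to a genuine int8 model for deployment.
    model.eval()
    int8_model = tq.convert(model)

The converted model can then be serialized for deployment on constrained hardware, for example with torch.jit.script followed by torch.jit.save; the export path PQuantML itself provides may differ.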

Zoom Meeting ID: 67588057610
Host: Maurizio Pierini
Timetable

    • 09:00–09:45  Introduction to PQuantML (45m)
      Speaker: Roope Oskari Niemi
    • 09:45–10:10  Break (25m)
    • 10:10–12:10  Hands-on Tutorial (2h)
      Speaker: Roope Oskari Niemi