Speaker
Description
In this work, we present a set of optimizations to the Particle Transformer (ParT), a state-of-the-art model for jet classification, aimed at reducing inference time and memory usage while preserving accuracy. To address the compute and memory bottlenecks of standard attention mechanisms, we incorporate FlashAttention and memory-efficient attention, which compute exact attention through fused kernels that minimize read/write overhead and improve GPU utilization. We also integrate Linformer, which reduces attention complexity from O(n²) to O(n) by projecting the attention keys and values along the sequence dimension into a lower-dimensional space, making ParT more scalable for longer inputs. Additionally, we apply INT8 dynamic quantization, compressing matrix multiplications from FP32 to INT8 to reduce latency and GPU memory usage with minimal impact on accuracy and no retraining required. By systematically evaluating combinations of these optimizations, up to the full FlashAttention + Linformer + INT8 quantization stack, we demonstrate that our approach yields significant speedups and memory savings while maintaining model accuracy. This synergy enables efficient deployment of transformer models such as ParT in real-time, fast-paced environments like those encountered in HL-LHC triggers.
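As a rough illustration of how these three ingredients can be combined, the sketch below shows a minimal PyTorch attention block with a Linformer-style key/value projection, evaluated through the fused scaled_dot_product_attention kernel (which dispatches to FlashAttention or memory-efficient backends on GPU), followed by post-training INT8 dynamic quantization of the linear layers. The class name, dimensions, and overall structure are illustrative assumptions and do not reflect the actual ParT implementation.

```python
# Minimal sketch (illustrative only, not the actual ParT code).
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.ao.quantization import quantize_dynamic

class LinformerSelfAttention(nn.Module):
    """Self-attention with a Linformer-style low-rank projection of K and V,
    computed via PyTorch's fused scaled_dot_product_attention (FlashAttention /
    memory-efficient backends when available). Assumes inputs are padded to a
    fixed seq_len, as Linformer's sequence-axis projection requires."""

    def __init__(self, embed_dim=128, num_heads=8, seq_len=128, proj_dim=32):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = embed_dim // num_heads
        self.qkv = nn.Linear(embed_dim, 3 * embed_dim)
        self.out = nn.Linear(embed_dim, embed_dim)
        # Linformer: compress the sequence axis of K and V from seq_len to
        # proj_dim, so the attention cost scales with n * proj_dim, not n^2.
        self.k_proj = nn.Linear(seq_len, proj_dim, bias=False)
        self.v_proj = nn.Linear(seq_len, proj_dim, bias=False)

    def forward(self, x):                       # x: (batch, seq_len, embed_dim)
        b, n, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # Project K and V along the sequence dimension (Linformer).
        k = self.k_proj(k.transpose(1, 2)).transpose(1, 2)   # (b, proj_dim, d)
        v = self.v_proj(v.transpose(1, 2)).transpose(1, 2)   # (b, proj_dim, d)
        # Split heads: (batch, heads, tokens, head_dim).
        q = q.view(b, n, self.num_heads, self.head_dim).transpose(1, 2)
        k = k.view(b, -1, self.num_heads, self.head_dim).transpose(1, 2)
        v = v.view(b, -1, self.num_heads, self.head_dim).transpose(1, 2)
        # Fused exact-attention kernel; selects FlashAttention or the
        # memory-efficient backend on supported GPUs.
        y = F.scaled_dot_product_attention(q, k, v)
        y = y.transpose(1, 2).reshape(b, n, d)
        return self.out(y)

# INT8 dynamic quantization: weights stored as int8, activations quantized on
# the fly at inference time; post-training, so no retraining is required.
model = nn.Sequential(LinformerSelfAttention(), nn.Linear(128, 10))
quantized = quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

x = torch.randn(4, 128, 128)   # (batch, padded particles, features), hypothetical shapes
out = quantized(x)             # forward pass through the optimized stack
```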
References
Interpreting and Accelerating Transformers for Jet Tagging (Talk at FastML)
Significance
This presentation goes beyond a status update by showcasing the novel integration of FlashAttention, Linformer, and INT8 quantization in the Particle Transformer (ParT) for jet classification. It highlights the synergistic impact of these optimizations in reducing inference time and memory usage without sacrificing accuracy. By systematically evaluating their combined effects, the work provides practical insights for real-time deployment in HL-LHC triggers, marking a significant step toward production-ready transformer models in high-energy physics.
Experiment context, if any
CMS