11–15 Mar 2024
Charles B. Wang Center, Stony Brook University
US/Eastern timezone

Optimizing ANN-Based Triggering for BSM events with Knowledge Distillation

12 Mar 2024, 11:30
20m
Theatre, Charles B. Wang Center, Stony Brook University
100 Circle Rd, Stony Brook, NY 11794
Oral (Track 1: Computing Technology for Physics Research)

Speaker

Marco Lorusso (Universita Di Bologna (IT))

Description

In recent years, the range of applications of Machine Learning, and of Artificial Neural Network algorithms in particular, has expanded rapidly. This versatility has opened new and promising avenues for enhancing data analysis in the experiments at the Large Hadron Collider at CERN, where these techniques show considerable potential to improve both the efficiency and the effectiveness of data processing.
Nevertheless, a frequently overlooked aspect of using Artificial Neural Networks (ANNs) is the need to process data efficiently in online applications. This becomes particularly important when exploring new methods for selecting interesting events at the trigger level, for example in the search for Beyond Standard Model (BSM) events. This study investigates the potential of Autoencoders (AEs), an unbiased algorithm that can select events according to their abnormality without relying on theoretical priors. However, the distinctive latency and energy constraints of the Level-1 Trigger require tailored software development and deployment strategies, aimed at making the best use of the on-site hardware, in particular Field-Programmable Gate Arrays (FPGAs).
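
As an illustration only, the sketch below (in Python with Keras, an assumed software stack) shows the kind of AE-based anomaly score referred to above: a small dense autoencoder is trained to reconstruct its input, and the per-event reconstruction error serves as the selection score. The input size, layer widths and synthetic data are hypothetical and are not the model used in this work.

```python
# Illustrative sketch: a small dense autoencoder scoring events by
# reconstruction error. Input size and layer widths are assumptions.
import numpy as np
from tensorflow import keras

n_features = 57  # hypothetical number of trigger-level input quantities

inputs = keras.Input(shape=(n_features,))
x = keras.layers.Dense(32, activation="relu")(inputs)
latent = keras.layers.Dense(8, activation="relu")(x)
x = keras.layers.Dense(32, activation="relu")(latent)
outputs = keras.layers.Dense(n_features)(x)
teacher_ae = keras.Model(inputs, outputs, name="teacher_autoencoder")
teacher_ae.compile(optimizer="adam", loss="mse")

# Placeholder data standing in for trigger-level event features.
x_train = np.random.default_rng(0).normal(size=(10000, n_features)).astype("float32")
teacher_ae.fit(x_train, x_train, epochs=5, batch_size=1024, verbose=0)

def anomaly_score(model, x):
    """Per-event mean squared reconstruction error; large values flag anomalous, BSM-like events."""
    reco = model.predict(x, verbose=0)
    return np.mean((x - reco) ** 2, axis=1)
```
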
This is why a technique called Knowledge Distillation (KD) is studied in this work. It consists of using a large, well-trained “teacher” model, such as the aforementioned AE, to train a much smaller “student” model that can be implemented efficiently on an FPGA. The optimization of this distillation process involves exploring several aspects, such as the architecture of the student and the quantization of weights and biases, with a strategic approach that includes hyperparameter searches to find the best compromise between accuracy, latency and hardware footprint.
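
A minimal sketch of one possible distillation setup, assuming the student is trained to regress the teacher's anomaly score (the abstract does not specify the exact distillation target). It continues the previous sketch, reusing teacher_ae, anomaly_score and x_train; the student architecture, training settings and the remark on quantization are assumptions.

```python
from tensorflow import keras  # continues the previous sketch (teacher_ae, anomaly_score, x_train)

# Small "student" regressing the teacher's anomaly score; only the student
# would need to fit on the FPGA. Architecture and settings are illustrative.
student = keras.Sequential(
    [
        keras.Input(shape=(n_features,)),
        keras.layers.Dense(16, activation="relu"),
        keras.layers.Dense(8, activation="relu"),
        keras.layers.Dense(1),  # predicted anomaly score
    ],
    name="student",
)

# Distillation targets come from the teacher alone: no labels or theory priors.
y_teacher = anomaly_score(teacher_ae, x_train)

student.compile(optimizer="adam", loss="mse")
student.fit(x_train, y_teacher, epochs=10, batch_size=1024, verbose=0)

# Quantization of weights and biases (e.g. with quantized layers) could be
# applied either during this search or only once the best student is found,
# which is the comparison discussed in the abstract.
```
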
The strategy followed to distill the teacher model will be presented, together with considerations on the difference in performance between applying the quantization before or after the best student model has been found. Finally, a second way to perform KD, called co-training distillation, will be introduced, in which the teacher and the student models are trained at the same time.
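
A possible reading of co-training distillation is that the teacher's reconstruction loss and the student's distillation loss are minimised in the same optimization step. The sketch below, again continuing the previous ones, illustrates that interpretation; the loss combination, the weight alpha and the stop-gradient on the teacher score are assumptions, not necessarily the method presented in the talk.

```python
import tensorflow as tf
from tensorflow import keras  # continues the previous sketches (teacher_ae, student, x_train)

optimizer = keras.optimizers.Adam()
alpha = 0.5  # assumed weight balancing reconstruction and distillation terms

dataset = tf.data.Dataset.from_tensor_slices(x_train).batch(1024)
for epoch in range(5):
    for batch in dataset:
        with tf.GradientTape() as tape:
            reco = teacher_ae(batch, training=True)
            score_teacher = tf.reduce_mean(tf.square(batch - reco), axis=1)
            score_student = tf.squeeze(student(batch, training=True), axis=1)
            loss_reco = tf.reduce_mean(score_teacher)  # teacher's usual AE objective
            # Student chases the teacher's current score; the stop-gradient keeps
            # this term from pulling the teacher toward the student (an assumption).
            loss_distill = tf.reduce_mean(
                tf.square(score_student - tf.stop_gradient(score_teacher))
            )
            loss = loss_reco + alpha * loss_distill
        variables = teacher_ae.trainable_variables + student.trainable_variables
        grads = tape.gradient(loss, variables)
        optimizer.apply_gradients(zip(grads, variables))
```
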

Experiment context: CMS experiment

Primary author

Marco Lorusso (Universita Di Bologna (IT))
