6–10 Oct 2025
Rethymno, Crete, Greece
Europe/Athens timezone

Implementation of ML based conditions for the Phase-2 CMS Global Trigger upgrade

Not scheduled
20m
Aquila Rithimna Beach, Crete, Greece
Poster: Timing and Trigger Distribution / Trigger

Speaker

Gabriele Bortolato (Universita e INFN, Padova (IT))

Description

The new CMS trigger system for the High-Luminosity LHC upgrade will exploit detailed information from the sub-detectors at the bunch crossing rate, allowing the Global Trigger (GT) FPGA firmware to use high-precision trigger objects. The GT will contain novel algorithms based on machine learning techniques such as Deep Neural Networks and Boosted Decision Trees to reach higher selection efficiency on particular event signatures. This study focuses on optimizing these models through techniques like quantization and pruning, with an emphasis on integrating high-level features, such as invariant mass and ∆R, already computed in firmware, into the existing ML models.
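The two high-level features named above have standard definitions. As an illustrative sketch only (floating-point Python, whereas the GT computes these in fixed point inside the FPGA), they can be written as:

```python
import math

# Illustrative sketch, not the GT firmware: the two high-level features
# (invariant mass and dR) computed from (pt, eta, phi) of two objects.

def delta_r(eta1, phi1, eta2, phi2):
    """Angular separation dR = sqrt(deta^2 + dphi^2), with dphi wrapped to (-pi, pi]."""
    deta = eta1 - eta2
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(deta, dphi)

def invariant_mass(pt1, eta1, phi1, pt2, eta2, phi2):
    """Massless two-body invariant mass: m^2 = 2 pt1 pt2 (cosh(deta) - cos(dphi))."""
    m2 = 2.0 * pt1 * pt2 * (math.cosh(eta1 - eta2) - math.cos(phi1 - phi2))
    return math.sqrt(max(m2, 0.0))
```

For example, two back-to-back 50 GeV objects at eta = 0 give an invariant mass of 100 GeV and dR = pi.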

Summary (500 words)

The new CMS Level-1 trigger system for the High-Luminosity LHC upgrade will exploit detailed information from the calorimeter, muon and tracker trigger paths at the bunch crossing rate. The final stage of the CMS Level-1 trigger chain, the Global Trigger (GT), will receive high-precision trigger objects from the upstream systems. It will evaluate a menu of more than 1000 cut-based and ML-classifier trigger algorithms in order to determine the Level-1 trigger accept decision. Traditionally, the algorithms used to build the so-called trigger menu have employed selections on one or more physics objects, for instance cuts on a single reconstructed particle property or on a combination of them. The Phase-2 CMS GT aims to go beyond this approach and include machine-learning-based conditions alongside the algorithms in use today in Run 3, in order to reach higher selection efficiency and to select unexpected signals. The upgrade aims for a total latency of approximately 1 μs (about 40 bunch crossings) for the entire GT. Due to the two-layer GT architecture and the firmware infrastructure, only around 200 ns of this latency budget is available for the machine learning (ML) models, implying that model optimizations are essential to meet the target latency. ML-based conditions, particularly neural networks, which are typically resource-intensive, must be optimized both during and after training to enable integration alongside a large number of cut-based algorithms. Two types of ML binary classifiers are being considered: Deep Neural Networks (DNNs) and Boosted Decision Trees (BDTs); the latter were recently added thanks to their exceptionally small resource footprint and latency. The hls4ml tool has been employed to convert TensorFlow/Keras DNN models into a hardware description language such as VHDL or Verilog, while the BDT translation is done with the Conifer Python library.
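The quoted latency and clock figures can be cross-checked with a quick back-of-envelope calculation (plain arithmetic using only the numbers stated above, not additional firmware measurements):

```python
# Back-of-envelope check of the figures quoted above: how many clock
# cycles the ~200 ns ML budget corresponds to, and the total GT latency
# expressed in bunch crossings.

BUNCH_CROSSING_NS = 25.0      # LHC bunch-crossing period (40 MHz)
TOTAL_GT_LATENCY_NS = 1000.0  # ~1 us total Global Trigger latency
ML_BUDGET_NS = 200.0          # portion available to the ML conditions
MODEL_CLOCK_MHZ = 240.0       # clock the ML models run at

# 240 MHz = 0.24 cycles per ns, so 200 ns -> 48 clock cycles
cycles_available = ML_BUDGET_NS * MODEL_CLOCK_MHZ / 1000.0

# 1000 ns / 25 ns -> 40 bunch crossings, matching "about 40" above
bunch_crossings_total = TOTAL_GT_LATENCY_NS / BUNCH_CROSSING_NS
```

At 240 MHz the ML conditions therefore have roughly 48 clock cycles in which to produce a decision, which is what makes aggressive model optimization necessary.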
To reduce the model resource footprint and latency, multiple optimizations have been applied: variable and synapse pruning, hyper-parameter quantization and precision tuning. The input pre-processing is done entirely in firmware and consists of multiple steps: interfacing to the already-computed high-level features (invariant mass and ∆R), normalizing the input distributions, de-serialization and input packaging. The pre-processing is designed to use the least amount of resources while maintaining a reasonable latency. A prototype firmware has been implemented and evaluated targeting a Serenity board equipped with a Virtex UltraScale+ FPGA. It contains various models that differ in the optimization techniques applied and in the targeted signal signature, for instance di-Higgs production with different final states. The firmware encompasses the DNNs, the BDTs, the entire GT firmware infrastructure (including the I/O logic, de-multiplexers and object distribution) and the interfaces necessary for the communication between the ML conditions and the GT framework. GT objects and high-level features are streamed at a frequency of 480 MHz, whereas the ML conditions accept a single vector of objects every 25 ns (40 MHz). The models will run at 240 MHz with an initiation interval (II) of 1.
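Two of the optimizations named above, synapse pruning and precision tuning, can be sketched in a few lines. This is an illustrative toy (the helper names `prune_synapses` and `quantize_fixed` are assumptions, not part of the hls4ml or Conifer APIs; the actual flow applies these steps through such tools during and after training):

```python
# Toy sketch of two optimizations mentioned above, in pure Python.
# Not the production flow: helper names here are invented for illustration.

def prune_synapses(weights, sparsity):
    """Magnitude-based pruning: zero the smallest-magnitude fraction of weights."""
    n_prune = int(len(weights) * sparsity)
    # Indices of the weights to keep, i.e. all but the n_prune smallest.
    keep = set(sorted(range(len(weights)), key=lambda i: abs(weights[i]))[n_prune:])
    return [w if i in keep else 0.0 for i, w in enumerate(weights)]

def quantize_fixed(x, total_bits=16, int_bits=6):
    """Round to a signed fixed-point grid, in the spirit of HLS ap_fixed<W, I>."""
    frac_bits = total_bits - int_bits
    scale = 1 << frac_bits
    lo = -(1 << (total_bits - 1))          # most negative representable code
    hi = (1 << (total_bits - 1)) - 1       # most positive representable code
    q = max(lo, min(hi, round(x * scale))) # quantize and saturate
    return q / scale
```

Pruned synapses cost no multipliers in firmware, and narrower fixed-point precision shrinks the remaining ones, which is how these steps buy back latency and resources in the GT.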

Authors

Dinyar Rabady (CERN), Gabriele Bortolato (Universita e INFN, Padova (IT)), Hannes Sakulin (CERN), Olga Zormpa (CERN)

Presentation materials

There are no materials yet.