



### Level-1 Tracking at CMS for the HL-LHC

Sara Fiorendi (University of Tennessee) on behalf of the CMS Collaboration

> Connecting the Dots 2023 October 10-13th Toulouse

### **HL-LHC opportunities and challenges**

The High-Luminosity LHC (HL-LHC) will provide the experiments with unprecedentedly high-statistics data

- extend discovery reach in searches for new physics & rare SM processes
- improve Higgs boson and SM precision measurements



This will happen in a **very challenging environment** for the experiments

- instantaneous luminosity of 5-7 x 10<sup>34</sup> cm<sup>-2</sup> s<sup>-1</sup>
- expected average pileup of 200, with a resulting increase in particle density
- radiation damage to the detector

Phase-II upgrades of the CMS detector were designed to maintain excellent detection ability, and even improve performance with respect to the current detector

• including tracking in the hardware trigger plays a crucial role





#### Trigger systems

- L1 trigger
  - Hardware-based, implemented in custom-built electronics
  - Muon & calorimeter information with reduced granularity, no tracking
- High Level Trigger (HLT)
  - Tracking information & full detector granularity
  - ATLAS uses a level-2 trigger & event filter, CMS uses a single-step HLT

### **CMS trigger upgrade**

- The entire trigger system will be replaced for HL-LHC
- Still based on a two-level trigger approach to reduce the 40 MHz collision rate down to 7.5 kHz
  - hardware Level 1 (L1) trigger
  - software High Level Trigger (HLT)



- Significant challenge in data processing
  - huge input data bandwidth (~63 Tb/s)
  - decision window of 12.5 µs (4 µs for track reconstruction)
- Tracking information will be used for the first time at L1!
  - On-detector filtering to reduce hit rate
  - Off-detector track finding algorithm implemented on Xilinx FPGAs
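The numbers quoted above translate into a few derived quantities that are handy when reasoning about the latency budget. This is a back-of-the-envelope sketch, using only the figures on the slide and assuming a bunch-crossing (BX) spacing equal to the inverse of the 40 MHz collision rate; the variable names are illustrative.

```python
# Figures quoted on the slide.
BX_RATE_HZ = 40e6             # collision (bunch-crossing) rate
OUTPUT_RATE_HZ = 7.5e3        # rate after the full L1 + HLT chain
INPUT_BANDWIDTH_BPS = 63e12   # ~63 Tb/s of input data
L1_LATENCY_S = 12.5e-6        # total L1 decision window
TRACKING_LATENCY_S = 4e-6     # share available for track reconstruction

# Derived quantities.
bx_period_s = 1.0 / BX_RATE_HZ                                # 25 ns between crossings
rejection_factor = BX_RATE_HZ / OUTPUT_RATE_HZ                # overall rate reduction
bits_per_bx = INPUT_BANDWIDTH_BPS / BX_RATE_HZ                # input data per crossing
l1_depth_bx = round(L1_LATENCY_S / bx_period_s)               # crossings in flight during the L1 decision
tracking_depth_bx = round(TRACKING_LATENCY_S / bx_period_s)   # crossings in flight during track finding

print(f"rate reduction: ~{rejection_factor:.0f}x")
print(f"input per crossing: ~{bits_per_bx / 1e6:.2f} Mb")
print(f"L1 window: {l1_depth_bx} BX, of which tracking: {tracking_depth_bx} BX")
```

In other words, every event must be buffered for 500 crossings while the L1 decision is formed, and the track finder has 160 of them.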

### CMS L1 trigger scheme @HL-LHC



### **Benefits of tracking @L1**

- Using tracking information in the hardware trigger makes it possible to
  - improve  $p_T$  resolution and particle identification  $\rightarrow$  lower trigger thresholds
  - identify primary interaction vertex, mitigating the pileup effects
  - associate objects to a common vertex
  - perform **Particle Flow** reconstruction already at L1 (also thanks to the fine calorimeter granularity)



### **Phase 2 Outer Tracker**



- Entire tracker detector will be replaced during LS3
  - increased granularity and pseudo-rapidity acceptance, radiation tolerance, and lower mass
- Outer Tracker (OT) will consist of 6 barrel layers and 2 x 5 disks
  - tilted geometry for better trigger performance and reduction in number of modules
  - PS and 2S modules provide p<sub>T</sub> discrimination in front-end electronics through hit correlations between two closely spaced sensors

### **Tracker input to the L1 trigger**

• Two kinds of modules (PS and 2S) will be used in different regions of the detector



- Correlated pairs of clusters consistent with a  $p_T > 2$  GeV track form a **stub** 
  - input to the track finding algorithm
  - cut at 2 GeV will allow a factor ~10 data reduction
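The stub selection can be illustrated with simple geometry: a track of transverse momentum p<sub>T</sub> in a field B has a radius of curvature ρ = p<sub>T</sub> / (0.3 B) (GeV, T, m), and at radius r it crosses a barrel module at an angle α to the radial direction with sin α = r / (2ρ). The sketch below assumes a uniform 3.8 T field and sensors perpendicular to the radial direction; the function names and the radius and sensor spacing in the usage note are hypothetical, not values from the talk.

```python
import math

B_FIELD_T = 3.8  # assumed uniform solenoid field


def curvature_radius_cm(pt_gev: float) -> float:
    """Radius of curvature in cm for a singly charged particle."""
    return 100.0 * pt_gev / (0.3 * B_FIELD_T)


def bend_mm(pt_gev: float, r_cm: float, sensor_gap_mm: float) -> float:
    """Hit displacement between the two sensors of a module at radius r_cm."""
    sin_alpha = r_cm / (2.0 * curvature_radius_cm(pt_gev))
    if sin_alpha >= 1.0:
        return math.inf  # the track curls up before reaching this radius
    return sensor_gap_mm * math.tan(math.asin(sin_alpha))


def passes_stub_cut(measured_bend_mm: float, r_cm: float,
                    sensor_gap_mm: float, pt_min_gev: float = 2.0) -> bool:
    """Accept a cluster pair if its bend is compatible with pT above threshold."""
    return abs(measured_bend_mm) <= bend_mm(pt_min_gev, r_cm, sensor_gap_mm)
```

For example, `passes_stub_cut(bend_mm(5.0, 60.0, 1.8), 60.0, 1.8)` accepts a 5 GeV track, while the same call with `bend_mm(1.0, 60.0, 1.8)` rejects a 1 GeV one.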



### **L1 tracking system overview**

- Extensive **parallel processing** to cope with high data rate and large combinatorics
  - takes advantage of natural detector segmentation (9 sectors in  $\phi$ )
    - further within-sector parallel processing dividing  $\phi$  into "virtual modules"
  - use of time-multiplexing (x18) to implement multiple identical
- Flexible and scalable architecture
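The combination of spatial and time parallelism can be sketched as a routing rule: each stub goes to a processing unit selected by its $\phi$ sector and by the bunch crossing modulo the time-multiplexing period. This is a minimal illustration, assuming uniform $\phi$ sectors and one processing unit per (sector, time slot) pair; the function names are hypothetical and overlaps between neighboring sectors are ignored.

```python
import math

N_PHI_SECTORS = 9      # detector segmentation in phi (from the slide)
TMUX_PERIOD = 18       # time-multiplexing factor (from the slide)
BX_PERIOD_NS = 25.0    # spacing of 40 MHz bunch crossings


def phi_sector(phi: float) -> int:
    """Sector index 0..8 for an azimuth in radians (uniform sectors assumed)."""
    two_pi = 2.0 * math.pi
    return int((phi % two_pi) / two_pi * N_PHI_SECTORS) % N_PHI_SECTORS


def time_slot(bx: int) -> int:
    """Round-robin time slot of a bunch crossing."""
    return bx % TMUX_PERIOD


def processor_for(phi: float, bx: int) -> tuple:
    """(sector, time slot) pair identifying the unit that receives the stub."""
    return phi_sector(phi), time_slot(bx)


n_processors = N_PHI_SECTORS * TMUX_PERIOD        # units under the one-per-pair assumption
time_per_event_ns = TMUX_PERIOD * BX_PERIOD_NS    # interval between events seen by one unit
```

Under these assumptions each unit receives a new event only every 450 ns instead of every 25 ns.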




### **Track finding algorithm (1)**

Road search algorithm based on tracklet seeds






- Track parameters initially estimated from tracklet + beamspot constraint
  - only combinations with p<sub>T</sub> > 2 GeV kept
- Tracklets projected to other layers & disks to search for matching stubs
  - stubs accepted only within predefined narrow search windows
  - propagation both inward and outward (inside-out & outside-in)
  - minimum number of stubs required
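The initial p<sub>T</sub> estimate of a tracklet follows from the circle through the beamspot and the two seed stubs: for stubs at $(r_1, \phi_1)$ and $(r_2, \phi_2)$ the curvature is $1/\rho = 2\sin\Delta\phi / \sqrt{r_1^2 + r_2^2 - 2 r_1 r_2 \cos\Delta\phi}$. The sketch below assumes a beamspot at the origin and a uniform 3.8 T field; the function names are hypothetical and this is floating-point Python, not the fixed-point firmware arithmetic.

```python
import math

B_FIELD_T = 3.8  # assumed uniform solenoid field


def tracklet_pt_gev(r1_cm: float, phi1: float, r2_cm: float, phi2: float) -> float:
    """pT of the circle through the origin and two stubs given in polar coordinates."""
    dphi = math.remainder(phi2 - phi1, 2.0 * math.pi)  # wrap into [-pi, pi]
    chord = math.sqrt(r1_cm**2 + r2_cm**2 - 2.0 * r1_cm * r2_cm * math.cos(dphi))
    inv_rho = 2.0 * abs(math.sin(dphi)) / chord        # curvature in 1/cm
    if inv_rho == 0.0:
        return math.inf                                # straight line: infinite pT
    rho_m = 0.01 / inv_rho
    return 0.3 * B_FIELD_T * rho_m


def accept_seed(r1_cm: float, phi1: float, r2_cm: float, phi2: float,
                pt_min_gev: float = 2.0) -> bool:
    """Keep only stub pairs compatible with a track above the pT threshold."""
    return tracklet_pt_gev(r1_cm, phi1, r2_cm, phi2) >= pt_min_gev
```

A stub pair from a 3 GeV track passes `accept_seed`, while a pair from a 1.5 GeV track is dropped before any projection is attempted.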



### **Track finding algorithm (2)**

Duplicate removal and fitting

#### 4. Removal of duplicate tracks

- pattern recognition produces multiple track candidates for each charged particle
  - redundant seeds ensure high efficiency, but lead to duplicate tracks

  - additional duplicates may originate from combinatorial stubs
- duplicate tracks are joined into a "merged" track candidate
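One possible way to picture the merging step is sketched below. The talk does not state the duplicate criterion, so this example assumes that two candidates are duplicates when they share at least `min_shared` stubs, and that the merged candidate is the union of their stubs; both the criterion and the function name are hypothetical.

```python
def merge_duplicates(candidates, min_shared=3):
    """Merge track candidates (sets of stub IDs) that share enough stubs."""
    merged = []
    for stubs in candidates:
        current = set(stubs)
        absorbed = True
        while absorbed:
            # Re-scan after every merge, since the grown candidate may now
            # overlap with a track it did not overlap with before.
            absorbed = False
            for other in merged:
                if len(current & other) >= min_shared:
                    current |= other
                    merged.remove(other)
                    absorbed = True
                    break
        merged.append(current)
    return merged


tracks = merge_duplicates([{1, 2, 3, 4}, {2, 3, 4, 5}, {10, 11, 12, 13}, {1, 2, 20, 21}])
print(sorted(sorted(t) for t in tracks))
```

Here the first two candidates collapse into one merged track, while the last one, sharing only two stubs, is kept separate.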

Figure (tracklet-based track finding): track seeds, called tracklets, are formed from pairs of stubs in neighboring layers.

#### Kalman Filter fitting

- iterative approach: starts with the tracklet parameters & uncertainties, then updates the track parameters
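The iterative nature of a Kalman Filter fit can be shown with a toy example. The sketch below is not the CMS fit: it assumes a two-parameter straight-line model in the r-z plane, $z = z_0 + t\,r$, a scalar measurement per stub, and no process noise. The seed values, resolution and function names are hypothetical.

```python
def kalman_update(state, cov, r, z_meas, sigma_z):
    """One Kalman update of (z0, t) with a stub measuring z at radius r."""
    z0, t = state
    (p00, p01), (p10, p11) = cov
    # With H = [1, r]: the row vector H P and the column vector P H^T.
    hp0, hp1 = p00 + r * p10, p01 + r * p11
    ph0, ph1 = p00 + r * p01, p10 + r * p11
    s = ph0 + r * ph1 + sigma_z**2          # residual variance H P H^T + sigma^2
    k0, k1 = ph0 / s, ph1 / s               # Kalman gain
    residual = z_meas - (z0 + t * r)
    new_state = (z0 + k0 * residual, t + k1 * residual)
    new_cov = ((p00 - k0 * hp0, p01 - k0 * hp1),
               (p10 - k1 * hp0, p11 - k1 * hp1))   # (I - K H) P
    return new_state, new_cov, residual**2 / s


def fit_rz(seed_state, seed_cov, stubs, sigma_z=0.1):
    """Add stubs (r, z) one at a time, refining the state and accumulating chi2."""
    state, cov, chi2 = seed_state, seed_cov, 0.0
    for r, z in stubs:
        state, cov, dchi2 = kalman_update(state, cov, r, z, sigma_z)
        chi2 += dchi2
    return state, chi2


stubs = [(r, 3.0 + 0.5 * r) for r in (25.0, 37.0, 52.0, 70.0)]
(z0_fit, t_fit), chi2 = fit_rz((0.0, 0.0), ((1e4, 0.0), (0.0, 1e4)), stubs)
print(f"z0 = {z0_fit:.3f}, t = {t_fit:.3f}")
```

Each update costs a fixed number of multiplications and one division, which is part of why this style of fit maps well onto pipelined hardware.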





### **Expected performance**



- Expected tracking performance estimated on simulated events
  - high efficiency across  $\eta$  and  $p_T$
  - precise z<sub>0</sub> resolution (~1mm in the barrel), necessary for vertex association

### **Track quality**

- An additional track quality module will be run after the Kalman Filter step to reduce number of tracks not coming from genuine charged particles
- An ML approach to classifying real/fake tracks outperforms a simple cut-based selection (\*)
  - features from reconstructed track parameters:  $\phi$ ,  $\eta$ ,  $z_0$ ,  $n_{stub}$ ,  $n_{misslayer}$ ,  $\chi^2_{bend}$ ,  $\chi^2_{rz}$ ,  $\chi^2_{r\phi}$
  - GBDT chosen over NN as less FPGA-resource hungry
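To illustrate why a GBDT is cheap to evaluate, the sketch below scores a track with a toy ensemble: each tree is a handful of threshold comparisons, and the final score is the sigmoid of the summed leaf values. The trees, thresholds and leaf values are invented for illustration and are not the trained CMS model; only the feature names come from the list above.

```python
import math

# Each tree is either a float leaf or a tuple (feature, threshold, left, right).
# All thresholds and leaf values below are hypothetical.
TOY_TREES = [
    ("n_stub", 4.5, -0.8, ("chi2_rphi", 10.0, 0.9, -0.2)),
    ("chi2_rz", 5.0, ("n_misslayer", 0.5, 0.6, 0.1), -0.7),
    ("chi2_bend", 3.0, 0.4, -0.5),
]


def eval_tree(node, features):
    """Walk one tree: go left if the feature is below the threshold."""
    while not isinstance(node, float):
        feature, threshold, left, right = node
        node = left if features[feature] < threshold else right
    return node


def track_quality(features, trees=TOY_TREES):
    """Probability-like score in (0, 1); higher means more likely a genuine track."""
    score = sum(eval_tree(tree, features) for tree in trees)
    return 1.0 / (1.0 + math.exp(-score))


good = {"n_stub": 6, "n_misslayer": 0, "chi2_bend": 0.5, "chi2_rz": 1.0, "chi2_rphi": 2.0}
fake = {"n_stub": 4, "n_misslayer": 2, "chi2_bend": 8.0, "chi2_rz": 20.0, "chi2_rphi": 50.0}
print(track_quality(good), track_quality(fake))
```

The comparisons need no multipliers, which is consistent with the absence of DSP usage reported for the GBDT in the backup table.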



### Hardware platforms

• Hardware for track-finding based on ATCA platform (standard for HL-LHC upgrade)



#### **APOLLO: track finding processing boards**

- Service Module provides infrastructure components
- Flexibility via pluggable Command Module: contains two large FPGAs, optical fiber interfaces & memories



2022 JINST 17 C04033




### **Firmware implementation**

- Implemented as alternated processing and memory modules
- Multiple copies of each module run in parallel
- Seeding & propagation steps written using Xilinx Vivado HLS
- Memory modules, Kalman Filter and top level written in VHDL
- Targeting 240 MHz FPGA clock
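The clock target can be related to the event rate with a short calculation. This sketch combines the 240 MHz clock with the 40 MHz collision rate, the x18 time-multiplexing and the 4 µs tracking window quoted elsewhere in the talk; treating the whole 4 µs as available to the firmware is an assumption, so the last number is an upper bound.

```python
FPGA_CLOCK_HZ = 240e6        # target firmware clock
BX_RATE_HZ = 40e6            # collision rate
TMUX_PERIOD = 18             # time-multiplexing factor
TRACKING_LATENCY_S = 4e-6    # window for track reconstruction

clocks_per_bx = round(FPGA_CLOCK_HZ / BX_RATE_HZ)                    # clock ticks per crossing
clocks_per_event = clocks_per_bx * TMUX_PERIOD                       # ticks before the next event arrives
latency_budget_clocks = round(TRACKING_LATENCY_S * FPGA_CLOCK_HZ)    # upper bound on pipeline depth

print(clocks_per_bx, clocks_per_event, latency_budget_clocks)
```

Each processing module therefore has on the order of a hundred clock ticks to handle one event before the next one arrives.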



### Narrow slice project

- End-to-end demonstration of the track finding chain on a narrow $\phi$ slice
  - based only on one (barrel) seed
  - does not include the duplicate removal step
- Demonstrated on Apollo board rev1





(Figure: narrow-slice firmware chain, with labels "Kalman Filter" and "VU7P".)

### Full barrel project

- Seeding & stub matching in barrel layers, ~2/3 of the full project
  - implemented in single VU13P FPGA
    - final project will use two VU13P
  - meeting timing requirements was challenging
    - exploited a machine-learning-based Vivado firmware implementation strategy
  - **floorplanning** to avoid signals crossing regions with dead silicon interconnections
  - using **combined modules** to reduce latency



• Currently working on integrating the full chain of modules for the entire detector

### Summary

- L1 track finding will be crucial @HL-LHC to maintain acceptable trigger rates while successfully pursuing CMS physics goals
- Main challenges related to the large combinatorics and latency
  - CMS will use a **unique detector design with p<sub>T</sub> modules** providing on-detector data filtering
  - extensive parallelisation being exploited for the off-detector track finding algorithm (on FPGAs)
- Current status:
  - reduced configuration firmware was successfully tested
  - **ongoing** work to integrate the **full chain** covering the entire detector on two FPGAs

### Backup

### **Combined modules**

• Moving towards combined modules  $\rightarrow$  fewer processing modules help in reducing the latency



### **Displaced tracking**

- Extended tracking is being studied in order to reconstruct trajectories that do not point to the primary vertex (PV)
- Changes with respect to the baseline tracking algorithm affect:
  - seeding step: triplets instead of doublets + origin
  - Kalman filter: 5-parameter fit instead of 4-parameter (adds $d_0$)
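The reason triplets are needed can be seen geometrically: without the origin as a third point, three stubs are required to define a circle, and the transverse impact parameter $d_0$ then follows from the distance between the beamline and that circle. The sketch below works in the transverse plane with the beamline at the origin; the sign convention for $d_0$ and the function names are assumptions.

```python
import math


def circle_through(p1, p2, p3):
    """Centre (cx, cy) and radius of the circle through three (x, y) points."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    d = 2.0 * (x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))
    if d == 0.0:
        raise ValueError("stubs are collinear: infinite radius")
    s1, s2, s3 = x1 * x1 + y1 * y1, x2 * x2 + y2 * y2, x3 * x3 + y3 * y3
    cx = (s1 * (y2 - y3) + s2 * (y3 - y1) + s3 * (y1 - y2)) / d
    cy = (s1 * (x3 - x2) + s2 * (x1 - x3) + s3 * (x2 - x1)) / d
    return cx, cy, math.hypot(x1 - cx, y1 - cy)


def triplet_d0(p1, p2, p3):
    """Signed distance of closest approach to the origin (negative if the origin is inside the circle)."""
    cx, cy, radius = circle_through(p1, p2, p3)
    return math.hypot(cx, cy) - radius
```

For a prompt track the circle passes through the origin and `triplet_d0` returns approximately zero; a displaced track gives a non-zero value that can seed the 5-parameter fit.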





### **Track quality**

• Resource usage for NN and GBDT

https://agenda.infn.it/event/28874/contributions/168841/attachments/93290/127232/ICHEP\_2022\_Poster.pdf

Performance and resource use for Xilinx VU9P FPGA [3,4]:

| Model | Python AUC | HLS AUC | Latency (clk) | LUT % | FF %  | DSP % |
|-------|------------|---------|---------------|-------|-------|-------|
| NN    | 0.985      | 0.982   | 8             | 0.104 | 0.029 | 0.292 |
| GBDT  | 0.986      | 0.981   | 3             | 0.140 | 0.027 | 0.0   |

 Performance on displaced tracks of the baseline GBDT, compared to a possible dedicated displaced GBDT

