# FPGA Implementation of the General Triplet Track Fit

Kadir Tastepe<sup>1</sup>, Sebastian Dittmeier<sup>1</sup>, Abhirikshma Nandi<sup>1</sup>, Christof Sauer<sup>1,2</sup>, André Schöning<sup>1</sup>

<sup>1</sup>Physikalisches Institut - Universität Heidelberg, <sup>2</sup>now at CERN

Contact: tastepe@physi.uni-heidelberg.de

# Online Track Reconstruction in High Rate Experiments

Reconstruction of charged particle tracks in high-energy physics experiments is a computationally intensive task. With the increasing number of simultaneous particle collisions at future high-luminosity colliders, such as HL-LHC and FCC-hh, tracking and online event reconstruction become even more challenging. This demands innovative algorithms running on accelerated hardware, e.g. FPGAs, GPUs and ASICs.

The General Triplet Track Fit [1] is a novel parallelisable track-fitting algorithm that extends the multiple-scattering (MS)-only fit [2] by including hit uncertainties, making it usable for high-energy collider experiments.

With their inherent **parallelism**, **power efficiency** and **reconfigurability**, FPGAs are becoming increasingly attractive as co-processors for large data centres, such as filter farms, to meet the challenges of increasing throughput and computational complexity. A preliminary FPGA implementation of the General Triplet Track Fit for future heterogeneous online farms has been developed, using high-level synthesis on AMD FPGAs with algorithmic optimisations to fully exploit the device's potential.



# General Triplet Track Fit (GTTF)



(1) Accounts for all detector-specific information, e.g., B field, material budget. The Local Fit is optional and can be used for filtering.

(2) Detector independent; computes track parameters from triplet parameters.

Local processing of Triplets involves computationally complex trigonometric functions.
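To illustrate why per-triplet processing is dominated by trigonometric and square-root operations, the following is a minimal sketch of a triplet geometry step. It is not the GTTF formulation from [1]: the `Hit` and `TripletGeometry` types, the function names, and the choice of quantities (transverse segment lengths and the kink angle between the two segments) are assumptions made for illustration, as is the choice of building triplets from consecutive layers.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical hit type: global Cartesian coordinates of one measurement.
struct Hit { float x, y, z; };

// Hypothetical per-triplet output: transverse segment lengths and kink angle.
struct TripletGeometry { float d01; float d12; float kink_phi; };

// Wrap an angle difference into (-pi, pi].
inline float wrap_angle(float a) {
    const float pi = 3.14159265358979f;
    while (a > pi) a -= 2.0f * pi;
    while (a <= -pi) a += 2.0f * pi;
    return a;
}

// Transverse-plane geometry of one triplet (z is not used in this sketch).
inline TripletGeometry triplet_geometry(const Hit& h0, const Hit& h1, const Hit& h2) {
    const float dx01 = h1.x - h0.x, dy01 = h1.y - h0.y;
    const float dx12 = h2.x - h1.x, dy12 = h2.y - h1.y;
    TripletGeometry g;
    g.d01 = std::sqrt(dx01 * dx01 + dy01 * dy01);
    g.d12 = std::sqrt(dx12 * dx12 + dy12 * dy12);
    // Change of azimuthal direction between the two segments.
    g.kink_phi = wrap_angle(std::atan2(dy12, dx12) - std::atan2(dy01, dx01));
    return g;
}

// Assumption: triplets are formed from consecutive layers, giving NLayers - 2 triplets.
template <std::size_t NLayers>
std::array<TripletGeometry, NLayers - 2>
process_triplets(const std::array<Hit, NLayers>& hits) {
    static_assert(NLayers >= 3, "a triplet needs at least three hits");
    std::array<TripletGeometry, NLayers - 2> out{};
    for (std::size_t i = 0; i + 2 < NLayers; ++i) {
        out[i] = triplet_geometry(hits[i], hits[i + 1], hits[i + 2]);
    }
    return out;
}
```

Each triplet is independent of the others, so the loop in `process_triplets` can in principle be executed in parallel. The `atan2` and `sqrt` calls are the kind of operation that is costly in hardware and therefore a natural target for simplification or approximation.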



## **Optimization for Throughput**

The CPU-based GTTF algorithm is translated into HLS C++. The design is optimised by simplifying complex calculations and breaking them into smaller, reusable steps, which reduces logic depth and improves throughput. Loop optimisations (pipelining, unrolling, flattening, etc.) are applied to maximise throughput.
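The loop optimisations named above can be sketched with Vitis HLS pragmas. The kernel below is a hypothetical stand-in (a per-track weighted mean over triplet values), not the GTTF kernel itself; the names `weighted_mean_kernel`, `kNTriplets` and `kMaxTracks` are assumptions. The outer loop requests pipelining with an initiation interval of 1, the inner loop is fully unrolled, and the arrays are partitioned so that all triplet values of a track can be read in the same cycle.

```cpp
constexpr int kNTriplets = 6;     // assumption: NLayers - 2 for 8 layers
constexpr int kMaxTracks = 1024;  // assumption: buffer depth chosen for illustration

// Hypothetical kernel: weighted mean of a per-triplet quantity for each track.
void weighted_mean_kernel(const float value[kMaxTracks][kNTriplets],
                          const float weight[kMaxTracks][kNTriplets],
                          float result[kMaxTracks],
                          int n_tracks) {
// Split the triplet dimension into registers so the unrolled loop
// is not limited by memory ports.
#pragma HLS ARRAY_PARTITION variable=value type=complete dim=2
#pragma HLS ARRAY_PARTITION variable=weight type=complete dim=2
    for (int t = 0; t < n_tracks; ++t) {
// Request that a new track enters the pipeline every clock cycle.
#pragma HLS PIPELINE II=1
        float num = 0.0f;
        float den = 0.0f;
        for (int i = 0; i < kNTriplets; ++i) {
// Replicate the loop body so all triplets are processed in parallel.
#pragma HLS UNROLL
            num += weight[t][i] * value[t][i];
            den += weight[t][i];
        }
        // Guard against an all-zero weight vector.
        result[t] = (den > 0.0f) ? num / den : 0.0f;
    }
}
```

A standard C++ compiler ignores the `#pragma HLS` lines, so the same source can be checked in software before synthesis. Whether the requested initiation interval is actually met depends on the synthesis result for the target device.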



AMD Vitis HLS Toolflow [4] allows rapid design iterations and resource-throughput optimisations.

[Figure: Vitis toolflow [5]: Source Code, Software Emulation, Hardware Emulation.]

[Figure: **GTTF on FPGA** (Freq: 300 MHz, nEvent: 1, Entries: 10000). Number of events for CPU(double) and HW-EMU(float); NLayers=8, p\_=10 GeV, B=2 T, θ=70.]

Calculated track parameters are consistent with CPU results.

## Summary

The GTTF is a candidate track fit for future experiments. A preliminary high-throughput FPGA implementation has been developed using AMD Vitis HLS. The synthesis results indicate that processing rates of approximately 5×10<sup>7</sup> local processings/s and 4×10<sup>7</sup> global fits/s can be achieved for 8 layers with float precision. Further performance improvements are possible by using approximations and trading off precision. Various kernel models are currently being tested to determine whether they increase performance. A resource-minimized version will also be implemented, allowing for optimisation based on application-specific needs.



## **Functional Verification**





The Global Fit throughput decreases with an increasing number of layers because of the matrix inversions. The Local Processing kernel scales with the number of triplets.



As the number of triplets increases, the covariance matrix also grows, making the fit more resource-intensive.
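The cost of the matrix inversion can be made concrete with a small fixed-size example. The sketch below inverts a symmetric positive-definite N×N matrix by Gauss-Jordan elimination without pivoting. This is a generic textbook method chosen for illustration, not necessarily the one used in the GTTF kernels, and tying the dimension N to the number of triplets is an assumption. The names `Matrix` and `invert_spd` are hypothetical.

```cpp
#include <array>
#include <cstddef>

template <std::size_t N>
using Matrix = std::array<std::array<float, N>, N>;

// Invert a symmetric positive-definite matrix by Gauss-Jordan elimination.
// No pivoting is done, which is acceptable for well-conditioned SPD input.
// Returns false if a pivot is too small to divide by safely.
template <std::size_t N>
bool invert_spd(const Matrix<N>& in, Matrix<N>& out) {
    Matrix<N> a = in;
    // Start from the identity; it is transformed into the inverse.
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = 0; j < N; ++j)
            out[i][j] = (i == j) ? 1.0f : 0.0f;

    for (std::size_t k = 0; k < N; ++k) {
        const float pivot = a[k][k];
        if (pivot < 1e-12f) return false;  // SPD pivots must be positive
        const float inv_pivot = 1.0f / pivot;
        // Normalise the pivot row.
        for (std::size_t j = 0; j < N; ++j) {
            a[k][j] *= inv_pivot;
            out[k][j] *= inv_pivot;
        }
        // Eliminate column k from every other row.
        for (std::size_t i = 0; i < N; ++i) {
            if (i == k) continue;
            const float f = a[i][k];
            for (std::size_t j = 0; j < N; ++j) {
                a[i][j] -= f * a[k][j];
                out[i][j] -= f * out[k][j];
            }
        }
    }
    return true;
}
```

The three nested loops give a number of multiply-add operations that grows roughly as N³, while the storage grows as N². Because N is a template parameter, all loop bounds are known at compile time, which is what allows an HLS tool to unroll or pipeline them, at the price of more logic for larger N.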

II: Initiation Interval

Clock Period: Duration of one clock cycle

NTriplets = NLayers - 2

Throughput: Number of executed kernels per unit time
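The definitions above are connected by a simple relation for a fully pipelined kernel: one kernel execution starts every II clock cycles, so the throughput is the clock frequency divided by II, or equivalently 1 / (II × clock period). The helper names below are hypothetical, and the II value in the usage example is illustrative, not a reported synthesis result.

```cpp
// Throughput of a fully pipelined kernel in executions per second:
// a new execution starts every `initiation_interval` clock cycles.
constexpr double throughput_per_second(double clock_hz, double initiation_interval) {
    return clock_hz / initiation_interval;
}

// Number of triplets for a given number of detector layers.
constexpr int n_triplets(int n_layers) {
    return n_layers - 2;
}
```

For example, `throughput_per_second(300e6, 4.0)` gives 7.5×10<sup>7</sup> executions per second for an assumed II of 4 cycles at a 300 MHz clock, and `n_triplets(8)` gives 6.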



#### GTTF on AMD Alveo U280



[1] Schöning, A. (2024). A General Track Fit based on Triplets. https://arxiv.org/abs/2406.05240

[2] Berger, N., Kozlinskiy, A., Kiehn, M., & Schöning, A. (2017). A new three-dimensional track fit with multiple scattering. https://doi.org/10.1016/j.nima.2016.11.012

[3] AMD. Alveo U280 Data Center Accelerator Card. https://docs.amd.com/r/en-US/ug1314-alveo-u280-reconfig-accel/Introduction

[4] AMD. (2023.1). Vitis Application Development Flow User Guide (UG1393).

[5] Mouser. Accelerating RFSoC Solutions with Vitis. https://www.mouser.hk/blog/accelerating-rfsoc-solutions-vitis