




In the coming years, the High-Luminosity LHC is planned to go through several technological upgrades. The expected pile-up, i.e. the number of simultaneous proton-proton collisions, is targeted to increase from 80 to 140.



This new physics environment will require upgrades of the CERN experiments. For example, the ATLAS (A Toroidal LHC ApparatuS) experiment will upgrade its detector and, consequently, its Trigger and Data Acquisition system, to reach an output data rate of 10 kHz, compared with the current 1.5 kHz.





ATLAS is studying the performance of the Hough Transform (HT) tracking algorithm for use with the future Inner Tracker (ITk) detector. To exploit it, the ATLAS environment requires the particle position to be defined in polar coordinates: the radius $r$ and the azimuthal angle $\phi$.
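As a small illustration of the coordinate requirement above, the sketch below converts a hit position to $(r, \phi)$. It assumes, purely for illustration, that hits arrive as Cartesian $(x, y)$ coordinates in the transverse plane. The function name `to_polar` is hypothetical and not part of any ATLAS software.

```python
import math


def to_polar(x, y):
    """Convert a transverse-plane position (x, y) to (r, phi).

    Assumption: x and y share the same length unit; phi is returned
    in radians in the interval (-pi, pi].
    """
    r = math.hypot(x, y)      # radial distance from the beam axis
    phi = math.atan2(y, x)    # azimuthal angle
    return r, phi
```

Using `atan2` rather than `atan(y / x)` keeps the correct quadrant and avoids a division by zero at $x = 0$.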



A flexible HT implemented on FPGA has been developed as a candidate for (raw) particle tracking for filtering. The FPGA boards used for the implementation tests are the Xilinx commercial demonstrators VC709 and VCU1525. Implementation studies on the Alveo U250 are ongoing.



## Flexible Hough Transform

Goal: a versatile HT that balances processing time against occupied resources through high parallelization and a high clock rate.

HT operations:

• Accumulator filling using the original HT formula, in parallel across all incoming bins.

• Extraction of candidate tracks from the accumulator by applying the original HT formula again across all event clusters in parallel.

[Figure: Event Filter tracking]

The developed architecture allows choosing which form of the HT formula to use, depending on the best performance achieved or on the requirements to meet:

• $qA/p_t = (\phi_0 - \phi) / r$;

• $\phi_0 = \phi + r \cdot qA/p_t$;

where $p_t$ is the transverse momentum of the particle and $\phi_0$ is the azimuthal angle of the track. Candidate tracks ("roads") are selected where a minimum number of lines drawn in the accumulator overlap, so that the required number of layers is reached.
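The two HT operations and the road selection described above can be sketched in software as follows. This is a minimal illustration, not the firmware design. The bin counts and the layer threshold are taken from the results quoted on this poster, while the $\phi_0$ window, the use of metres for $r$, the layer radii and all function names are assumptions made only so the example runs. The loop over $qA/p_t$ bins stands in for what the firmware would evaluate in parallel.

```python
import math

N_Q, N_PHI = 168, 48             # accumulator bins along qA/p_t and phi_0
Q_MIN, Q_MAX = -1.0572, 1.0572   # qA/p_t range (unit convention assumed: r in metres)
PHI_MIN, PHI_MAX = 0.3, 0.5      # assumed phi_0 window in rad
LAYER_THRESHOLD = 7              # layers needed to activate a road


def q_center(iq):
    """Centre of the iq-th qA/p_t bin."""
    return Q_MIN + (iq + 0.5) * (Q_MAX - Q_MIN) / N_Q


def fill_accumulator(clusters):
    """Fill the accumulator from clusters given as (layer, r, phi).

    Each bin stores a bitmask of the layers whose line crosses it.
    """
    acc = [[0] * N_PHI for _ in range(N_Q)]
    for layer, r, phi in clusters:
        for iq in range(N_Q):    # sequential here, parallel in hardware
            phi0 = phi + r * q_center(iq)          # phi_0 = phi + r * qA/p_t
            iphi = math.floor((phi0 - PHI_MIN) / (PHI_MAX - PHI_MIN) * N_PHI)
            if 0 <= iphi < N_PHI:
                acc[iq][iphi] |= 1 << layer
    return acc


def find_roads(acc, threshold=LAYER_THRESHOLD):
    """Return the (iq, iphi) bins crossed by lines from at least `threshold` layers."""
    return [(iq, iphi)
            for iq in range(N_Q)
            for iphi in range(N_PHI)
            if bin(acc[iq][iphi]).count("1") >= threshold]


def toy_track(iq, iphi, radii=(0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90, 1.00)):
    """Hypothetical helper: one cluster per layer for a track at the centre of bin (iq, iphi)."""
    q = q_center(iq)
    phi0 = PHI_MIN + (iphi + 0.5) * (PHI_MAX - PHI_MIN) / N_PHI
    return [(layer, r, phi0 - r * q) for layer, r in enumerate(radii)]


roads = find_roads(fill_accumulator(toy_track(100, 24)))
```

Storing a layer bitmask per bin, rather than a plain hit counter, means that several clusters on the same layer count only once towards the threshold, which matches the requirement of reaching a given number of layers.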



## ALGORITHM AND FIRMWARE PERFORMANCE

The versatility of the proposal allows the algorithm to be tuned according to need. The results shown here are for an accumulator of 168 bins along $qA/p_t$ and 48 bins along $\phi_0$, using 8 of the 13 available layers, with a layer threshold of 7 layers for road activation.

The table below shows the implementation results for the Alveo U250 FPGA card, including the occupied resources.

| Resource | Utilization | Available | Utilization % |
|----------|-------------|-----------|---------------|
| LUT      | 290023      | 1728000   | 16.78         |
| LUTRAM   | 11754       | 791040    | 1.49          |
| FF       | 649060      | 3456000   | 18.78         |
| BRAM     | 547         | 2688      | 20.35         |
| DSP      | 2184        | 12288     | 17.77         |
| IO       | 2           | 676       | 0.30          |
| BUFG     | 22          | 1344      | 1.64          |
| MMCM     | 5           | 16        | 31.25         |

The timing summary below shows that the timing constraints are met at a **frequency of 400 MHz**.

| Timing metric               | Value   |
|-----------------------------|---------|
| Worst Negative Slack (WNS)  | 0.01 ns |
| Total Negative Slack (TNS)  | 0 ns    |
| Number of Failing Endpoints | 0       |
| Total Number of Endpoints   | 1367922 |

The estimated processing times for the accumulator building and the cluster extraction depend on the average number of input clusters for the most populated layer and on the average number of roads extracted per event:

• accumulator building: 3000 ns;

• cluster extraction: 2700-4500 ns.

[Table: percentage of truth tracks found per region in $\eta$ (0.1 - 0.3 and 0.7 - 0.9) and per particle type; recovered entries: > 90 %, > 96.5 %, > 97 %, > 82 %.]

The table above summarizes the performance obtained from an internal software analysis using events compatible with the ATLAS simulation. Two barrel regions of the ITk detector have been studied: the regions in $\eta$ [0.1:0.3] and [0.7:0.9], for the same $\phi$ region [0.3:0.5] rad. These results cover the whole momentum range studied, corresponding to the $qA/p_t$ range [-1.0572 : 1.0572] ($A = 0.0003$ GeV mm$^{-1}$). Tracking of muons and pions was studied. The performance figures represent the percentage of truth tracks found.

## Conclusions and Future Steps

HEP experiments will face several challenges in the coming years. Fast tracking within short trigger windows is an important requirement for the future. The Bologna group is developing an FPGA implementation of the Hough Transform algorithm, proposed as a feasible and well-performing solution for fast tracking in HEP experiments. The firmware design has been defined and consolidated, and it is capable of running at 400 MHz with compact resource utilization. The preliminary physics analysis studies are promising and suggest that interesting performance can be achieved. The next steps in the coming months will focus on completing the performance study for the full detector.

IPRD 2023, Siena, Italy