# Testbeam results of the first real time embedded tracking system with artificial retina

A. Abba<sup>1,#</sup>, F. Caponio<sup>1,#</sup>, M. Citterio<sup>1</sup>, S. Coelli<sup>1</sup>, J. Fu<sup>1,2</sup>, A. Merli<sup>1,2</sup>, M. Monti<sup>1</sup>, <u>N. Neri<sup>1</sup></u>, M. Petruzzo<sup>1,2</sup>

for the INFN-RETINA collaboration

<sup>1</sup> INFN- Sezione di Milano, <sup>2</sup> Università di Milano

<sup>#</sup> now at Nuclear Instruments



VCI 2016, Vienna, Austria, 15-19 February 2016

## Outline

- Real time tracking
- Artificial retina algorithm and its implementation in hardware
- Detector prototype with embedded tracking capabilities
- Testbeam results
- Perspectives
- Summary

# Existing fast track finders

- Track pattern recognition without combinatorics
  - parallel matching of hits to precalculated track patterns, track parameters from linearised fit
  - use custom ASICs: Associative Memory (AM), based on content-addressable memory (CAM)
- First use in the CDF experiment: SVT, latency 10 µs and input rate 30 kHz
- The FTK device in ATLAS uses a similar concept: latency ~50 µs and input rate 100 kHz





## Real time tracking for HL-LHC

- Full exploitation of high luminosity LHC (HL-LHC) requires new detectors and trigger systems
- L1 trigger decisions based on tracking information are crucial:
  - reduce data rate to a sustainable level
  - maintain good efficiency and purity for signal events
- Real time tracking is extremely challenging at the LHC: 40 MHz throughput, data flow of order Tbit/s, latency ≤1 µs
- Necessary to find innovative solutions



# Artificial retina algorithm

- Basic algorithm for fast track finding
- L. Ristori, "An artificial retina for real-time track finding" [NIM A453 (2000) 425-429]



- Inspired by mechanism of visual receptive fields
  - massive parallelisation and analog response of track receptors (R)
  - pattern recognition and track fit by interpolation of R values

### Track identified by retina algorithm



## Artificial retina architecture

#### Three main blocks:

- Switch: delivers in parallel the hits from the detectors to only appropriate cellular units
- Engine: block of cellular units for parallel calculation of the weights
- Track fit: interpolation of adjacent cell weights for track parameter determination

#### Main differences with AM approach:

- only relevant data reach the processing units (engines); data processing already starts in the switch while data are being transmitted
- the retina algorithm provides an analog response, in contrast to the AM "yes/no" pattern matching





# Retina INFN project

- INFN-Retina R&D project started in 2015. Milano and Pisa groups involved
- Develop hardware prototype of a real time tracking device for intensive tracking applications (1-100 Giga tracks/sec), *e.g.* HL-LHC experiments
- Main deliverables:
  - Real time tracking detector prototype for test beam (main subject of this talk)
  - Fast track finding system compatible with large DAQ framework for test with simulated data at 40 MHz event rate (next step)



## Detector prototype with embedded tracking capabilities



## Real time tracking prototype



- Practical demonstrator: 8-layer tracking prototype. Single-sided silicon strip detectors, 183 µm pitch
- Custom DAQ board: Retina architecture implemented in a latest-generation FPGA
- Test full tracking system chain using 180 GeV/c proton beam at CERN SPS
- Device response reproduced with high level and low level simulations and studied using data

INFN

## Data acquisition system



- 4 Beetle chips for each detector
- Readout rate 300 kHz (1x mode). [Max rate 1.1 MHz (4x mode)]
- Digitisation with a multichannel 12-bit ADC and zero suppression (threshold comparator)
- Data output to disk using fast USB3 port
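The zero-suppression step can be illustrated with a minimal sketch. It assumes a per-strip pedestal is subtracted before the threshold comparison; the pedestal values, the threshold and the function name `zero_suppress` are hypothetical and not taken from the firmware.

```python
from typing import List, Tuple


def zero_suppress(adc: List[int], pedestal: List[int], threshold: int) -> List[Tuple[int, int]]:
    """Keep only strips whose pedestal-subtracted ADC value exceeds the threshold.

    Returns a list of (strip number, signal) pairs. Pedestal subtraction is an
    assumption of this sketch; the slide only mentions a threshold comparator.
    """
    hits = []
    for strip, (raw, ped) in enumerate(zip(adc, pedestal)):
        signal = raw - ped
        if signal > threshold:
            hits.append((strip, signal))
    return hits


# Illustrative 12-bit ADC samples for 8 strips (made-up numbers).
adc = [512, 515, 510, 640, 700, 513, 511, 509]
pedestal = [510] * 8
hits = zero_suppress(adc, pedestal, threshold=20)
print(hits)
```

Only the two strips carrying the cluster survive, so the data volume sent downstream scales with occupancy rather than with the number of channels.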

## Mamba board

- Reads out 8 detectors in 1x mode (300 kHz) or 2 detectors in 4x mode (1.1 MHz)



## Artificial retina architecture




### Switch network: a 4x16 way dispatcher



32 engines per output port: 32 x 16 = 512 engines

- Each box is a programmable two-way sorter
- Each input can be delivered to the left output port, the right output port, or both, according to LUT information
- The two-way sorter also acts as a memory buffer in case of a traffic jam: the input stream can be held
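The behaviour of a single two-way sorter can be sketched in software as follows. This is only an illustration of the routing rule described above: the class name `TwoWaySorter`, the routing codes, the buffer depth and the ready flags are assumptions, not the actual firmware interface.

```python
from collections import deque

LEFT, RIGHT, BOTH = "L", "R", "LR"  # hypothetical routing codes stored in the LUT


class TwoWaySorter:
    """Software model of one programmable two-way sorter with an input buffer."""

    def __init__(self, lut, depth=4):
        self.lut = lut          # maps a hit group to LEFT, RIGHT or BOTH
        self.depth = depth      # assumed buffer depth
        self.buffer = deque()

    def push(self, group, payload):
        """Accept a hit; return False when the buffer is full (input stream held)."""
        if len(self.buffer) >= self.depth:
            return False
        self.buffer.append((group, payload))
        return True

    def pop(self, left_ready=True, right_ready=True):
        """Forward the oldest hit to the output ports selected by the LUT.

        Returns (left output, right output). If a required port is busy, the
        hit stays in the buffer and (None, None) is returned.
        """
        if not self.buffer:
            return None, None
        group, payload = self.buffer[0]
        route = self.lut[group]
        to_left, to_right = "L" in route, "R" in route
        if (to_left and not left_ready) or (to_right and not right_ready):
            return None, None
        self.buffer.popleft()
        return (payload if to_left else None, payload if to_right else None)


sorter = TwoWaySorter(lut={0: LEFT, 1: RIGHT, 2: BOTH})
sorter.push(2, "hit A")
both = sorter.pop()                     # delivered to both ports
sorter.push(0, "hit B")
held = sorter.pop(left_ready=False)     # left port busy: hit is held
released = sorter.pop()                 # left port free again
```

Chaining such sorters in a tree gives a dispatcher in which each hit reaches only the engines that the LUTs select for it.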





# Cellular engine



Clocked pipeline divided in 4 stages:

1) $s_i$ = hit-receptor distance
2) $|s_i|$ = absolute value
3) $R_i = \exp(-s_i^2/2\sigma^2)$, tabulated in a 10-bit (in) x 16-bit (out) LUT
4) $R = \sum_i R_i$, sum of weights

- 8 accumulators for hits arriving at different times (useful at high event rates)
- Engines output values after the EndEvent signal has arrived at the accumulator
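The four pipeline stages can be mimicked in a few lines of Python. This is a sketch under stated assumptions: distances are expressed in integer LUT units, out-of-range distances saturate at the last LUT entry, and the names `build_lut` and `engine_weight` are hypothetical.

```python
import math

LUT_IN_BITS = 10    # address width of the Gaussian look-up table
LUT_OUT_BITS = 16   # width of the tabulated weight


def build_lut(sigma_lsb):
    """Tabulate exp(-s^2 / 2 sigma^2) for every 10-bit |s| as a 16-bit integer."""
    full_scale = (1 << LUT_OUT_BITS) - 1
    return [
        round(full_scale * math.exp(-(s * s) / (2.0 * sigma_lsb ** 2)))
        for s in range(1 << LUT_IN_BITS)
    ]


def engine_weight(hits, receptors, lut):
    """Accumulate the weight of one cell from one hit per layer."""
    total = 0
    for hit, receptor in zip(hits, receptors):
        s = hit - receptor                # stage 1: hit-receptor distance
        a = abs(s)                        # stage 2: absolute value
        a = min(a, len(lut) - 1)          # assumed saturation at the LUT edge
        total += lut[a]                   # stages 3 and 4: LUT look-up and sum
    return total


lut = build_lut(sigma_lsb=50)
receptors = [100, 120, 140, 160, 180, 200, 220, 240]   # made-up receptor positions
on_track = engine_weight(receptors, receptors, lut)
off_track = engine_weight([r + 80 for r in receptors], receptors, lut)
print(on_track, off_track)
```

A track passing exactly through the receptors of a cell gives the maximum weight, while a displaced track gives a smaller but non-zero analog response, which is what the later interpolation step exploits.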


## Track Fitter



- Identify in parallel local maximum weight above threshold and send relevant data to interpolation unit
- Data bandwidth now reduced by at least a factor 8
- Fan In: 32 engines connected to 1 interpolation unit
- Parabolic interpolation for extracting track parameters

$$x = \frac{\Delta}{2} \frac{R_{+} - R_{-}}{2R_{0} - R_{-} - R_{+}}$$
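As a worked check of the parabolic interpolation, the sketch below evaluates the formula for three samples of a known parabola. It assumes that $R_0$ is the weight of the local-maximum cell, $R_\pm$ are the weights of its two neighbours, $\Delta$ is the grid step and $x$ is measured from the centre of the maximum cell; the function name is hypothetical.

```python
def parabolic_offset(r_minus, r_zero, r_plus, step):
    """Vertex position of the parabola through three equally spaced samples.

    The result is the offset from the central cell, in the same units as `step`.
    """
    denom = 2.0 * r_zero - r_minus - r_plus
    if denom <= 0:
        raise ValueError("central cell is not a strict local maximum")
    return 0.5 * step * (r_plus - r_minus) / denom


def parabola(x):
    """Test response with its maximum at x = 0.3 (made-up example)."""
    return 10.0 - (x - 0.3) ** 2


offset = parabolic_offset(parabola(-1.0), parabola(0.0), parabola(1.0), step=1.0)
print(offset)
```

For an exactly parabolic response the formula recovers the true maximum, so the track parameter is obtained with a precision much finer than the grid step.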

### Resources and latency



# Simulation results

- Response simulated with ModelSim using single-track events
- Residual distributions of the $x_{-}$, $x_{+}$ track parameters: retina vs. generated tracks






### Testbeam results



## Telescope on beam at SPS

- Telescope tested on 180 GeV/c proton beam
  - Rotation angles with respect to the beam axis: 0, 2, 4, 8, 16, 20 degrees

#### Telescope aligned with beam axis



#### Telescope rotated with respect to beam axis





### Data results



Nicola Neri


### Retina response vs track angle



Figure panels: retina response for track angles of 2, 4, 16 and 20 degrees (horizontal axis in cm).

### Data/simulation comparison

- Track parameter distribution determined by artificial retina algorithm
  - testbeam data processed by the Mamba board (retina) and verified using the simulated artificial retina response (MC retina)



### Track residuals: offline - retina

It works! The offline - retina track parameter residuals are peaked at zero



### Track residuals: offline - retina

 Detector alignment constants can be fed into the firmware at run time. Sizeable improvement in residuals




### Perspectives: 4D fast track finding

- R&D on ultrafast silicon pixel detectors aims to achieve 10-20 ps time resolution [JINST 9 (2014) C02001]
- Hit time information can be used to further suppress noise hits

$$W_{ij} = \sum_{k} \exp\left(-\frac{s_{ijk}^2}{2\sigma^2}\right) \exp\left(-\frac{t_{ijk}^2}{2\sigma_t^2}\right) \qquad t_{ijk} = (t_{k,meas} - t_{ijk,exp})$$
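The effect of the time factor in $W_{ij}$ can be seen in a small numerical sketch. It evaluates the weight of one cell for a list of hits given as (spatial distance, time residual) pairs; the numerical values of $\sigma$, $\sigma_t$ and of the hits are made up for illustration, and the function name `weight_4d` is hypothetical.

```python
import math


def weight_4d(hits, sigma, sigma_t):
    """Cell weight with spatial and time Gaussian factors.

    `hits` is an iterable of (s, t): s is the hit-receptor distance and
    t = t_meas - t_exp is the time residual, as in the formula above.
    """
    return sum(
        math.exp(-(s * s) / (2.0 * sigma ** 2)) * math.exp(-(t * t) / (2.0 * sigma_t ** 2))
        for s, t in hits
    )


sigma = 1.0       # spatial width (arbitrary units, assumed)
sigma_t = 10.0    # time width in ps (assumed)

in_time = weight_4d([(0.0, 0.0)], sigma, sigma_t)
out_of_time = weight_4d([(0.0, 5.0 * sigma_t)], sigma, sigma_t)
print(in_time, out_of_time)
```

A noise hit that is spatially compatible with the receptor but five $\sigma_t$ away in time contributes a negligible weight, which is how the time information suppresses fake tracks.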


#### Retina with spatial information







### Using precise time information of the hit

- 4D fast track finding system [arXiv:1512.09008]
- Hit time resolution: 100 ps

Retina with spatial information and time information. Figure labels: noise hit out of time, fake track, real track.

### Using precise time information of the hit

- Determine the time of the track [arXiv:1512.09008]
- Hit time resolution: 10 ps

Retina with spatial information and time information. Figure labels: noise hit out of time, fake track, real track.





### Evaluation of the time of the track

The time of the track is determined by interpolating the retina response at 3 pre-computed times: T<sub>0</sub>-ΔT, T<sub>0</sub>, T<sub>0</sub>+ΔT, where T<sub>0</sub> is the nominal bunch crossing time and ΔT is tuned for optimal response.



- Determination of the time of the track with few ps precision is possible
- Resolution scales as  $\frac{\sigma_t}{\sqrt{N_{hit}}}$  where  $\sigma_t$  is the hit time resolution
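The $\sigma_t/\sqrt{N_{hit}}$ scaling can be checked with a toy Monte Carlo. The sketch below is not the retina interpolation itself: it uses the plain average of the hit times as a stand-in estimator, and the values $\sigma_t$ = 10 ps and $N_{hit}$ = 8 are assumptions chosen for illustration.

```python
import math
import random
import statistics


def track_time_resolution(sigma_t_ps, n_hit):
    """Expected track time resolution from the sigma_t / sqrt(N_hit) scaling."""
    return sigma_t_ps / math.sqrt(n_hit)


def simulated_resolution(sigma_t_ps, n_hit, n_tracks=20000, seed=1):
    """Spread of the mean hit time over many toy tracks (stand-in estimator)."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_tracks):
        times = [rng.gauss(0.0, sigma_t_ps) for _ in range(n_hit)]
        estimates.append(sum(times) / n_hit)
    return statistics.pstdev(estimates)


expected = track_time_resolution(10.0, 8)
simulated = simulated_resolution(10.0, 8)
print(f"expected {expected:.2f} ps, toy simulation {simulated:.2f} ps")
```

With these assumed inputs the expected resolution is about 3.5 ps, in line with the statement that a precision of a few ps is possible.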

## Summary

- First real-time tracking system based on the artificial retina algorithm tested successfully on beam at the CERN SPS (180 GeV/c protons, track rate about 300 kHz)
- Online retina track parameters in good agreement with offline results and with simulated retina response
- 4D fast track finding system using precise space and time information of the hit. Possibility for fast timing detectors
- Next steps:
  - build a system compatible with large DAQ framework for test with simulated data at 40 MHz and hundreds of tracks per event

# Backup slides



### Feasibility study for LHC experiments

- The Retina architecture is modular and its parallel processing units are scalable. With adequate FPGA resources it can cope with high particle rates and large detectors, e.g. the 40 MHz event rate and 300 tracks/event of the LHC.
- Delivers 3D tracks with offline-like quality at 40 MHz with <1 µs latency
- Case study for the LHCb upgrade simulated and documented in LHCb-PUB-2014-026 and JINST 9 C09001 (2014). Affordable resources and cost (50,000 cells ~ 50 FPGAs)
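The numbers quoted above can be combined in a short arithmetic sketch. It only multiplies and divides the figures given on the slide; the variable names are hypothetical and the 50 FPGAs are an approximate figure in the source.

```python
event_rate_hz = 40e6        # 40 MHz event rate
tracks_per_event = 300      # tracks per event quoted for the LHC
cells = 50_000              # retina cells in the LHCb upgrade case study
fpgas = 50                  # approximate number of FPGAs

track_rate = event_rate_hz * tracks_per_event   # tracks per second
cells_per_fpga = cells // fpgas

print(f"track rate: {track_rate / 1e9:.0f} Giga tracks/s")
print(f"cells per FPGA: {cells_per_fpga}")
```

The resulting 12 Giga tracks/s lies inside the 1-100 Giga tracks/sec range targeted by the INFN-Retina project, with about 1000 cells per FPGA.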

Application in forward spectrometer experiment



- 50 mrad acceptance
  - O(100) particles/event
- 8 pixel layers
- 2 silicon strip layers
- ~0.05 T magnetic field
- Pileup: ~8 pp events

### Track parameters for prototype



## Track receptors








#### Receptor response



$\sigma$ = width of the receptor response field, with $\sigma \simeq \Delta$, the grid step.

$\sigma$ is much larger than the obtainable resolution on track parameters and has to be tuned.



## Retina algorithm




## Telescope module

- Single-sided silicon sensors:
  - OB2 STM *p*-in-*n* sensor, 10 cm x 10 cm active area
  - 512 strips, 183  $\mu$ m pitch, 500  $\mu$ m thickness



## Event display

### Event display for 1 track event: ADC vs strip number




### Testbeam crew




## Retina architecture v2



Xilinx Kintex7

### Retina architecture simulation

- ModelSim results on Xilinx Kintex7 FPGA
- Switch+Engine simulated successfully up to 40 MHz input track rate


Figure: ModelSim waveform of the `tb_top_switch` testbench, showing the clock (`clk`), the switch input data with their `in1_group`, `in2_group` and data-valid (`in2_dv`, `in3_dv`) signals, and the `Inst_switch_to_engine` outputs (`value_out_*`, `dv_out_*`). Time axis from 80000 ps to 220000 ps, cursor at 213445 ps.

#### Latency of Retina response

- Switch: 14 t.u.
- Engines: 12 t.u.
- Track fitter: 30 t.u.
- Clock frequency 400 MHz, i.e. 1 t.u. = 2.5 ns
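The latency figures above translate into nanoseconds as in the sketch below. It assumes that the three blocks are traversed in series so that their latencies simply add; that assumption and the variable names are not taken from the slides.

```python
CLOCK_HZ = 400e6  # clock frequency quoted above

# Latency of each block in time units (clock cycles).
LATENCY_TU = {"switch": 14, "engines": 12, "track_fitter": 30}

tu_ns = 1e9 / CLOCK_HZ                      # duration of one time unit in ns
total_tu = sum(LATENCY_TU.values())         # assumed serial traversal
total_ns = total_tu * tu_ns

for block, tu in LATENCY_TU.items():
    print(f"{block:>12}: {tu:2d} t.u. = {tu * tu_ns:5.1f} ns")
print(f"{'total':>12}: {total_tu:2d} t.u. = {total_ns:5.1f} ns")
```

Under this assumption the total is 56 t.u., i.e. 140 ns, well below the 1 µs latency budget discussed for HL-LHC applications.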





## Retina response to tracks

The retina algorithm implemented in hardware is working properly

Weight distribution from high level C++ simulation for retina cellular units

Weight distribution from ModelSim (Xilinx Kintex7 FPGA)




# Delivering the data

- Engines receive data, through the switch, from all the tracking detectors
- Divide the grid in 4 regions corresponding to the number of available FPGAs (4 Xilinx Kintex7) for the processing engines.



$$x_{+} = -x_{-}\frac{z - z_{+}}{z_{-}} + x$$

Engines with non-negligible weights belong to different regions of the grid

$$|x - x_{+} - x_{-} \frac{z - z_{+}}{z_{-}}| < 2\sigma$$

- Deliver the data to the engines (in different FPGAs) using a *full mesh switch*
- z determines the slope and x the intercept with x<sub>+</sub> axis of a cluster in the (x<sub>+</sub>, x<sub>-</sub>) plane
- Data path is determined by the cluster coordinates (x,z) using 8 bit information: 5 bit for x and 3 bit for z
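The delivery rule above can be sketched in two small functions: one packs the cluster coordinates into the 8-bit routing key, the other applies the $2\sigma$ criterion that decides whether an engine at $(x_+, x_-)$ should receive a cluster at $(x, z)$. The bit ordering (z in the upper 3 bits), the binning of x and z, and the function names are assumptions of this sketch.

```python
def pack_group(x_bin, z_bin):
    """Build the 8-bit routing key from a 5-bit x bin and a 3-bit z bin.

    Placing z in the upper bits is an assumption; the slides only give the widths.
    """
    if not 0 <= x_bin < 32:
        raise ValueError("x_bin must fit in 5 bits")
    if not 0 <= z_bin < 8:
        raise ValueError("z_bin must fit in 3 bits")
    return (z_bin << 5) | x_bin


def unpack_group(group):
    """Inverse of pack_group: return (x_bin, z_bin)."""
    return group & 0x1F, group >> 5


def engine_accepts(x, z, x_plus, x_minus, z_plus, z_minus, sigma):
    """Selection criterion |x - x_+ - x_- (z - z_+)/z_-| < 2 sigma from above."""
    return abs(x - x_plus - x_minus * (z - z_plus) / z_minus) < 2.0 * sigma


key = pack_group(x_bin=10, z_bin=1)
print(f"{key:08b}")

# Made-up geometry: a cluster on the z = z_+ plane, compared with two engines.
near = engine_accepts(x=1.0, z=0.0, x_plus=1.2, x_minus=0.5, z_plus=0.0, z_minus=10.0, sigma=0.5)
far = engine_accepts(x=1.0, z=0.0, x_plus=5.0, x_minus=0.5, z_plus=0.0, z_minus=10.0, sigma=0.5)
```

In the hardware the outcome of such a test would be precomputed and stored in the switch LUTs, so that the 8-bit key alone selects the data path.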

## Switch modules

1<sup>st</sup> switch 32x4

- 16 analog inputs from each sensor
- 4 output ports: 1 output to the 2<sup>nd</sup> switch level and 3 outputs to the other DAQ boards

2<sup>nd</sup> switch 4x16

- 4 input ports: 1 from each DAQ board
- 16 output ports, each connected to 16 engines in parallel



### 2-way sorter

