# 4D fast tracking for LHCb U2

M. Citterio, L. Frontini, P. Gandini, V. Liberali, **N. Neri**, M. Petruzzo, S. Riboldi, A. Stabile



UNIVERSITÀ DEGLI STUDI DI MILANO



Università and INFN Milano

Milano, 16 March 2022




### Outline

- LHCb upgrade II
- Timespot R&D
- 4D fast tracking algorithm
- Performance
- Implementation in FPGA
- R&D and plans for the future
- Summary



#### LHCb Upgrade II

- Achieve ultimate precision on heavy flavour physics with LHCb U2
- Instantaneous luminosity 1.5×10<sup>34</sup> cm<sup>-2</sup>s<sup>-1</sup> in Run 5 (×7.5 increase wrt U1)
- Data sample 300 fb<sup>-1</sup> by the end of Run 6
- LHCb U2 detector installation in LS4 (2033-2035)





# LHCb VELO U2 studies

- Required time resolution for hits in the VELO is driven by tracking and primary-vertex (PV) reconstruction
- Time spread among PVs is about 180 ps; timing information helps reduce hit combinations in track reconstruction
- Nominal requirement for single-hit time resolution: 50 ps




### R&D projects at INFN



**TIMESPOT R&D project** (TIME & SPace real-time Operating Tracker)

- Three-year project, from 2018, funded by INFN
- Development of a silicon and diamond 3D tracker with fast timing: demonstrated 20 ps hit resolution at a testbeam at PSI
- Construction of a demonstrator integrating sensors, front-end electronics, real-time processors



**INSTANT project** (Imaging iN Space-Time ANd Tracking), phase 1 funded by the ATTRACT programme

- Development of a compact imaging and particle-tracking device
- Application in HEP, medical imaging, fast neutron source monitoring



#### Nicola Neri


#### 4D Fast Tracking for VELO U2


#### Timespot sensors

- 3D sensors: 55 µm × 55 µm pixel and 150 µm thickness
- Short drift distance of charge carriers: excellent radiation hardness and fast signals
- 3D-trench geometry for uniform electric field
- 20 ps time resolution measured on charged pion beam at PSI

References: NIM A 981 (2020) 164491; L. Anderlini *et al.*, JINST 15 (2020) P09029; JINST 16 (2021) P09028



[Figure: time-difference plot, axis $t_{\mathrm{Si}} - \langle t_{\mathrm{MCP\text{-}PMT}} \rangle$ [ns]. Source: Nuclear Inst. and Methods in Physics Research, A 981 (2020) 164491]

# Timespot ASIC

- Timespot1 ASIC: 28-nm CMOS technology, 32×32 pixel matrix, 55 µm pitch
- First standalone ASIC tests are very encouraging: TDC average resolution 23 ps, analog front-end average resolution 43 ps. Tests with the sensor and particle-generated signals are ongoing
- Possibility to improve performance with minor corrections to the design



#### arXiv:2201.13138, TWEPP2021

Timespot1 ASIC 2.6 mm × 2.3 mm



Timespot1 PCB board 8 cm × 12 cm




### 4D fast tracking studies

- Objectives:
  - Fast tracking at the HL-LHC 40 MHz event rate
  - Simulate an LHCb VELO sector
  - FPGA processing: clustering, parallel track reconstruction
  - Performance studies: latency, FPGA usage, maximum event rate, tracking efficiency, resolution
  - Optimisation of detector geometry



#### Fast tracking in FPGA for U2



- Both tracks and collections of hits can be provided to the HLT
- Reduce HLT data processing from the early stages, e.g. analyse only one interesting pp collision per event


# 4D fast tracking

- Parallel tracking algorithm implemented in FPGA
- Highly parallel and pipelined architecture
- Track identified from cluster of stubs on reference plane





# 4D fast tracking

- Detector with embedded 4D tracking capabilities
- Stub based approach for track reconstruction: no assumption on particle origin





- Stubs with **timing**
  - particle velocity compatible with the speed of light
  - timing used to reduce hit combinatorics and suppress fake stubs



#### Stub based fast tracking approach



Stub velocity

$$v_{21} = \frac{|\vec{x}_2 - \vec{x}_1|}{t_2 - t_1}$$

is used to filter stub candidates (CERN-LHCb-PUB-026).
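The velocity cut can be illustrated with a minimal Python sketch. The hit layout `(x, y, z, t)` in mm and ns, the function names and the relative tolerance `rel_tol` are assumptions made for illustration only; the slides do not specify the actual selection window.

```python
import math

C_MM_PER_NS = 299.792458  # speed of light in mm/ns


def stub_velocity(hit1, hit2):
    """Return v21 = |x2 - x1| / (t2 - t1) for two hits given as (x, y, z, t)."""
    x1, y1, z1, t1 = hit1
    x2, y2, z2, t2 = hit2
    dt = t2 - t1
    if dt == 0.0:
        return math.inf  # simultaneous hits cannot come from one particle
    return math.dist((x1, y1, z1), (x2, y2, z2)) / dt


def is_stub_candidate(hit1, hit2, rel_tol=0.3):
    """Keep the pair if its velocity is within rel_tol of c (rel_tol is hypothetical)."""
    v = stub_velocity(hit1, hit2)
    return abs(v - C_MM_PER_NS) <= rel_tol * C_MM_PER_NS


# Two hits 25 mm apart: one in time with a light-speed particle, one 200 ps late.
first = (0.0, 0.0, 0.0, 0.0)
in_time = (0.0, 0.0, 25.0, 25.0 / C_MM_PER_NS)
late = (0.0, 0.0, 25.0, 25.0 / C_MM_PER_NS + 0.2)
print(is_stub_candidate(first, in_time), is_stub_candidate(first, late))
```

A pair with reversed time ordering gives a negative velocity and is rejected by the same cut.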

- Stubs are projected onto the reference plane $z_+$, and cellular units at $(x_+, y_+)$ evaluate a Gaussian response according to their distance from the stub projections
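As a sketch of the projection step, assuming straight-line tracks (an assumption of this example, not a statement from the slides), a stub made of two hits can be extrapolated linearly to the plane $z = z_+$. The function name and the `(x, y, z)` hit layout are hypothetical.

```python
def project_stub(hit1, hit2, z_plus):
    """Extrapolate the line through two hits (x, y, z) to the plane z = z_plus."""
    x1, y1, z1 = hit1
    x2, y2, z2 = hit2
    if z2 == z1:
        raise ValueError("hits must lie on different z planes")
    s = (z_plus - z1) / (z2 - z1)  # fractional step along the stub direction
    return x1 + s * (x2 - x1), y1 + s * (y2 - y1)


# A stub from (0, 0, 0) to (1, 2, 10), projected to z_plus = 20
print(project_stub((0.0, 0.0, 0.0), (1.0, 2.0, 10.0), 20.0))
```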

#### Stub based fast tracking approach



Gaussian response to a single stub, evaluated by the engines:

$$W_{ijk} = N_{ijk} \cdot \exp\left(-\frac{s_{ijk}^2}{2\sigma^2}\right), \quad \text{with } N_{ijk} = \begin{cases} 1 & |s_{ijk}| \leq 2\Delta \\ 0 & \text{otherwise} \end{cases}$$

where $s_{ijk}^2 = (x_{k+} - x_{i+})^2 + (y_{k+} - y_{j+})^2$ is the squared distance, and $\Delta$, $\sigma$ are to be adjusted for optimal response.

Weight function:

$$W_{ij} = \frac{1}{N_{ij}} \sum_{k} W_{ijk}, \quad \text{where } N_{ij} = \sum_{k} N_{ijk}$$

If $N_{ij} > thr$, a track candidate is identified.
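The response of one cellular unit can be sketched in Python as follows. This is a software illustration of the formulas only, not the FPGA engine; the function name, argument layout and the numerical values in the usage lines are assumptions.

```python
import math


def cell_response(cell_xy, stub_projections, sigma, delta, thr):
    """Return (W_ij, N_ij, is_candidate) for the cellular unit at cell_xy.

    stub_projections is a list of (x_k+, y_k+) points on the reference plane.
    """
    xc, yc = cell_xy
    n_ij = 0
    w_sum = 0.0
    for xk, yk in stub_projections:
        s2 = (xk - xc) ** 2 + (yk - yc) ** 2
        if math.sqrt(s2) <= 2.0 * delta:  # N_ijk = 1 only inside the window
            n_ij += 1
            w_sum += math.exp(-s2 / (2.0 * sigma**2))
    w_ij = w_sum / n_ij if n_ij > 0 else 0.0
    return w_ij, n_ij, n_ij > thr


# Two stubs on the cell centre and one far away (outside the 2*delta window)
w, n, found = cell_response((0.0, 0.0), [(0.0, 0.0), (0.0, 0.0), (10.0, 10.0)],
                            sigma=1.0, delta=1.0, thr=1)
print(w, n, found)
```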

#### Stub based fast tracking approach



Track parameters via Gaussian interpolation

$$\begin{split} x_{+,\mathrm{rec}} &= x_{+ij} + \frac{\Delta_{x_+}}{2} \frac{\ln(W_{i-1j}/W_{ij}) - \ln(W_{i+1j}/W_{ij})}{\ln(W_{i-1j}/W_{ij}) + \ln(W_{i+1j}/W_{ij})} \\ y_{+,\mathrm{rec}} &= y_{+ij} + \frac{\Delta_{y_+}}{2} \frac{\ln(W_{ij-1}/W_{ij}) - \ln(W_{ij+1}/W_{ij})}{\ln(W_{ij-1}/W_{ij}) + \ln(W_{ij+1}/W_{ij})}, \end{split}$$

$$\begin{aligned} x_{-ij} &= \frac{1}{N_{ij}} \sum_{k} x_{-ijk} \\ y_{-ij} &= \frac{1}{N_{ij}} \sum_{k} y_{-ijk} \\ t_{0ij} &= \frac{1}{N_{ij}} \sum_{k} t_{0ijk} , \end{aligned}$$
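A short numerical check of the interpolation formula, written as a Python sketch: for weights sampled from a single Gaussian at three neighbouring cell centres, the expression recovers the Gaussian mean. The function name and the test values are assumptions; the weights must be strictly positive for the logarithms to be defined.

```python
import math


def gaussian_interpolate(center, pitch, w_prev, w_mid, w_next):
    """Reconstructed coordinate from the weights of three adjacent cells.

    center: coordinate of the middle cell, pitch: cell spacing (Delta_x+ or Delta_y+).
    """
    a = math.log(w_prev / w_mid)
    b = math.log(w_next / w_mid)
    return center + 0.5 * pitch * (a - b) / (a + b)


def gauss(x, mean, sigma):
    return math.exp(-((x - mean) ** 2) / (2.0 * sigma**2))


# True position 0.12, cells centred at 0.0, 0.2 and 0.4 (pitch 0.2)
true_x, sigma, pitch, center = 0.12, 0.3, 0.2, 0.2
x_rec = gaussian_interpolate(center, pitch,
                             gauss(center - pitch, true_x, sigma),
                             gauss(center, true_x, sigma),
                             gauss(center + pitch, true_x, sigma))
print(x_rec)
```

The same function applies to the $y_+$ coordinate with the weights of cells $(i, j-1)$, $(i, j)$ and $(i, j+1)$.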


### Simulations

- Simple case: 12-layer VELO-like detector
  - At lumi = 10<sup>34</sup> cm<sup>-2</sup>s<sup>-1</sup>: pileup ~40 and ~1200 tracks/event
- Sensor area = $6 \times 6\ \mathrm{cm}^2$, pixel size = $55 \times 55\ \mu\mathrm{m}^2$, thickness = $200\ \mu\mathrm{m}$, time resolution $\sigma_t = 30$ ps
- 360 000 cellular units for track identification



### Efficiency

**Track reco efficiency** vs track parameters  $r_+$ ,  $\varphi$ ,  $d_0$ ,  $z_0$ ,  $\eta$ :

- track efficiency: ~98%
- track purity: 82% with 1200 tracks per event, using timing information
- track purity: 60% without timing



### Performance vs timing

- Track **purity** improves with better hit time resolution; a 30 ps hit resolution is very effective
- Track parameter resolution significantly improves with time resolution (backup slides)





### Hardware implementation

#### Xilinx VC709 Evaluation board:



**PCIe DMA**, implemented using Nikhef **WUPPER**: up to 60 Gbps data transfer rate [https://redmine.nikhef.nl/et/projects/wupper]

**Optical links** based on GTH transceivers: 4 (up to 12) bidirectional links at **12.8 Gbps**

**DDR3 RAM**: 2 × 4 GB banks, 100 Gbps max. read/write rate (per bank)

#### **Stub Constructor on VC709**

#### Custom board

- 2 Xilinx Virtex UltraScale FPGAs
- high-speed optical transceivers → ~1.6 Tbps input data rate
- one Xilinx Zynq FPGA
- Switch + Engines + Fan In

#### 4D tracking device architecture

- Main modules:
  - Stub Constructor
  - Switch
  - Engines
- Hold Logic
  - prevents data losses when one module is busy



### 4D tracking demonstrator



- Used simulated data for a 1/64 sector of the VELO-like detector: reconstruct stubs → tracks, then evaluate rate, efficiency, latency
- System is modular and scalable: results can be extended to the full detector

#### Test results



- Fast tracking test performed using gFEX board with 320 MHz clock
  - data generator
  - data checker
- Obtained event processing rate: 40.9 MHz
- Initial stub maker test on VC709 with 240 MHz clock
  - achieved 12 MHz event rate
  - possibility to further parallelise the processing and achieve 40 MHz (e.g. increase the number of Stub Constructors)

# Summary and next steps

- A 4D fast tracking device for VELO-U2, simulated and implemented in FPGA, has been presented. Preliminary results are encouraging and show the viability of the solution
- Next steps:
  - improve **simulations** and performance studies
  - development of **clustering** algorithm in FPGA
  - upgrade from VC709 to Xilinx VCU128 **board** (1 Tb/s data rate)
  - complete the **demonstrator** test in laboratory using simulated data
  - organise a **testbed** using LHCb VELO data in collaboration with the online group, possibly during Run 3

# Backup slides



#### Track parameter definitions



#### Resolution on track parameters



- Events with 40 pp interactions, 1200 tracks per event
  - 90 000 cellular units


#### Stub constructor

Stub constructor module completed and tested




#### Stub constructor test

In order to process events at a 40 MHz rate in a VELO Upgrade II-like configuration, the following relation needs to be satisfied:

$$\frac{1}{f} \cdot \langle Occ \rangle \cdot \langle N \rangle < \frac{1}{40\ \mathrm{MHz}}$$

where:

- $f$ = **200 MHz** is the processing clock,
- $\langle Occ \rangle$ = 50% is the fraction of Stub Makers processing data,
- $\langle N \rangle$ = 40 is the number of identified pre-stubs from each Stub Maker.

With these numbers the relation is not satisfied: the left-hand side is 100 ns, which corresponds to a processing rate of 10 MHz.

- A simple way to satisfy the relation is to increase the number of combinatoric processes within the Stub Makers by a factor 4 (see fig.)



- Comment:
  - the estimate is based on a processing clock f = 200 MHz; this value could also be increased to raise the achievable event rate
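The arithmetic behind this estimate can be reproduced with a small Python sketch. It assumes that the processing time per event scales as 1/`parallel_factor` when combinatoric processes are added, an idealisation that is not demonstrated in the slides; the function and parameter names are hypothetical.

```python
def max_event_rate_mhz(f_mhz, occupancy, n_prestubs, parallel_factor=1):
    """Event rate (MHz) at which (1/f) * Occ * N / parallel_factor equals the event period."""
    cycles_per_event = occupancy * n_prestubs / parallel_factor
    return f_mhz / cycles_per_event


baseline = max_event_rate_mhz(200.0, 0.5, 40)                          # numbers quoted above
parallel4 = max_event_rate_mhz(200.0, 0.5, 40, parallel_factor=4)      # 4x combinatoric processes
faster_clock = max_event_rate_mhz(240.0, 0.5, 40, parallel_factor=4)   # 240 MHz clock as in the VC709 test
print(baseline, parallel4, faster_clock)
```

Under these assumptions a factor 4 brings the rate exactly to the 40 MHz boundary, so any margin would have to come from a higher clock frequency or further parallelisation.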