



# Task 7.4.1 A 4-channel electronic board for Cluster Counting

F. Grancagnolo INFN-Lecce



This project has received funding from the European Union's Horizon 2020 Research and Innovation programme under Grant Agreement No 101004761.

#### **Cluster Counting pills**



**Single out**, in every recorded detector signal, the **isolated structures** related to the arrival on the anode wire of the **electrons belonging to a single ionization act**.



**Determine**, in the signal, the **ordered sequence of the electron arrival times**:  $\{t_j^{el}\}$,  $j = 1, n_{el}$ 

Based on the dependence, as a function of the drift time, of the average time separation between consecutive clusters and of the time spread due to diffusion, define the probability function that the  $j^{th}$  electron belongs to the  $i^{th}$  cluster:

$$P(j,i) \quad j = 1, n_{el}, \ i = 1, n_{cl}$$

From this, derive the most probable time-ordered sequence of the original ionization clusters:  $\{t_i^{cl}\}$,  $i = 1, n_{cl}$ 

For any given first cluster (FC) drift time, the **cluster timing technique** exploits the time distribution of all successive clusters to determine, by using statistical (MPS) or ML techniques, hit by hit, the most probable **impact parameter**, thus reducing the **bias** and improving the average **spatial resolution** with respect to that obtainable with the FC method alone:

over a 1 cm drift cell, spatial resolution may improve by  $\gtrsim 20\%$ down to  $\lesssim 80 \ \mu m$ .

Fringe benefits of the cluster timing technique are:

- event time stamping (at the level of ≈ 1 ns);
- improvements in charge division;
- improvements in left-right time difference.

25/04/23

#### Example: finding electron peaks in beam test data




## Cluster Counting/Timing (CC/T) strategy

- The objective of the project is to implement, on a single FPGA, CC/T algorithms (with high peak-finding efficiency) for the parallel preprocessing of as many channels as possible (4, at this stage). This reduces the cost and complexity of the system and adds flexibility in determining the proximity correlations between hit cells for track segment finding and for triggering purposes. Furthermore, applying the CC/T technique in real time reduces the amount of data to be transferred and stored (the data generated by the high-speed digitization of drift chamber signals can reach transfer rates of up to a few TB/s).
- An effective approach to data reduction consists in transferring, in real time and for each hit drift cell, only the minimal information relevant to the application of the cluster counting/timing techniques: the amplitude and the arrival time of each peak associated with each individual ionization electron.
- This can be accomplished by using an **FPGA for the real-time analysis** of the data generated by the drift chamber and subsequently digitized by an ADC.
- Three CC/T algorithms are implemented or planned:
  - **Derivative** (implemented): based on the first and second derivatives of the digitized signal function.
  - **RTA** (under development): based on a bin-by-bin difference of the waveform with a normalized search template.
  - **RNN / CNN** (to be implemented): Long Short-Term Memory (LSTM) and Recurrent Neural Network (RNN) models for peak finding.

#### Derivative algorithm

A first, simple peak-finding algorithm that has been tested is based on the first and second derivatives of the digitized signal function *f*, defined for each time bin *i*, with Δb being the number of bins over which the average value $\bar{f}$ of *f* is calculated:

$$f'(i) = \frac{f(i) - \bar{f}(i - \Delta b)}{\Delta b} \quad f''(i) = f'(i) - f'(i - 1)$$

- A peak is found when  $\Delta f$,  $f'$  and  $f''$  are above predefined threshold levels.
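The derivative criterion above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the firmware implementation: $\Delta f$ is taken to be the numerator $f(i) - \bar{f}(i - \Delta b)$, the average $\bar{f}(i - \Delta b)$ is taken over the Δb bins immediately preceding bin *i*, pulses are assumed to be positive-going, and the function name and threshold values are hypothetical placeholders.

```python
import numpy as np


def find_peaks_derivative(f, delta_b=2, thr_df=5.0, thr_d1=2.0, thr_d2=1.0):
    """Return the bins where Delta f, f' and f'' all exceed their thresholds.

    Assumptions (not specified in the slides): the running average covers
    the delta_b bins just before bin i, and Delta f is f(i) minus that average.
    """
    f = np.asarray(f, dtype=float)
    n = f.size
    df = np.zeros(n)  # Delta f(i) = f(i) - mean of the previous delta_b bins
    d1 = np.zeros(n)  # f'(i)  = Delta f(i) / delta_b
    for i in range(delta_b, n):
        fbar = f[i - delta_b:i].mean()
        df[i] = f[i] - fbar
        d1[i] = df[i] / delta_b
    d2 = np.zeros(n)  # f''(i) = f'(i) - f'(i - 1)
    d2[1:] = d1[1:] - d1[:-1]

    above = (df > thr_df) & (d1 > thr_d1) & (d2 > thr_d2)
    # Keep only the first bin of each run of consecutive bins above threshold.
    first = above & ~np.concatenate(([False], above[:-1]))
    return np.flatnonzero(first)


# Toy waveform with two pulses on a flat baseline.
wave = np.zeros(50)
wave[10:14] = [20.0, 30.0, 15.0, 5.0]
wave[30:32] = [25.0, 10.0]
peak_bins = find_peaks_derivative(wave)
```

On this toy waveform the three conditions are met only on the leading edge of each pulse, so one bin per pulse is reported. In a real application the thresholds would have to be tuned to the noise level of the digitized signal.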



### **Recursive Template Algorithm**

- Optimization is in progress to compare its performance with that of the derivative algorithm.
- Both algorithms are quite elementary from the computational point of view, which suits implementation in FPGAs, with few parameters to be optimized.
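As an illustration of a template-based search, the following Python sketch slides a normalized template over the waveform, computes the bin-by-bin difference between the waveform and the scaled template, and accepts a peak when the relative residual is small. The slides do not give the details of the algorithm, so everything here is an assumption: the acceptance criterion, the subtraction of each matched pulse before continuing the scan, the positive pulse polarity, and all names and threshold values.

```python
import numpy as np


def template_find_peaks(wave, template, amp_thr=5.0, max_residual=0.3):
    """Hypothetical template scan: returns a list of (peak_bin, amplitude)."""
    w = np.asarray(wave, dtype=float).copy()
    t = np.asarray(template, dtype=float)
    t = t / t.max()            # normalize the template to unit peak height
    k = int(np.argmax(t))      # position of the maximum inside the template
    peaks = []
    for start in range(w.size - t.size + 1):
        amp = w[start + k]
        if amp < amp_thr:
            continue
        segment = w[start:start + t.size]
        # Bin-by-bin difference between the waveform and the scaled template,
        # expressed relative to the area of the scaled template.
        residual = np.abs(segment - amp * t).sum() / (amp * t.sum())
        if residual < max_residual:
            peaks.append((start + k, float(amp)))
            # Remove the matched pulse so that overlapping pulses can be found.
            w[start:start + t.size] -= amp * t
    return peaks


# Two overlapping pulses built from the same (made-up) template shape.
shape = np.array([0.0, 1.0, 0.5, 0.25])
wave = np.zeros(40)
wave[10:14] += 20.0 * shape
wave[12:16] += 10.0 * shape
found = template_find_peaks(wave, shape)
```

In this toy case the second, smaller pulse sits on the tail of the first one and is recovered only because the first matched pulse is subtracted before the scan continues.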








7.4.1 - F.Grancagnolo

## Algorithm implementation





- To implement a board with **multi-channel** readout, three different approaches are currently being tried:
  - Texas Instruments ADC32RF45
  - Nalu Scientific ASoCv3 and HDSoC
  - CAEN VX2751 digitizer
- The first task (in particular, for the first and second approaches) is implementing the **data transfer to the DAQ system** to take advantage of the 10 Gbit/s standard:
  - optical fiber with SFP+ connectors
  - SFP+ to RJ45 adapters
- The second task is finding the most efficient way to **store the processed information** before the data transfer.


#### KCU105 + ADC ADC32RF45



- The single-channel code has been ported from the old framework to the new Xilinx framework
- Communication with the new ADC has been established, and simulations of timing and power consumption have been completed
- The code for using the 10 Gb standard SFP+ connections has been developed (next slide)
- The integration of the CC/T algorithm block still needs to be completed




#### KCU105 + ADC ADC32RF45: 10G SFP+

- A 1 Gb network is sufficient for a single ADC but, in some cases, packets are lost or corrupted: during the tests, some synchronization problems on the second channel of the ADC were experienced
- To speed up communication with the DAQ and the PC, data communication through the SFP+ ports has been developed at 10 Gb speed, via fiber or copper (a small local network is being set up)
- The following block diagram is being implemented in the FPGA:



## Naluscientific ASoCv3

- We have recently received from **Naluscientific** a 4-ch **ASoCv3** evaluation board, which is currently being tested with a pulse generator
- The next task is to connect the ASoCv3 board to a 4-ch amplifier (LMH6522) to perform tests with drift tubes
- Currently we cannot implement the algorithm directly because we do not have access to the source code, so we use their acquisition software (**Naluscope**); we are in close contact with Naluscientific




#### LMH6522





- So far, single- or dual-channel amplifiers have been used. For a drift chamber (DC) with several tens of thousands of channels, the amplifier PCBs need considerable space close to the detector; therefore, the distribution of power and cooling may impose serious constraints.
- The basic idea is to use a single chip to drive one or more multi-channel digitizers

#### A possible choice: LMH6522

- The LMH6522 contains 4 high-performance, digitally controlled variable gain amplifiers (DVGA).
- The gain is digitally controlled over a 31 dB range.
- Gain step accuracy: 0.2 dB
- Disable function for each channel
- 1.5 GHz bandwidth
- -65 dBc crosstalk between adjacent channels
- Low-power mode for power management flexibility (about 400 mA)


## Naluscientific HDSoC

| Project  | Sampling Frequency (GHz) | Input BW (GHz) | Buffer Length (Samples) | Number of Channels | Timing Resolution (ps) | Available Date |
|----------|--------------------------|----------------|-------------------------|--------------------|------------------------|----------------|
| ASoC     | 3-5                      | 0.8            | 16k                     | 4                  | 35                     | Rev 3 avail    |
| HDSoC    | 1-3                      | 0.6            | 2k                      | 64                 | 80-120                 | May'21         |
| AARDVARC | 8-14                     | 2.5            | 32k                     | 4                  | 4                      | Rev 3 avail    |
| AODS     | 1-2                      | 1              | 16k                     | 1-4                | 100-200                | Rev 1 avail    |
| STRAWZ   | 5                        | 2              | 2k                      | 64                 | 10                     | Dec'22         |

| HDSoC Parameter | Spec        |
|-----------------|-------------|
| Sampling Rate   | 1-2 GSa/s   |
| ABW             | > 600 MHz   |
| Depth           | 2k Sa       |
| Trigger Buffer  | ~3 us*      |
| Deadtime        | 0**         |
| Channels        | 64          |
| Supply/Range    | 2.5         |
| ADC bits        | 12          |
| Timing accuracy | 80-120 ps   |
| Technology      | 250 nm CMOS |
| Power           | TBD         |

- We expect an **HDSoC evaluation board** within this month, thanks to the US branch of CAEN Technologies (Naluscientific distributor)
- CAEN will also send us a FERS card and, most likely, the IP core needed to implement our algorithms


#### CAEN VX2740

- CAEN has provided us with the VX2740 digitizer, a lower-performance version of the VX2751; the latter is better suited for CC/T but is still under development and test
- We are becoming familiar with the openFPGA SCICompiler software (released a few days ago), currently with a time-limited trial license that does not allow full development
- We are waiting for CAEN to provide a full license




## Implementing ML algorithms (for peak finding) on FPGA



The first step required for the implementation of **neural networks** on FPGAs is the conversion of the high-level code used for the creation of the network (**QKeras**) into a **High-Level Synthesis** (HLS) project. To accomplish this task, the **hls4ml** package will be used.

A schematic workflow of hls4ml is illustrated in the figures.

1. The red section indicates the usual software steps required to design a neural network for a specific task.
2. The blue section of the workflow is the task done by hls4ml, which translates the model into an HLS project that can be synthesized and implemented to run on an FPGA.
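The two steps of the workflow can be sketched in Python as follows. This is a minimal sketch, assuming that TensorFlow, QKeras and hls4ml are installed; the network itself (a small quantized dense classifier acting on a window of waveform samples), the function names, the bit widths, the output directory and the FPGA part string are hypothetical choices for illustration and are not taken from the project.

```python
def build_peak_classifier(window=16):
    """Step 1 (software side): a hypothetical quantized model built with QKeras.

    Input: one window of waveform samples. Output: probability of a peak.
    """
    from tensorflow.keras.layers import Activation, Input
    from tensorflow.keras.models import Sequential
    from qkeras import QActivation, QDense, quantized_bits, quantized_relu

    def q():
        # 6-bit fixed-point weights and biases (assumed precision).
        return quantized_bits(6, 0, alpha=1)

    return Sequential([
        Input(shape=(window,)),
        QDense(16, kernel_quantizer=q(), bias_quantizer=q(), name="fc1"),
        QActivation(quantized_relu(6), name="relu1"),
        QDense(1, kernel_quantizer=q(), bias_quantizer=q(), name="out"),
        Activation("sigmoid", name="sigmoid"),
    ])


def convert_to_hls(model, output_dir="hls_peak_finder",
                   part="xcku040-ffva1156-2-e"):
    """Step 2 (hls4ml side): translate the trained model into an HLS project.

    The default part string is an assumption for a Kintex UltraScale board.
    """
    import hls4ml

    config = hls4ml.utils.config_from_keras_model(model, granularity="model")
    hls_model = hls4ml.converters.convert_from_keras_model(
        model, hls_config=config, output_dir=output_dir, part=part)
    hls_model.compile()  # builds a bit-accurate C++ emulation of the design
    return hls_model
```

After training, `convert_to_hls(build_peak_classifier())` returns an object whose `predict` method can be compared with the QKeras model output before running the actual synthesis with `hls_model.build()`, which requires the vendor HLS tools.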



## Milestones and deliverables

#### Milestones

- month 36: 4-ch system board (ADC + FPGA) engineered (support from CAEN)
- month 46: board completed

#### Deliverables

- month 46: board tested with particle beams (written report)

Despite the various delays experienced (COVID-19 pandemic, cancellation of Russian support), we think that this schedule can reasonably be met.
