# An Analog Neural Network ASIC for Image Reconstruction Embedded in Detectors

S. Di Giacomo, M. Ronchi, G. Borghi, M. Carminati, C. Fiorini



marco1.carminati@polimi.it







### Outline

#### Motivation

- Edge Computing
  - Conventional vs. In-Memory
- Analog In-Memory Computing for Neural Network (NN)
- Neural Network for event reconstruction

#### ANNA ASIC

- Neural Network training and quantization
- Analog implementation as capacitive crossbar array
- Circuit challenges and non-idealities
- Energy Efficiency



### **Edge Computing**

#### **Conventional Von Neumann Computing**

- Power hungry data movement
- Long memory access latency
- Limited memory bandwidth



#### In-Memory Computing (IMC)

- Parallel data processing
- Negligible data movement
- Operations inside memory elements
- Energy efficient
- Programmable





### Analog IMC for Neural Network inference

NN basic operation  $\rightarrow$  Multiply-and-Accumulate (MAC)

**Digital** approach:

- Intensive and constant dataflow
- Pipeline multiple full-adders

**Analog** accelerator exploits Ohm's law and Kirchhoff's laws:

- Multiplication: non-volatile memories, used also for weight storage
- Accumulation: current or charge summation on a wire
- ✓ Low power
- ✓ Throughput and speed improvements (high parallelism)
- ✓ Monolithic integration with CMOS IC



#### **MAC** operation
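
For reference, the MAC operation that the analog crossbar parallelizes can be written in a few lines of Python. This is only an illustrative sketch: the function names and the example numbers are made up and are not taken from the ASIC.

```python
def mac(weights, inputs):
    """Multiply-and-accumulate: sum of the element-wise products w_i * x_i."""
    acc = 0.0
    for w, x in zip(weights, inputs):
        acc += w * x  # one multiplication and one accumulation per input
    return acc


def layer(weight_matrix, inputs):
    """One fully connected layer (no bias, no activation): one MAC per neuron."""
    return [mac(row, inputs) for row in weight_matrix]


# Made-up example: 2 neurons, 3 inputs each
example_weights = [[1.0, -2.0, 0.5],
                   [0.0, 3.0, 1.0]]
example_inputs = [2.0, 1.0, 4.0]
outputs = layer(example_weights, example_inputs)
print(outputs)
```

In a digital design each `w * x` and each addition is a separate arithmetic step. In the analog approach the products are formed by the memory elements and the sum appears on a shared wire.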









### **Neural Network for Event Reconstruction**



#### **ANNA** application

NN for the localization of the radiation event from detector signals for nuclear medical imaging (e.g. Anger Camera for PET).

- → Crossbar array of **programmable switched capacitors**
- → Analog operations performed directly on analog signals coming from photodetectors
- → No need for signal ADC and FPGA for embedded processing
- → Interaction coordinates directly at the output of the ASIC




### NN software training and performance

- Simulated dataset for training
- NN with 64 inputs, 2 hidden layers of 20 neurons each, 2 outputs
- Training in MATLAB with weight quantization (5-bit resolution)
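
A minimal sketch of this topology with quantized weights is given below. Several details are assumptions, not statements from the slides: the 5 bits are read as a sign plus a 4-bit magnitude (integer levels from -15 to 15), the hidden activation is taken to be a ReLU, biases are omitted, and the weights are random placeholders instead of trained values.

```python
import random

LAYER_SIZES = [64, 20, 20, 2]  # from the slides: 64 inputs, 2 x 20 hidden, 2 outputs
MAX_LEVEL = 15                 # ASSUMPTION: 5 bits = sign + 4-bit magnitude


def quantize(w, w_max):
    """Round w to the nearest uniform level in [-w_max, w_max] (clamped)."""
    level = round(w / w_max * MAX_LEVEL)
    level = max(-MAX_LEVEL, min(MAX_LEVEL, level))
    return level * w_max / MAX_LEVEL


def relu(v):
    # ASSUMPTION: the activation function is not specified in the slides
    return v if v > 0.0 else 0.0


def forward(weights, x):
    """Forward pass: MAC per neuron, ReLU on hidden layers, linear output layer."""
    for k, layer_w in enumerate(weights):
        x = [sum(w * xi for w, xi in zip(row, x)) for row in layer_w]
        if k < len(weights) - 1:
            x = [relu(v) for v in x]
    return x


rng = random.Random(0)
pairs = list(zip(LAYER_SIZES[:-1], LAYER_SIZES[1:]))
# Placeholder weights (random, then quantized); real values would come from training
weights = [[[quantize(rng.uniform(-1.0, 1.0), 1.0) for _ in range(n_in)]
            for _ in range(n_out)] for n_in, n_out in pairs]

mac_count = sum(n_in * n_out for n_in, n_out in pairs)  # weights = MACs per inference
xy = forward(weights, [rng.random() for _ in range(LAYER_SIZES[0])])
print(mac_count, xy)
```

The weight count of this topology (64·20 + 20·20 + 20·2) gives the number of MACs per inference when biases are excluded.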



| Resolution [mm]  | x    | y    | Total |
|------------------|------|------|-------|
| FWHM (2D PSF)    | 1.58 | 1.81 |       |
| r<sub>50%</sub>  | 0.79 | 0.79 | 1.84  |
| r<sub>90%</sub>  | 2.44 | 2.44 | 4.83  |
| MAE              | 1.16 | 1.13 | 1.80  |



- **FWHM**: of the 2D PSF of the reconstruction error
- r<sub>50%</sub> and r<sub>90%</sub>: 50th and 90th percentiles of the normalized error distribution
- MAE: mean absolute error
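
The percentile and MAE figures can be computed along the following lines. This is a sketch under assumptions: errors are taken as absolute differences in mm, the "Total" column is read as the Euclidean (radial) error, the percentile uses linear interpolation, and the sample data are invented. The normalization mentioned above is not defined in the slides and is not reproduced here.

```python
import math


def percentile(values, q):
    """q-th percentile (0 to 100) with linear interpolation between sorted samples."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def error_metrics(true_xy, pred_xy):
    """r50, r90 and MAE of the absolute error along x, y and radially."""
    ex = [abs(p[0] - t[0]) for t, p in zip(true_xy, pred_xy)]
    ey = [abs(p[1] - t[1]) for t, p in zip(true_xy, pred_xy)]
    er = [math.hypot(a, b) for a, b in zip(ex, ey)]  # ASSUMPTION: "Total" = radial
    result = {}
    for name, err in (("x", ex), ("y", ey), ("total", er)):
        result[name] = {
            "r50": percentile(err, 50),
            "r90": percentile(err, 90),
            "MAE": sum(err) / len(err),
        }
    return result


# Invented data: true positions at the origin, predictions offset along x
toy_true = [(0.0, 0.0)] * 5
toy_pred = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0), (5.0, 0.0)]
toy_metrics = error_metrics(toy_true, toy_pred)
print(toy_metrics["x"])
```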



### **Analog Neural Network implementation**



POLITECNICO MILANO 1863

### **Analog Neuron operation**

#### **Charge-redistribution approach**

- 1) Charge only C with V<sub>in</sub> to minimize energy consumption ( $E = CV^2$ )
- 2) Charge is **redistributed** among all capacitors by closing S<sub>s</sub>
- **3)** Only the capacitors corresponding to the desired weight are connected to the integrator, while the others are disconnected by means of their respective switch.

The output is

$$V_{out,j} = \frac{V_{in,i}}{15} \frac{C_{ij}}{C_F}$$
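
A small numerical sketch of this expression follows, summed over the inputs of one neuron as a MAC would be. The unit capacitance of 100 fF is the C<sub>min</sub> quoted in the conclusions. The feedback capacitance `C_F`, the mapping of a weight to an integer number of unit capacitors, and the example voltages are assumptions made for illustration only.

```python
C_UNIT = 100e-15  # 100 fF unit capacitor (C_min quoted in the conclusions)
C_F = 200e-15     # ASSUMPTION: feedback capacitor value, not stated in the slides
N_CAPS = 15       # capacitors sharing the sampled charge, hence the 1/15 factor


def neuron_output(v_in, weight_levels):
    """V_out,j = sum_i (V_in,i / 15) * (C_ij / C_F).

    ASSUMPTION: C_ij = level * C_UNIT with an integer level; a negative level
    models a negative weight.
    """
    v_out = 0.0
    for v, level in zip(v_in, weight_levels):
        c_ij = level * C_UNIT
        v_out += (v / N_CAPS) * (c_ij / C_F)
    return v_out


# Made-up example: two inputs with weights +15 and -5 unit capacitors
v_example = neuron_output([1.5, 0.3], [15, -5])
print(f"V_out = {v_example:.3f} V")
```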



### **Analog Neuron Layout**



[Figure: analog neuron layout, labeled dimension 140.7 μm]





- Weight programming by means of **SPI**
- Stored data Q, plus additional logic, to close/open capacitor bank **switches**
- Timing signals provided by programmable ring counters





#### **Analog Switches**

- Charge injection and clock feedthrough
- Parasitic capacitances

#### Integrator

- Large dynamic range (0 to 3.3 V), low power
- Very low offset error
- Stability for different input capacitance (NN weight)
- Electronic noise



### Analog Switches (1/2)

- Analog switches implemented as transmission gates (TG)
- Two non-idealities:
  - 1) Charge injection: added or subtracted charge from drain/source in an asymmetric way, depending on impedance

 $Q_{ch} = WLC_{ox}(V_{DD} - V_{in} - V_{th})$ 

- 2) Clock feedthrough: added charge due to gate-source and gate-drain capacitances
- Solution
  - **Dummy switches** to absorb charge from TG
  - Cadence optimization tool to set widths and lengths that minimize injected charge
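
To give a feel for the magnitude of charge injection, the channel-charge expression above can be evaluated numerically. Every device value below is a hypothetical placeholder (transistor size, oxide capacitance, threshold voltage), as is the first-order assumption that half of the channel charge lands on the sampling capacitor. Only the 3.3 V supply and the 100 fF unit capacitor come from the slides.

```python
V_DD = 3.3             # V, from the 0 to 3.3 V range quoted in the slides
C_SAMPLE_FF = 100.0    # fF, unit capacitor (C_min quoted in the conclusions)
W_UM = 1.0             # um, HYPOTHETICAL transistor width
L_UM = 0.35            # um, HYPOTHETICAL transistor length
C_OX_FF_PER_UM2 = 4.5  # fF/um^2, HYPOTHETICAL oxide capacitance
V_TH = 0.6             # V, HYPOTHETICAL threshold voltage


def channel_charge_fc(v_in):
    """Q_ch = W * L * C_ox * (V_DD - V_in - V_th), in fC; zero if the NMOS is off."""
    overdrive = max(0.0, V_DD - v_in - V_TH)
    return W_UM * L_UM * C_OX_FF_PER_UM2 * overdrive


def voltage_error_mv(v_in, fraction=0.5):
    """Error on the sampling capacitor if `fraction` of Q_ch is injected onto it.

    ASSUMPTION: fraction = 0.5 is a common first-order estimate; the real split
    depends on the impedances seen at drain and source.
    """
    return fraction * channel_charge_fc(v_in) / C_SAMPLE_FF * 1e3


for v in (0.5, 1.0, 2.0):
    print(f"V_in = {v:.1f} V -> Q_ch = {channel_charge_fc(v):.3f} fC, "
          f"error = {voltage_error_mv(v):.2f} mV")
```

The error depends on V<sub>in</sub>, so it cannot be removed as a constant offset. This is one reason to absorb the charge with dummy switches.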










### Analog Switches (2/2)

- Additional parasitic capacitors added by switches
- Errors during input sampling and charge redistribution phases
- Estimated from **post-layout** simulations

→ A corrective factor K can be calculated and added to the neural network MATLAB model, so that it is taken into account during training

$$V_{\text{out,ideal}} = \frac{V_{\text{in}}}{15} \frac{C_{\text{ij}}}{C_{\text{F}}}$$

$$V_{\text{out,real}} = \frac{V_{\text{in}}}{15} \frac{C_{\text{ij}}}{C_{\text{F}}} \left( \frac{C_{\text{LSB}} + C_{\text{p,top}}}{C_{\text{LSB}} + C_{\text{p,top}}/15 + C_{\text{p,bottom}}/15} \right) = K \, V_{\text{out,ideal}}$$
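
As a sketch, K can be evaluated by reading it as the bracketed ratio in the expression for V<sub>out,real</sub>. The parasitic values used here are hypothetical (the real ones come from post-layout extraction), and C<sub>LSB</sub> is assumed equal to the 100 fF unit capacitor.

```python
def corrective_factor(c_lsb, c_p_top, c_p_bottom, n_caps=15):
    """K = (C_LSB + C_p,top) / (C_LSB + C_p,top/15 + C_p,bottom/15)."""
    return (c_lsb + c_p_top) / (c_lsb + c_p_top / n_caps + c_p_bottom / n_caps)


def v_out_real(v_out_ideal, k):
    """Output including parasitics: V_out,real = K * V_out,ideal."""
    return k * v_out_ideal


C_LSB = 100e-15       # ASSUMPTION: C_LSB equals the 100 fF unit capacitor
C_P_TOP = 10e-15      # HYPOTHETICAL top-plate parasitic
C_P_BOTTOM = 5e-15    # HYPOTHETICAL bottom-plate parasitic

k_example = corrective_factor(C_LSB, C_P_TOP, C_P_BOTTOM)
print(f"K = {k_example:.4f}")
print(f"ideal 0.700 V -> real {v_out_real(0.700, k_example):.4f} V")
```

With no parasitics the function returns K = 1, so the ideal expression is recovered.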




### Integrator (1/3)

Low power, rail-to-rail class A amplifier

#### Offset error

- The offset at the integrator differential input is subject to a huge amplification

$$V_{out} = \frac{V_{in}}{15} \frac{C_{ij}}{C_F} + V_{off} \left( 1 + \frac{|C_{ij}|}{C_F} \right) \quad \text{with } V_{off} = 96 \ \mu V \pm 5 \text{mV}$$



Offset compensation phase to minimize its effect

- 1) The offset is sampled on a 1.5 pF capacitor
- 2) The capacitor is flipped and the offset subtracted from V+
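
The size of the uncompensated offset term can be illustrated with a short calculation. The feedback capacitance `C_F` is an assumed value that is not stated in the slides. The 97.5 pF maximum input capacitance is the figure quoted for the integrator, and the two offset values are read from the "96 µV ± 5 mV" specification as a nominal value and a deviation.

```python
C_F = 200e-15        # ASSUMPTION: feedback capacitor, not stated in the slides
C_IJ_MAX = 97.5e-12  # maximum input capacitance quoted in the slides


def offset_gain(c_ij, c_f=C_F):
    """Gain seen by the input offset: 1 + |C_ij| / C_F."""
    return 1.0 + abs(c_ij) / c_f


def offset_at_output(v_off, c_ij, c_f=C_F):
    """Offset contribution to V_out: V_off * (1 + |C_ij| / C_F)."""
    return v_off * offset_gain(c_ij, c_f)


nominal_mv = offset_at_output(96e-6, C_IJ_MAX) * 1e3  # 96 uV nominal offset
deviation_v = offset_at_output(5e-3, C_IJ_MAX)        # 5 mV deviation
print(f"gain = {offset_gain(C_IJ_MAX):.1f}")
print(f"96 uV -> {nominal_mv:.1f} mV at the output")
print(f"5 mV  -> {deviation_v:.2f} V at the output")
```

Under these assumptions a few millivolts of input offset would take up most of the 3.3 V output range at the largest weight setting, which is why a compensation phase is needed.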








### Integrator (2/3)

#### Variable input capacitance

- Variable input capacitance $C_{ij}$ (from 0 to 97.5 pF) → large variability of the feedback factor

$$\beta = \frac{C_F}{C_{ij} + C_F} \qquad \qquad \frac{1}{\beta} \rightarrow \text{from 3.5 dB to 54 dB}$$

- → Programmable Miller compensation to always ensure a fast and stable response
  - Can be chosen among four capacitances or a combination of them (25 fF, 50 fF, 100 fF, 400 fF)
  - Capacitance settings stored in SRAM cells
  - Target phase margin of  $67^{\circ} \pm 7^{\circ}$
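
The quoted range of 1/β can be checked numerically. The feedback capacitance is not stated in the slides. The value assumed below (200 fF) is chosen only because it gives about 3.5 dB for one 100 fF unit capacitor and about 54 dB for 97.5 pF. The list of Miller settings assumes that the four capacitors are combined in parallel, so their values add.

```python
import math
from itertools import combinations

C_F = 200e-15  # ASSUMPTION: not stated in the slides, chosen to match the quoted range


def inverse_beta_db(c_ij, c_f=C_F):
    """1/beta in dB, with beta = C_F / (C_ij + C_F)."""
    beta = c_f / (c_ij + c_f)
    return 20.0 * math.log10(1.0 / beta)


gain_min_db = inverse_beta_db(100e-15)   # one unit capacitor (C_min = 100 fF)
gain_max_db = inverse_beta_db(97.5e-12)  # maximum input capacitance
print(f"1/beta from {gain_min_db:.1f} dB to {gain_max_db:.1f} dB")

# Programmable Miller capacitance: any non-empty combination of the four values
MILLER_CAPS_FF = (25, 50, 100, 400)
miller_settings_ff = sorted({sum(combo)
                             for r in range(1, len(MILLER_CAPS_FF) + 1)
                             for combo in combinations(MILLER_CAPS_FF, r)})
print(miller_settings_ff)
```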



### Integrator (3/3)

#### Electronic noise

Transient noise simulations on a single-input neuron to estimate noise contributions to integrator output:



$\sigma_{V_{out,kTC}} \approx 440\ \mu V$ → **kTC** noise contribution

$\sigma_{V_{out,OPA}} \approx 400\ \mu V$ → **op-amp**-only noise contribution

$\sigma_{V_{out,1}} = \sqrt{64} \times 595\ \mu V \approx 4.7\ mV$ (max 64 inputs)

MATLAB simulation: for a given set of input signals, noise with σ = 5 mV is added at the output of each neuron to account for the contribution of all inputs.

The standard deviation of the **predicted coordinates** is used to evaluate the effect of **noise** on **NN performance**.
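
The noise budget and the Monte Carlo idea can be sketched as follows. The first part only checks the arithmetic: the 595 µV single-input figure is consistent with adding the kTC and op-amp contributions in quadrature, which is an interpretation and not a statement from the slides. The second part injects Gaussian noise at every neuron output of a tiny made-up linear network. It is not the ANNA network, it has no activation function, and its output is in volts, not millimetres.

```python
import math
import random
import statistics

SIGMA_KTC_UV = 440.0  # from the slides
SIGMA_OPA_UV = 400.0  # from the slides
N_INPUTS = 64

# Quadrature sum (assumes independent sources), then sqrt(N) scaling over inputs
sigma_single_uv = math.hypot(SIGMA_KTC_UV, SIGMA_OPA_UV)
sigma_neuron_mv = math.sqrt(N_INPUTS) * sigma_single_uv / 1e3
print(f"single input: {sigma_single_uv:.0f} uV, 64 inputs: {sigma_neuron_mv:.2f} mV")


def noisy_forward(layers, x, sigma, rng):
    """Linear forward pass with Gaussian noise added at each neuron output."""
    for layer_w in layers:
        x = [sum(w * xi for w, xi in zip(row, x)) + rng.gauss(0.0, sigma)
             for row in layer_w]
    return x


def output_std(layers, x, sigma, trials, seed=0):
    """Standard deviation of each network output over repeated noisy inferences."""
    rng = random.Random(seed)
    runs = [noisy_forward(layers, x, sigma, rng) for _ in range(trials)]
    return [statistics.pstdev(col) for col in zip(*runs)]


# Made-up toy network: 2 inputs -> 2 hidden neurons -> 1 output
toy_layers = [[[0.5, -0.25], [0.1, 0.3]], [[1.0, 1.0]]]
toy_std = output_std(toy_layers, [1.0, 2.0], sigma=0.005, trials=2000)
print(f"output std = {toy_std[0] * 1e3:.2f} mV")
```

For this toy network the three noise sources add with unit gain, so the output spread should be close to 5 mV × √3.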

### **Energy Efficiency**

- Energy consumption during NN **inference** at f<sub>clock</sub> = 10 MHz, estimated from post-layout simulations
- Input buffers and integrators are powered on only when needed to save energy



| Figure of merit          | Estimate     |
|--------------------------|--------------|
| I/O latency              | 4.6 µs       |
| I/O total operations     | 3566         |
| Total consumption        | 38.12 nJ     |
| Efficiency               | 775.21 MOP/s |
| Analog energy efficiency | 135.74 GOP/J |
| Total energy efficiency  | 93.55 GOP/J  |
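
The derived figures in the table follow from the first three rows, as this short check shows. The last quantity, the energy of the analog part alone, is not given in the slides. It is inferred here under the assumption that the analog efficiency refers to the same 3566 operations.

```python
N_OPS = 3566                  # I/O total operations
LATENCY_S = 4.6e-6            # I/O latency
TOTAL_ENERGY_J = 38.12e-9     # total consumption per inference
ANALOG_EFF_OP_PER_J = 135.74e9

throughput_mops = N_OPS / LATENCY_S / 1e6      # operations per second, in MOP/s
total_eff_gopj = N_OPS / TOTAL_ENERGY_J / 1e9  # operations per joule, in GOP/J

# ASSUMPTION: the analog efficiency is computed over the same operation count
analog_energy_nj = N_OPS / ANALOG_EFF_OP_PER_J * 1e9

print(f"throughput        = {throughput_mops:.2f} MOP/s")
print(f"total efficiency  = {total_eff_gopj:.2f} GOP/J")
print(f"analog energy     = {analog_energy_nj:.2f} nJ (inferred)")
```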



#### **Full Neural Network ASIC Simulation**

- Preliminary full neural network ASIC schematic simulation in CADENCE
- Input event (64 SiPM voltage signals) with position of interaction (0 mm, 5 mm)
- The two output voltages represent the predicted x, y coordinates




#### Conclusions

- ✓ Fully analog neural network able to perform 5-bit MAC operations in the charge domain (ASIC)
- ✓ Can be monolithically **integrated** in the front-end ASIC
- ✓ Inference in a more efficient way in terms of computational cost and energy, compared to a fully digital implementation
- ✓ ASIC prototype with $C_{min} = 100\ \text{fF}$ in CMOS 0.35 µm node
- ✓ Energy consumption estimated for inference  $\approx$  **38.12 nJ**
- ✓ Energy Efficiency estimated for inference ≈ 93.55 GOP/J
- Will be submitted soon for fabrication.
- Can be applied to several detector challenges where NNs are adopted (charge-sharing correction, etc.)





DIPARTIMENTO DI ELETTRONICA INFORMAZIONE E BIOINGEGNERIA





Thank you!