## Current and future ASICs for Precision Timing Measurement















Pardis Niknezhadi, Peter Orel, Gary S. Varner

University of Hawai'i

Prague ps Timing Workshop, June 9, 2015





### The Future is ...

### 1GHz analog bandwidth, 5GSa/s



Time Difference Dependence on Signal-Noise Ratio (SNR)

G. Varner and L. Ruckman NIM A602 (2009) 438-445.

### Simulation includes detector response



J-F Genat, G. Varner, F. Tang, H. Frisch NIM A607 (2009) 387-393.

### Now: high space-time Resolution



In a number of communities (future particle/astroparticle detectors, PET medical imaging, etc.) there is growing interest in detectors capable of operating at the picosecond timing and $\mu$m spatial resolution limit (for light, 1 ps corresponds to 300 $\mu$m).



**Front-End Electronics** 

Fast signal collection x-ray detectors

### **Toward increased timing precision**

| ASIC       | Channels | Depth/chan | Time Resolution [ps] | Vendor | Process node [nm] | Year   |
|------------|--------|------------|----------------------|--------|-----------|--------|
| LABRADOR 3 | 8      | 260        | 16                   | TSMC   | 250       | 2005   |
| BLAB       | 1      | 65536      | 1-4                  | TSMC   | 250       | 2009   |
| STURM2     | 8      | 4x8        | <10 (3GHz ABW)       | TSMC   | 250       | 2010   |
| DRS4       | 8      | 1024       | ~1 (short baseline)  | IBM    | 250       | 2014   |
| PSEC4      | 6      | 256        | ~1 (short baseline)  | IBM    | 130       | 2014   |
| RITC3      | 3      | Continuous | TBD                  | IBM    | 130       |        |
| PSEC5      | 4      | 32768      | TBD                  | TSMC   | 130       |        |
| DRS5       | 8/16?  | 128x32     | TBD                  | UMC    | 110       |        |
| SamPic     | 16     | 64         | ~3 [pic 0]           | AMS    | 180       | [2014] |
| RFpix      | 128?   | TBD        | <= 100fs (target)    | TSMC   | 45 ?      |        |

### Waveform sampling ASICs (not just good looking)





|                | WFS ASIC     | Commercial |
|----------------|--------------|------------|
| Sampling speed | 0.1-15 GSa/s | 3 GSa/s    |
| Bits/ENOB      | 16/9-13+     | 8/7.4      |
| Power/channel  | <= 0.05 W    | Few W      |
| Cost/channel   | < \$10 (vol) | > \$100    |



## Another gaze into the Crystal Ball



## Exploration of the space-time limit

- Sampling at high sampling rate and high bandwidth
- Resolve small distances

Current goals: spatial resolution of $10\,\mu m$ in z and $20\,\mu m$ in $r\phi$

- In silicon, $10\,\mu m$ in z corresponds to a timing resolution of about 100 fs
- $20\,\mu m$ in $r\phi$ will depend on the SNR
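
The distance-to-time conversion behind the 100 fs figure can be sketched as t = d / v. The sketch below assumes that the 0.35c strip-line propagation velocity listed in the target specifications is the relevant speed; the function name is illustrative only.

```python
# Sketch: convert a position resolution along a strip line into the
# equivalent timing resolution, t = d / v.
C_LIGHT = 299_792_458.0  # speed of light in vacuum [m/s]


def time_for_distance(distance_m: float, velocity_fraction: float = 1.0) -> float:
    """Return the time [s] needed to travel distance_m at velocity_fraction * c."""
    return distance_m / (velocity_fraction * C_LIGHT)


# Assumption: 0.35c is the strip-line velocity from the target specifications.
dt_z = time_for_distance(10e-6, velocity_fraction=0.35)
print(f"10 um at 0.35c -> {dt_z * 1e15:.0f} fs")

# Cross-check of the rule of thumb for light in vacuum (1 ps ~ 300 um).
um_per_ps = C_LIGHT * 1e-12 * 1e6
print(f"1 ps at c -> {um_per_ps:.0f} um")
```

Under that assumption, 10 µm corresponds to roughly 95 fs, in line with the "about 100 fs" quoted here.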





Pixel detector (PXD) at SuperKEKB

### Visualizing parameters for time resolution in z



### WFS ASIC: Basic Functional components



## **Intrinsic Limitation**

No power (performance savings) for continuous digitization

We aren't going to put Analog Devices out of business



"down conversion" → For most "triggered" 'event' applications, not a serious drawback

### **Constraint 1: Analog Bandwidth**

Difficult to couple in Large BW (C is deadly)



### Constraint 2: kTC Noise

### Want small storage C, but...
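
As a rough illustration of why a small storage capacitor is costly, the thermal (kTC) noise sampled onto a capacitor is sqrt(kT/C). The sketch below assumes T = 300 K; the capacitance values are illustrative, with 20.3 fF and 43.3 fF taken from the sampling-cell capacitances quoted later in these slides (Csamp alone, and Csw_out + Csamp + Cload).

```python
import math

K_BOLTZMANN = 1.380649e-23  # Boltzmann constant [J/K]


def ktc_noise_vrms(capacitance_f: float, temperature_k: float = 300.0) -> float:
    """RMS thermal noise voltage [V] sampled onto a capacitor: sqrt(kT/C)."""
    return math.sqrt(K_BOLTZMANN * temperature_k / capacitance_f)


# Assumption: room temperature (300 K); capacitances chosen for illustration.
for c_ff in (10.0, 20.3, 43.3, 100.0):
    v_n = ktc_noise_vrms(c_ff * 1e-15)
    print(f"C = {c_ff:6.1f} fF -> {v_n * 1e3:.2f} mV RMS")
```

Halving the noise requires four times the capacitance, which works directly against the bandwidth constraint.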



### Constraint 3: Leakage Current

### Increase C or reduce conversion time to keep droop << 1 mV



Sample channel-channel variation  $\sim$  fA  $\rightarrow$  nA leakage (250nm  $\rightarrow$  130nm)
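
The leakage constraint can be sketched with the droop relation dV = I·t/C. The 20 fF storage capacitance below is a hypothetical round number, and the leakage values simply span the fA to nA range mentioned above.

```python
def droop_v(leakage_a: float, hold_time_s: float, capacitance_f: float) -> float:
    """Voltage droop [V] of a storage capacitor: dV = I * t / C."""
    return leakage_a * hold_time_s / capacitance_f


def max_hold_time_s(leakage_a: float, capacitance_f: float,
                    max_droop_v: float = 1e-3) -> float:
    """Longest hold time [s] that keeps the droop below max_droop_v."""
    return max_droop_v * capacitance_f / leakage_a


C_STORE = 20e-15  # hypothetical storage capacitance [F]
for label, i_leak in (("1 fA", 1e-15), ("1 pA", 1e-12), ("1 nA", 1e-9)):
    t_max = max_hold_time_s(i_leak, C_STORE)
    print(f"{label}: 1 mV droop reached after {t_max * 1e6:.3g} us")
```

With nA-scale leakage the 1 mV budget is used up within tens of nanoseconds, which is why either a larger C or a faster conversion is needed.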

# Target Specifications (design study goals)

| Parameter                                                  | Minimum desired value           |
|------------------------------------------------------------|---------------------------------|
| Sampling frequency (ASIC)                                  | 20 GHz                          |
| Bandwidth (Detector and ASIC)                              | 3 GHz                           |
| Signal to Noise Ratio (Detector and ASIC)                  | 58 dB (V<sub>pp</sub> = 1 V)    |
| Velocity of Propagation (transmission line/strip line)     | 0.35c                           |
| Number of Bits of Resolution                               | 9.4 bits                        |
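
The SNR and bit-resolution rows can be related with the standard ideal-quantizer relation for a full-scale sine wave, SNR = 6.02·N + 1.76 dB. This is a textbook formula offered as a sketch; the slides do not state how the 9.4-bit figure was derived.

```python
def enob_from_snr_db(snr_db: float) -> float:
    """Effective number of bits for a full-scale sine wave (ideal-quantizer relation)."""
    return (snr_db - 1.76) / 6.02


def snr_db_from_bits(bits: float) -> float:
    """SNR [dB] of an ideal quantizer with the given resolution."""
    return 6.02 * bits + 1.76


print(f"58 dB    -> {enob_from_snr_db(58.0):.2f} bits")
print(f"9.4 bits -> {snr_db_from_bits(9.4):.2f} dB")
```

With this relation 58 dB corresponds to about 9.3 bits, close to the 9.4-bit entry in the table.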

### This is an ongoing study – will show where we are

Take the PSEC4 design as a reference



## **PSEC4:** Sampling Analysis

### Utilizing PSEC4's SCA as a starting place

- Adjustable sampling rate between 4-15 GSPS
- 1.6 GHz bandwidth




## Equivalent Circuit



# Simulation Results: Bandwidth for worst case operating bias point

## Gain is the same whether the first or the last switch is on



## Simulation Results: Group Delay

Group delay varies by ~25 ps depending on which switch is on, which puts a constraint on the sampling time window.



## Simulation Results: Phase

• At higher frequencies, the phase vs. frequency behavior also differs depending on which switch is on



## Simulation Results: Capacitance

## Capacitance is 2.2 pF and does not depend on which switch is on



### **PSEC4 Analysis: Single Sampling Cell**




### **Structure & Layout**



#### Top view

#### Side view



## **Single Sampling Cell Coupling**



- Driver circuit
- Switch with n-p FET pair
- Sampling capacitor
- Comparator as load





- Check Csampling capacitance
- Identify Ron and Roff

### **Sampling Capacitor Spread**



Monte Carlo simulation with process variation and mismatch shows a discrepancy between the schematic Csampling (13.5 fF) and the measured mean (20.27 fF).

The spread is about 1.9 fF, which puts the capacitor tolerance at about 9.3%.

| Number of samples | MEAN     | STD     | MIN      | MAX      |
|------------------|----------|---------|----------|----------|
| 1000             | 20.27 fF | 1.89 fF | 14.86 fF | 26.24 fF |
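
The tolerance figure follows directly from the Monte Carlo statistics in the table. A minimal sketch of that arithmetic, using only the tabulated values (variable names are illustrative):

```python
# Values taken from the Monte Carlo table above (all in fF).
C_SCHEMATIC = 13.5
MEAN, STD, C_MIN, C_MAX = 20.27, 1.89, 14.86, 26.24

relative_spread = STD / MEAN                        # 1-sigma tolerance
offset_vs_schematic = (MEAN - C_SCHEMATIC) / C_SCHEMATIC
min_sigma = (C_MIN - MEAN) / STD                    # extremes in units of sigma
max_sigma = (C_MAX - MEAN) / STD

print(f"1-sigma tolerance:       {relative_spread * 100:.1f} %")
print(f"mean above schematic by: {offset_vs_schematic * 100:.0f} %")
print(f"extremes: {min_sigma:+.1f} sigma / {max_sigma:+.1f} sigma")
```

The mean sits about 50% above the schematic value, which is a much larger effect than the 9.3% cell-to-cell spread.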

### Pass Transistor (Switch) Resistance



• Ron = 2.4 kΩ @ 665 mVdc

• Roff is in the GΩ range

• The PFET and NFET are not matched and Ron varies considerably

### **Frequency Analysis**

### **Performance: S(Z)-parameter**



The input impedance is high and it is capacitive.

## Input coupling analysis



The transfer function parts:

- Input parasitic capacitance of the transistor plus the capacitance of the transmission line section
- Series resistance of the transistor channel (Rds)
- Output capacitance, formed by the parasitic capacitance of the transistor, the sampling capacitor and the load capacitance



| Capacitance | Value [fF] |
|-------------|------------|
| Cin_open    | 8          |
| Csw_out     | 10         |
| Csamp       | 20.3       |
| Cload       | 13         |
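
A rough feel for the storage-side pole can be had from a single-pole RC estimate, f = 1/(2π·R·C), using the quoted Ron of 2.4 kΩ (at 665 mVdc) and the capacitances in the table. This is a first-order sketch only: it ignores the input pole and the source impedance, so it is not expected to reproduce the simulated bandwidths exactly.

```python
import math


def rc_corner_hz(r_ohm: float, c_farad: float) -> float:
    """-3 dB corner frequency [Hz] of a single-pole RC low-pass."""
    return 1.0 / (2.0 * math.pi * r_ohm * c_farad)


R_ON = 2.4e3  # switch on-resistance at 665 mVdc [ohm], from the slides
C_SW_OUT, C_SAMP, C_LOAD = 10e-15, 20.3e-15, 13e-15  # [F], from the table

cases = {
    "Csamp only": C_SAMP,
    "Csamp + Csw_out": C_SAMP + C_SW_OUT,
    "Csamp + Csw_out + Cload": C_SAMP + C_SW_OUT + C_LOAD,
}
for name, c_out in cases.items():
    print(f"{name:24s} -> {rc_corner_hz(R_ON, c_out) / 1e9:.2f} GHz")
```

The estimate drops from about 3.3 GHz to about 1.5 GHz as parasitic and load capacitance are added, the same order as the worst-case bandwidths reported in the simulation.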

## Small signal frequency response

[Figure: bandwidth [GHz] versus Vdc [V] for six cases: LowZ ideal, LowZ par, LowZ load&par, 50Z ideal, 50Z par, 50Z load&par. Data cursor at Vdc = 0.65 V, bandwidth = 1.688 GHz.]

- BWworst ≈ 2.3 GHz @ 665 mVdc @ LowZ drive
- BWworst ≈ 1.7 GHz @ 665 mVdc @ 50 Ω drive



• Isolation exceeds 60 dB over the whole parameter space

## Small signal phase analysis



Group Delay with the load



### Large group delay variation points to large distortion

## Large signal response (I)



• Full dynamic range at low frequency; compression appears when reaching the threshold voltage of the PN junctions at the drain/substrate barrier.



• Gain compression at lower and higher amplitudes

## Large signal analysis (II)

### High frequency gain compression & distortion



Three regions of operation:

- Low distortion & High compression
- Moderate distortion & Moderate compression
- High distortion & High compression



### **Understanding signal response**




### Moderate distortion & Moderate compression



- The resistance of the channel varies
- The bandwidth therefore differs at each instantaneous value of the incident voltage waveform
- In the frequency domain this gives rise to higher harmonics, which interfere constructively, increasing the overall signal amplitude but also the distortion



### Harmonic decomposition



- Constructive interference of odd harmonics and destructive interference of even harmonics at the peaks
- Constructive interference of second and third harmonics at zero crossing

### **Frequency domain decomposition**



### **Noise and Distortion**



• Noise is dominated by the ON resistance of the channel

• Total noise is around 0.29 mV ± 0.01 mV

# Noise, distortion and dynamic range

Signal to Noise Ratio at full scale input (1Vpp)



• SNR is around 61.7dB ± 0.3 dB
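
The quoted SNR can be cross-checked from the noise figure on the previous slide. The sketch assumes a full-scale sine wave, so that a 1 Vpp input has an RMS amplitude of Vpp/(2·sqrt(2)); the function name is illustrative.

```python
import math


def snr_db(v_pp: float, noise_vrms: float) -> float:
    """SNR [dB] of a sine wave of peak-to-peak amplitude v_pp over RMS noise."""
    signal_vrms = v_pp / (2.0 * math.sqrt(2.0))  # sine-wave assumption
    return 20.0 * math.log10(signal_vrms / noise_vrms)


# Total noise 0.29 mV +/- 0.01 mV, full-scale input 1 Vpp (from the slides).
for noise_mv in (0.28, 0.29, 0.30):
    print(f"noise = {noise_mv:.2f} mV -> SNR = {snr_db(1.0, noise_mv * 1e-3):.1f} dB")
```

The central value comes out at about 61.7 dB, and the ±0.01 mV noise uncertainty maps to roughly ±0.3 dB.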

# **Distortion analysis**



**Distortion at fixed Frequency** 

Most of the distortion comes from the Ron variation over the input voltage range



# **Transient Response**



- Pedestal error due to charge injection and transistor mismatch dominates
- Best case window time is 0.25 ns, or 4 GHz
- Worst case window time is 0.8 ns, or 1.25 GHz, due to low bandwidth

[Figure: transient response, with annotations at 0.11 ns, 0.52 ns and 900 mV]

# Summary – Requirements comparison

| Measured (worst cases)    | Requirement |
|---------------------------|-------------|
| 1.7 GHz @ 665 mVdc @ 50 Ω | 3 GHz       |
| 1.0 GHz @ 665 mVdc @ 50 Ω | 3 GHz       |
| 61.7 dB                   | 58 dB       |
| 9.8 bits (small region)   | 9.4 bits    |

Things to improve:

- Reduce Ron variance over the dynamic range to reduce distortion and increase the ENOB
- Bandwidth dominated by Cin:
  - Reduce Cin or reshape the channel to increase the bandwidth (first pole)
  - Reduce Ron overall value to increase the bandwidth (second pole)
- Speed dominated by bandwidth:
  - Increase bandwidth
  - Overlapping of sampling cell windows to increase the effective sampling frequency
- Use differential configuration to reduce pedestal error and increase noise coupling and crosstalk immunity

### **Future Plans**

- Now in a detector-limited, not readout-limited, timing regime
- PSEC5 ASIC
  - 256 → 32k sample storage
  - Work to optimize bandwidth, ENOB
  - Persistence effects
- **RFpix ASIC** 
  - Push limits of ABW, timing
  - Below 100-200fs, direct spatial measurement becomes interesting
  - Many practical issues, but none fundamental (cf. 1 ps)
- DRS5, SAMPIC ASICs
  - It will be interesting to see how well they perform

# Back-up slides



### BLAB1 High speed Waveform sampling

- Comparable performance to best CFD + HPTDC
- MUCH lower power, no need for huge cable plant!
- Using full samples reduces the impact of noise
- Photodetector limited

NIM **A602** (2009) 438



### **SL-10 Timing Performance**

[Figure: SL-10 timing distributions, Hawai'i and Nagoya panels (run1351\_ch2 projection, laser scan). One Gaussian fit gives σ = 0.03837 ± 0.00078 ns (σ ≈ 38.37 ps) on a time axis in ns; the other gives σ = 1.554 ± 0.028 counts at 25 ps/count (σ ≈ 38.86 ± 0.70 ps).]

- Nagoya = constant fraction discriminator + CAMAC ADC/TDC
- Hawai'i = waveform sampling + feature extraction

### **Design Choices**

- Input coupling
  - Differential versus single-ended input
  - Needed analog bandwidth
  - Gain needed?
- Sampling Options
  - On-chip PLL/DLL
  - External DLL
  - Analog transfer vs. interrogate in situ
- ADC and readout options
  - Sequential output select vs. random access
  - On-chip vs. off-chip ADC
  - Serial, parallel, massively parallel

Many variants have been explored...

### IRS/BLAB3 Single Channel

- Sampling: 128 (2 x 64) separate transfer lanes
  - Recording in one set of 64 while transferring the other ("ping-pong")
- Storage: 64 x 512 (512 = 8 \* 64)
- Wilkinson (32 x 2): 64 conversions/channel





#### Deeper storage: Buffered LABRADOR (BLAB1) ASIC



- Single channel
  - 64k samples deep, same SCA technique as LAB, no ripple pointer
- Multi-MSa/s to multi-GSa/s
- 12-64 µs to form global trigger
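
The link between buffer depth, sampling rate and available trigger latency is simply latency = depth / rate. The sketch below assumes the 64k depth means 65536 samples (as in the ASIC comparison table); reading the 12-64 µs window as the time before the buffer wraps is an interpretation, not something the slide states.

```python
def buffer_latency_s(depth_samples: int, sample_rate_hz: float) -> float:
    """Time [s] before a circular sample buffer of the given depth wraps around."""
    return depth_samples / sample_rate_hz


def rate_for_latency_hz(depth_samples: int, latency_s: float) -> float:
    """Sampling rate [Sa/s] at which the buffer spans exactly latency_s."""
    return depth_samples / latency_s


DEPTH = 65536  # 64k samples, assumed to mean 2**16
for latency_us in (12.0, 64.0):
    rate = rate_for_latency_hz(DEPTH, latency_us * 1e-6)
    print(f"{latency_us:4.0f} us of buffering <-> {rate / 1e9:.2f} GSa/s")
```

Under that reading, the 12-64 µs range corresponds to sampling rates between roughly 5.5 and 1 GSa/s.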

Arranged as 128 x 512 samples; simultaneous write/read

3 mm x 2.8 mm, TSMC 0.25 µm



# Simulated Performance vs. SNR

300MHz ABW, 5.9GSa/s



### **IRS Input Coupling**



- Input bandwidth depends on two terms:
  - $f_{3dB}[\mathrm{input}] = [2\pi Z C_{tot}]^{-1}$
  - $f_{3dB}[\mathrm{storage}] = [2\pi R_{on} C_{store}]^{-1}$

### Founding WFS ASIC References

- PSI activities (DRS)
  - IEEE/NSS 2008, TIPP09
  - http://midas.psi.ch/drs
- DAPNIA activities
  - MATACQ: IEEE TNS 52-6:2853-2860, 2005 / Patent WO022315
  - SAM: NIM A567 (2006) 21-26.
- Hawaii activities
  - STRAW: Proc. SPIE 4858-31, 2003.
  - PRO: JINST, Vol. 3, P12003 (2008).
  - LABRADOR: NIM A583 (2007) 447-460.
  - BLAB: NIM A591 (2008) 534-545; NIM A602 (2009) 438-445.
  - STURM: EPAC08-TUOCM02, June, 2008.

#### SAMPIC0: a Waveform based TDC chip

#### SAMPIC0: a 16 channel WTDC

- Proof of concept chip already usable with detectors
- Test of CMOS AMS 0.18 µm (low cost, low leakage, 1.8 V technology)
- Compatible with buffered architecture (deadtime-free) => future chips

#### Each channel Self-Triggerable to catch parameters of fast pulses:

- Timing :
  - Coarse = timestamp counter
  - Middle = DLL based TDC also defining a Zone of Interest for sampling
  - Fine = few samples in the ZOI of the sampled waveform
- Waveform Shape, Charge, Amplitude available through samples
- No need for high-end discriminator => low power, versatility
- Short SCA (to accommodate the delay of the discriminator)



AMS CMOS 0.18µm

#### **SAMPIC0** Architecture

- Common "Slow" (160MHz) 12-bit Gray Counter = Coarse Timestamping/ch
- Common Timing generator: servocontrolled DLL: (1-10 GHz) used for middle precision timing & analog sampling commands
- 16 (short) SCA self-triggerable channels:
  - No analog input buffer
  - 64 cells, ~ 50fF capacitor
  - 1.5 GHz Bandwidth
- Several modes of triggering: discri on threshold (+/-), External, Or...
- On-chip fast Wilkinson digitization :
  - 1.3 GHz common Gray counter
  - Tunable ramp slope => trade-off between conversion time and precision: 1.6 µs/11 bit to 200 ns/8 bit
  - Simultaneous conversion of all the SCA cells of the triggered channels
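
The quoted conversion times are consistent with a Wilkinson conversion that counts up to 2^N clock periods of the 1.3 GHz counter. A minimal sketch of that relation, assuming a full-range ramp (the function name is illustrative):

```python
def wilkinson_conversion_time_s(bits: int, counter_clock_hz: float) -> float:
    """Worst-case Wilkinson conversion time [s]: 2**bits counter periods."""
    return (2 ** bits) / counter_clock_hz


COUNTER_CLOCK_HZ = 1.3e9  # common Gray counter frequency from the slides
for bits in (8, 9, 10, 11):
    t_conv = wilkinson_conversion_time_s(bits, COUNTER_CLOCK_HZ)
    print(f"{bits:2d} bit -> {t_conv * 1e9:7.1f} ns")
```

This gives about 197 ns at 8 bits and about 1.58 µs at 11 bits, matching the 200 ns and 1.6 µs quoted above; each extra bit doubles the conversion time.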



- Deadtime = only for triggered channels waiting or in conversion => independent DEADTIME (can be common if required)
- Read-Out through a 12 bit/160 MHz (up to 400) LVDS bus: negligible readout deadtime
- SPI for configuration (Trigger modes, discriminator thresholds (1/ch),...)



### Cascaded Switches Shift Register

- 32 fast sampling cells (10 GSPS)
- 100 ps sample time, 3.1 ns hold time
- Hold time long enough to transfer voltage to secondary sampling stage with moderately fast buffer (300 MHz)
- Shift register gets clocked by inverter chain from fast sampling stage
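
One reading that reproduces the numbers above: with 32 cells cycling at 10 GSPS, each cell samples for one 100 ps period and holds for the remaining 31 periods. The sketch below encodes that assumption; it is not taken from a circuit description.

```python
def sample_period_s(sample_rate_hz: float) -> float:
    """Duration [s] of one sampling period."""
    return 1.0 / sample_rate_hz


def hold_time_s(n_cells: int, sample_rate_hz: float) -> float:
    """Hold time [s] of a cell that samples for 1 period out of n_cells."""
    return (n_cells - 1) * sample_period_s(sample_rate_hz)


N_CELLS, RATE_HZ = 32, 10e9  # from the slides
print(f"sample time: {sample_period_s(RATE_HZ) * 1e12:.0f} ps")
print(f"hold time:   {hold_time_s(N_CELLS, RATE_HZ) * 1e9:.1f} ns")
```

A 3.1 ns hold corresponds to a settling requirement of a few hundred MHz for the transfer buffer, in line with the 300 MHz figure quoted.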











#### Very first measurement results

Performed with a 16-channel mezzanine board compatible with the system (USB/Eth/optical) previously developed for the SAMLONG chips.

Already usable for small experiments or detector tests.

[Figure: measurement plotted versus Freq (MHz)]

# **SAMPIC0: Summary**



| Parameter                                  | Value                               | Unit     |
|--------------------------------------------|-------------------------------------|----------|
| Technology                                 | AMS CMOS 0.18µm                     |          |
| Number of channels                         | 16                                  |          |
| Power consumption                          | 180 (1.8V supply)                   | mW       |
| Discriminator noise                        | 2                                   | mV RMS   |
| SCA depth                                  | 64                                  | Cells    |
| Sampling Speed                             | <3-8.4 (10.2 for 8 channels only)   | GSPS     |
| Bandwidth                                  | 1.6                                 | GHz      |
| Range (Unipolar)                           | 1                                   | V        |
| ADC resolution                             | 8 to 11 (trade-off time/resolution) | bit      |
| SCA noise                                  | <1.3                                | mV RMS   |
| Dynamic range                              | 9.6                                 | Bit RMS  |
| Conversion time                            | 0.2-1.6 (8bit-11bit)                | μs       |
| Readout time (can probably be doubled)     | 25 + 6.2/sample                     | ns       |
| Time precision before correction           | 15                                  | ps RMS   |
| Time precision after timing INL correction | < 5                                 | ps RMS   |
|                                            |                                     |          |
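
The readout and timestamp figures in the summary can be turned into per-hit numbers with simple arithmetic. The sketch below uses only values from the table and architecture slides (25 ns + 6.2 ns/sample, 64 cells, 160 MHz 12-bit counter); reading all 64 cells per hit is an assumption made for illustration.

```python
def readout_time_ns(n_samples: int, overhead_ns: float = 25.0,
                    per_sample_ns: float = 6.2) -> float:
    """Readout time [ns] for one channel: fixed overhead plus per-sample cost."""
    return overhead_ns + per_sample_ns * n_samples


def timestamp_range_s(counter_bits: int, clock_hz: float) -> float:
    """Time span [s] covered by a counter before it wraps around."""
    return (2 ** counter_bits) / clock_hz


# Assumption: all 64 SCA cells of a triggered channel are read out.
print(f"readout, 64 samples: {readout_time_ns(64):.1f} ns")
print(f"coarse tick:         {1e9 / 160e6:.2f} ns")
print(f"coarse range:        {timestamp_range_s(12, 160e6) * 1e6:.1f} us")
```

Reading fewer samples from the zone of interest shortens the readout proportionally, since the per-sample term dominates.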