





# Picosecond time measurement using ultra fast analog memories.

#### D.Breton & J.Maalmi (LAL Orsay), E.Delagnes (CEA/IRFU)



D.Breton, E.Delagnes, J.Maalmi - TWEPP 2009 - September 22<sup>nd</sup> 2009




- The story begins in 1992 with the design of the first prototype of the Switched Capacitor Array (SCA) for the ATLAS LARG calorimeter. After 10 years of development, the main final characteristics of this rad-hard circuit were:
  - 12 pseudo-differential channels
  - 40 MHz sampling
  - 13.6-bit dynamic range with simultaneous write/read
  - 80000 chips produced in 2002 and mounted on the detector.
- Since 2002, 3 new generations of fast samplers have been designed (ARS, MATACQ, SAM): total of more than 30000 chips in use.
- Our design philosophy:
  - 1. Maximize dynamic range and minimize signal distortion.
  - 2. Minimize need for calibrations and off-chip data corrections.
  - 3. Minimize **costs** (both for development & production):
    - Use of inexpensive pure CMOS technologies ( $0.8\mu m$  then  $0.35\mu m$ );
    - Use of packaged chips (cheap QFP).






- High dynamic range => 4-switch memory cells:
  - Voltage-mode writing.
  - Floating voltage-mode reading with read amplifier.
  - $\Rightarrow$  Gain and pedestal spread insensitive to capacitor mismatches.
  - Sequencing of S1-S2 switch opening.
  - $\Rightarrow$  Sampling time very well defined and independent of signal amplitude.



- Use of analog input buffer (voltage follower):
  - Keeps the real input impedance very high to avoid signal distortion
  - Penalty in power consumption and bandwidth.




- Relatively high value of storage capacitance (200 fF to 1 pF):
  - minimizes both kT/C and readout noise.
- Use of differential channels:
  - Coupling and noise rejection.
  - Low signal distortion.
  - Easier interface with modern commercial ADCs.
- Use of internally servo-controlled Delay Lines (DLL) to define the time steps:
  - No need for timing calibration for standard applications.
  - Stability with temperature.
  - With on-chip phase detector and charge pump, fast setup time for the servo-control is possible => sampling DLL
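To give an order of magnitude for the kT/C argument above, the following minimal sketch evaluates the thermal sampling noise $\sqrt{kT/C}$ for the two ends of the quoted capacitance range. The 300 K temperature and the function name are assumptions made for illustration; they are not taken from the slides.

```python
import math

K_BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K


def ktc_noise_volts(capacitance_f, temperature_k=300.0):
    """RMS thermal sampling noise sqrt(kT/C) on a capacitor, in volts.

    temperature_k defaults to 300 K, an assumed operating temperature.
    """
    return math.sqrt(K_BOLTZMANN * temperature_k / capacitance_f)


# The two ends of the storage capacitance range quoted on the slide.
for c in (200e-15, 1e-12):
    noise_uv = ktc_noise_volts(c) * 1e6
    print(f"C = {c * 1e15:.0f} fF -> kT/C noise = {noise_uv:.0f} uV rms")
```

Under these assumptions the noise goes from roughly 140 µV rms at 200 fF down to roughly 65 µV rms at 1 pF, which is why a larger storage capacitor helps the dynamic range.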





# The Sampling Matrix Structure: main features






- The analog bus is an RC delay line:
  - => the delay depends on the sampling cell position.
  - => so does the overall bandwidth, especially if it is not limited by that of an input amplifier or of the intrinsic sampling cell.
- Short analog buses are better for BW uniformity => segmentation into parallel lines
  - => much less distortion



# The SAM (Swift Analog Memory) chip

• This chip was first designed for the HESS2 experiment: a large Atmospheric Cherenkov Telescope located in the Namibian desert.



NIM A, Volume 567, Issue 1, p. 21-26, 2006

6000 ASICs delivered in Q2 2007, yield of 95%.


- 2 differential channels
- 256 cells per channel
- BW > 250 MHz
- Sampling Freq: 700MHz-2.5GHz


- High Readout Speed >16 MHz
- Smart Read pointer (integrate a 1/Fs step TDC)
- Few external signals
- Many modes configurable by a serial link.
- Auto-configuration @ power on
- Low cost for medium-size production => AMS 0.35 μm





# The USB WaveCatcher prototype board


Reference clock:







• This board was first designed for reflectometry applications.

Pulsers for reflectometry applications

- At the same time, we got involved in a worldwide picosecond working group.
  - Analog memories seemed to be perfect candidates for precision measurements ...
- => we decided to try to push the board's performances to their maximum!





- No offline correction except the subtraction of the fixed pedestal distribution



75 mV amplitude, 1ns FWHM pulse. 3.2GS/s





2ns FWHM consecutive pulses, separated by 22ns, (300mV & 170mV amplitude). 3.2 GS/s

The goal of the following study is to measure the board's capacity to perform the measurement of the time difference between two pulses (like a TDC but directly with analog pulses !).




- ENOB is not log(max signal / noise) / log(2), as is often said.
- ENOB = (10 log10(sine power / residual power) - 1.76) / 6.02.
- Depends on input sine-wave frequency, noise & jitter.
- Contribution of jitter to ENOB = $(-20 \log_{10}(2\pi \cdot \sigma \cdot F_{\text{sine}}) - 1.76)/6.02$.
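As a rough illustration of the jitter term, the sketch below evaluates the standard jitter-limited signal-to-noise ratio, $-20\log_{10}(2\pi\,\sigma\,F_{\text{sine}})$ in dB, and converts it to bits with the usual 1.76 dB and 6.02 dB/bit constants. The function names and the 10 ps example value are illustrative assumptions.

```python
import math


def enob_from_sinad(sinad_db):
    """Convert a signal-to-noise-and-distortion ratio in dB to effective bits."""
    return (sinad_db - 1.76) / 6.02


def jitter_limited_enob(sigma_s, f_sine_hz):
    """ENOB ceiling set by an rms sampling jitter sigma_s (seconds) alone."""
    snr_db = -20.0 * math.log10(2.0 * math.pi * sigma_s * f_sine_hz)
    return enob_from_sinad(snr_db)


# Example: 135 MHz sine wave with an assumed 10 ps rms jitter.
print(f"{jitter_limited_enob(10e-12, 135e6):.2f} bits")
```

With these definitions, doubling either the jitter or the sine-wave frequency costs about one effective bit.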






Mismatches of elements in the delay chain induce:

=> dispersion of delay duration

=> error on the sampling time.

Fixed for a given tap => "Fixed Pattern Aperture Jitter"

- Dispersion of single delays => time DNL.
- Cumulative effect => time INL. Gets worse with delay line length.
- Systematic effect => non equidistant samples (bad for FFT).
  - => correction with Lagrange polynomial interpolation. Drawbacks: computing power.
  - => good (and easy) calibration required.
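The correction mentioned above can be sketched as follows: once the true sampling instants are known from calibration, each point of the uniform time grid is re-evaluated with a local Lagrange polynomial through its nearest actual samples. This is a minimal illustration, not the authors' implementation; the function names, the 4-point window and the 10 ps timing errors in the demo are assumptions.

```python
import numpy as np


def lagrange_eval(t_nodes, v_nodes, t):
    """Evaluate the Lagrange polynomial through (t_nodes, v_nodes) at time t."""
    total = 0.0
    for i, (ti, vi) in enumerate(zip(t_nodes, v_nodes)):
        weight = 1.0
        for j, tj in enumerate(t_nodes):
            if j != i:
                weight *= (t - tj) / (ti - tj)
        total += vi * weight
    return total


def resample_uniform(t_actual, v, t_uniform, n_points=4):
    """Resample v(t_actual) onto t_uniform with local n_points-point polynomials."""
    t_actual = np.asarray(t_actual, dtype=float)
    v = np.asarray(v, dtype=float)
    out = np.empty(len(t_uniform))
    for k, t in enumerate(t_uniform):
        idx = int(np.searchsorted(t_actual, t))  # first actual sample at or after t
        start = min(max(idx - n_points // 2, 0), len(t_actual) - n_points)
        window = slice(start, start + n_points)
        out[k] = lagrange_eval(t_actual[window], v[window], t)
    return out


def sine(t):
    """135 MHz, 1.4 Vpp test signal."""
    return 0.7 * np.sin(2.0 * np.pi * 135e6 * t)


# Demo: 3.2 GS/s nominal grid with assumed 10 ps rms fixed timing errors.
rng = np.random.default_rng(0)
t_uniform = np.arange(64) * 312.5e-12
t_actual = t_uniform + rng.normal(0.0, 10e-12, t_uniform.size)
samples = sine(t_actual)
corrected = resample_uniform(t_actual, samples, t_uniform)
print("rms error, uncorrected:", np.std(samples - sine(t_uniform)))
print("rms error, corrected:  ", np.std(corrected - sine(t_uniform)))
```

The nested loops make the cost per output point grow with the square of the window size, which is the computing-power drawback noted above.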





- 2 sources of aperture jitter:
  - Random Aperture Jitter (RAJ).
  - Fixed Pattern aperture Jitter (FPJ).
- Inside the DL the jitters are cumulative. Assuming there is no correlation:
- For RAJ, the aperture jitter at tap $j$ is

$$\sigma_{Rj} = \sqrt{j} \cdot \sigma_{Rd}$$

where $\sigma_{Rd}$ is the random jitter added by a delay tap.

- For FPJ in a free-running system:

$$\sigma_{FPj} = \sqrt{j} \cdot \sigma_{FPd}$$

- For FPJ if the total delay is servo-controlled:

$$\sigma_{FPj} = \sqrt{\frac{j \cdot (N-j)}{N}} \cdot \sigma_{FPd}$$

where $\sigma_{FPd}$ is the fixed pattern jitter added by a delay tap ($\sigma_{DNL}$) and $N$ is the DL length.
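The three expressions above can be compared numerically with a short sketch. The 16-tap line and the 1 ps per-tap jitter used in the demo are assumed values chosen for illustration only.

```python
import math


def random_jitter_at_tap(j, sigma_rd):
    """Random aperture jitter accumulated after j taps."""
    return math.sqrt(j) * sigma_rd


def fixed_pattern_jitter_free_running(j, sigma_fpd):
    """Fixed pattern jitter after j taps when the line is free running."""
    return math.sqrt(j) * sigma_fpd


def fixed_pattern_jitter_servo(j, n_taps, sigma_fpd):
    """Fixed pattern jitter after j taps when the total delay of the
    n_taps-long line is servo-controlled (error pinned to zero at both ends)."""
    return math.sqrt(j * (n_taps - j) / n_taps) * sigma_fpd


# Assumed example: 16-tap delay line, 1 ps rms fixed pattern jitter per tap.
N_TAPS = 16
for tap in (1, 4, 8, 12, 16):
    free = fixed_pattern_jitter_free_running(tap, 1.0)
    servo = fixed_pattern_jitter_servo(tap, N_TAPS, 1.0)
    print(f"tap {tap:2d}: free running {free:.2f} ps, servo-controlled {servo:.2f} ps")
```

In the servo-controlled case the jitter peaks in the middle of the line at $\sqrt{N}/2 \cdot \sigma_{FPd}$ and returns to zero at the last tap, whereas the free-running value keeps growing up to $\sqrt{N} \cdot \sigma_{FPd}$.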

# Short and servo-controlled DL => Less Jitter (both kinds)




#### Block diagram of clock distribution on the prototype






- Method: 135MHz-1.4Vpp sine-wave sampled by SAM
- Search of zero-crossing segment => length and position (cell).
  - Higher frequency => 320-ps segments are not straight enough
  - Lower frequency => more jitter because of noise
- Histogram of length[position]:
  - proportional to the time step duration, assuming sine = straight line (bias ~ 1ps rms).
  - mean\_length[position] = fixed pattern effect => DNL => INL
  - sigma\_length[position] = random effect => Random Jitter
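The method above can be tried on simulated data. The sketch below is a toy model, not the authors' analysis code: it samples an asynchronous 135 MHz, 1.4 Vpp sine wave with a 256-cell sampler at a nominal 312.5 ps step, then histograms the zero-crossing segment length per cell. The per-cell step dispersion (5 ps rms), the random jitter (2 ps rms per sample), the event count and all names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

N_CELLS = 256            # cells per channel
T_NOMINAL = 312.5e-12    # nominal time step at 3.2 GS/s
F_SINE = 135e6           # sine-wave frequency
AMPLITUDE = 0.7          # 1.4 Vpp
SIGMA_DNL = 5e-12        # assumed fixed step dispersion (rms)
SIGMA_RANDOM = 2e-12     # assumed random jitter per sample (rms)
N_EVENTS = 2000          # assumed number of recorded events

# Hidden "true" step durations; the mean is forced to the nominal step,
# as a servo-controlled delay line would do.
true_steps = T_NOMINAL + rng.normal(0.0, SIGMA_DNL, N_CELLS - 1)
true_steps += T_NOMINAL - true_steps.mean()
cell_times = np.concatenate(([0.0], np.cumsum(true_steps)))


def segment_statistics(n_events):
    """Return mean and sigma of the zero-crossing segment length per position."""
    phases = rng.uniform(0.0, 2.0 * np.pi, (n_events, 1))  # asynchronous sine
    jitter = rng.normal(0.0, SIGMA_RANDOM, (n_events, N_CELLS))
    v = AMPLITUDE * np.sin(2.0 * np.pi * F_SINE * (cell_times + jitter) + phases)
    crossing = v[:, :-1] * v[:, 1:] < 0.0   # segment straddles zero
    length = np.where(crossing, np.abs(np.diff(v, axis=1)), np.nan)
    return np.nanmean(length, axis=0), np.nanstd(length, axis=0)


mean_length, sigma_length = segment_statistics(N_EVENTS)

# Volts-to-seconds scale: the average segment corresponds to the nominal step.
volts_to_seconds = T_NOMINAL / mean_length.mean()
est_steps = mean_length * volts_to_seconds    # fixed pattern effect (DNL)
est_random = sigma_length * volts_to_seconds  # random effect per segment

print("rms error on recovered steps (ps):",
      1e12 * np.sqrt(np.mean((est_steps - true_steps) ** 2)))
print("median random spread per segment (ps):", 1e12 * np.median(est_random))
```

Note that the spread measured on a segment combines the jitter of its two end samples, so in this model it is about $\sqrt{2}$ times the per-sample value, plus a small contribution from the curvature of the sine wave.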






# Fixed pattern jitter

- DNL => mean segment lengths. Modulo 16 pattern. Integrated and fitted to measure the INL.



- INL => segments have a modulo 16 pattern + slow pattern.
  - Used for third degree Lagrange polynomial correction of data
- Advantage of servo-controlled structure: very small dependence on time and temperature
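The relation between DNL and INL used here can be sketched in a few lines: the DNL is the deviation of each step from the nominal step, and the INL is its running sum. The sketch also folds the DNL modulo 16 to expose a periodic pattern. It uses a plain cumulative sum without the fit mentioned above, and the function names and the synthetic pattern are assumptions.

```python
import numpy as np


def dnl_and_inl(step_durations, nominal_step):
    """DNL = per-step deviation from nominal; INL = cumulative sum of the DNL.

    inl[k] is the timing error of sample k relative to an ideal uniform grid.
    """
    dnl = np.asarray(step_durations, dtype=float) - nominal_step
    inl = np.concatenate(([0.0], np.cumsum(dnl)))
    return dnl, inl


def fold_modulo(values, period=16):
    """Average values over all positions sharing the same index modulo period."""
    values = np.asarray(values, dtype=float)
    usable = (len(values) // period) * period
    return values[:usable].reshape(-1, period).mean(axis=0)


# Synthetic example: a zero-mean 16-step pattern repeated along 256 steps.
rng = np.random.default_rng(2)
pattern = rng.normal(0.0, 5e-12, 16)
pattern -= pattern.mean()
steps = 312.5e-12 + np.tile(pattern, 16)

dnl, inl = dnl_and_inl(steps, 312.5e-12)
print("largest |INL| (ps):", 1e12 * np.max(np.abs(inl)))
print("recovered modulo 16 pattern (ps):", np.round(1e12 * fold_modulo(dnl), 2))
```

Because the synthetic pattern has zero mean over 16 steps, the INL returns to zero every 16 samples, which mimics the behaviour of a servo-controlled line.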



#### Random jitter





- Very encouraging random jitter floor ~ 2 ps rms
- But peaks on "transition" samples up to ~20 ps (mean jitter ~3ps)
  - Understood: due to the clock jitter, which can be seen only on the last cell of the DLLs
  - The oscillator is supposed to deliver a clean clock => probable source: the FPGA



#### First block diagram of clock distribution






# Fixed pattern jitter with direct clock connection




• Slight improvement on DNL and INL ...



# Random jitter with direct clock connection





Huge improvement on "transition" samples (now 3 to 3.5 ps max).

=> The FPGA indeed adds a lot of jitter to the clock!

=> Mean jitter now ~2.2 ps rms ...



# But also ...



• DNL & INL also randomly exhibit 3 different modulo 4 patterns at power-up !

- Example:



- It took us several weeks to understand the problem!
  - It is actually due to a coupling between the temporal structure of the current consumed by the FPGA core (+1.5V supply) and the SAM write clock!
- The coupling location remains uncertain. Probably inside the PCB power planes.
- Now, how to get rid of it? ...



#### First block diagram of clock distribution






## Fixed pattern jitter with N=11 and M=21





• No big difference in DNL and INL !



# Random jitter with N=11 and M=21




=> but no more modulo patterns !!!

- Now the results are perfectly reproducible
- The INL correction seems to be stable over a long period of time (days at least) ⇒ could be stored in EEPROM on-board like the cell pedestals
- The correction works rather well for other input frequencies between 100 and 200MHz, with a residual INL always remaining below 2.5ps.

=> this validates the correction method.



# Block diagram of clock distribution on the new board






- Source: asynchronous pulse summed with itself reflected at the end of an open cable.
- Time difference between the two pulses extracted by crossing of a fixed threshold determined by polynomial interpolation of the 4 neighboring points (on 3000 events).



$\sigma_{\Delta t} \sim 11$ ps rms => jitter for a single pulse $= 11/\sqrt{2} \approx 8$ ps!
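The extraction of a threshold-crossing time from sampled data can be sketched as below: a cubic is drawn through the 4 samples surrounding the crossing and solved for the threshold. This is a minimal illustration under assumptions, not the board's firmware or the authors' software; the function names, the Gaussian pulse shape, the 2 ns FWHM and the pulse positions in the demo are all hypothetical.

```python
import numpy as np


def threshold_crossing_time(t, v, threshold, start=0):
    """Time of the first rising crossing of threshold at or after index start,
    from a cubic through the 4 samples around the crossing."""
    t = np.asarray(t, dtype=float)
    v = np.asarray(v, dtype=float)
    above = v >= threshold
    rising = np.flatnonzero(~above[start:-1] & above[start + 1:])
    if rising.size == 0:
        raise ValueError("no rising threshold crossing found")
    i = start + int(rising[0])               # v[i] < threshold <= v[i + 1]
    lo = min(max(i - 1, 0), len(t) - 4)      # first of the 4 neighbouring points
    step = t[i + 1] - t[i]
    x = (t[lo:lo + 4] - t[i]) / step         # normalised time, crossing in [0, 1]
    coeffs = np.polyfit(x, v[lo:lo + 4], 3)  # exact cubic through 4 points
    coeffs[-1] -= threshold
    roots = np.roots(coeffs)
    real = roots[np.abs(roots.imag) < 1e-9].real
    inside = real[(real >= -1e-9) & (real <= 1.0 + 1e-9)]
    if inside.size == 0:                     # fall back to linear interpolation
        frac = (threshold - v[i]) / (v[i + 1] - v[i])
    else:
        frac = inside[0]
    return t[i] + frac * step


def single_pulse_sigma(sigma_dt):
    """Per-pulse jitter if both pulses contribute equally and independently."""
    return sigma_dt / np.sqrt(2.0)


def gaussian_pulse(t, t0, amplitude, fwhm):
    """Hypothetical pulse shape used only for this demo."""
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    return amplitude * np.exp(-0.5 * ((t - t0) / sigma) ** 2)


# Demo: two identical pulses 22 ns apart, sampled at 3.2 GS/s.
t = np.arange(256) * 312.5e-12
v = gaussian_pulse(t, 20.13e-9, 0.3, 2e-9) + gaussian_pulse(t, 42.13e-9, 0.3, 2e-9)
t1 = threshold_crossing_time(t, v, 0.15)
t2 = threshold_crossing_time(t, v, 0.15, start=int(np.searchsorted(t, 31e-9)))
print("measured delay (ns):", 1e9 * (t2 - t1))
print("single-pulse jitter for 11 ps on the difference (ps):", single_pulse_sigma(11.0))
```

The division by $\sqrt{2}$ assumes that the two pulses contribute equal and uncorrelated timing errors to the measured difference.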





- 2 DC-coupled 256-deep channels with 50-Ohm active input impedance
- $\pm 1.25$ V dynamic Range, with full range 16-bit individual tunable offsets
- 2 individual pulse generators for reflectometry applications.
- On-board charge integration calculation.
- Bandwidth > 500MHz
- Signal/noise ratio: 11.9 bits rms (noise =  $630 \mu V RMS$ )
- Sampling Frequency: 400MS/s to 3.2GS/s
- Max consumption on +5V: 0.5A
- Absolute time precision in a channel (typical):
  - without INL calibration: 20ps rms (400MS/s to 1.6GS/s), 16ps rms (3.2GS/s)
  - after INL calibration: 12ps rms (400MS/s to 1.6GS/s), 8ps rms (3.2GS/s)
- Relative time precision between channels: still to be measured.
- Trigger source: software, external, internal, threshold on signals
- Acquisition rate (full events): up to $\sim 1.5$ kHz over 2 full channels
- Acquisition rate (charge mode): up to $\sim 40$ kHz over 2 channels
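The quoted signal/noise figure can be cross-checked with a one-line calculation, assuming it is defined as the base-2 logarithm of the full input span (2.5 V for a $\pm 1.25$ V range) over the rms noise. That definition is an assumption; the slides do not spell it out.

```python
import math


def dynamic_range_bits(full_scale_v, noise_rms_v):
    """Dynamic range in bits, taken as log2(full span / rms noise)."""
    return math.log2(full_scale_v / noise_rms_v)


# +/-1.25 V range (2.5 V span) and 630 uV rms noise, as listed above.
print(f"{dynamic_range_bits(2.5, 630e-6):.2f} bits rms")
```

Under that assumption the result is consistent with the 11.9 bits rms listed for the board.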

Acquisition software with graphical interface will be available soon



See dedicated poster (+ demo)




- We are collaborating on the design of a new TDC in the IBM 130nm technology
- This is a collaboration between the University of Chicago, Orsay and Saclay
- The goal is to reach ps precision by adding analog memories sampling at very high frequency (20GS/s) to a usual DLL-based TDC.
- Input clock frequency should be 312.5 MHz.



- A first prototype was submitted by our colleagues in Chicago in June 2009
  => it includes all the elements of a measurement channel except the discriminator.
- In the near future, we aim at building a 16-channel chip.




- We built a USB board to push the SAM chip towards its limits.
- Timing measurements showed a resolution of ~16 ps rms without time INL correction, and less than 10 ps after correction (SAM wasn't even designed for this purpose!).
- The board will soon be tested with MCP-PMTs for low-jitter light-to-time conversion.
- The first boards of the last version are being cabled this week.
- Tests showed us that analog memories look perfectly suited for ps time measurement.
  - => no need for analog to digital pulse conversion, low power and low cost !
- This experience gave us new guidelines for future chips to improve timing performances.
  - We are now convinced that a single chip can't be optimum for all applications (long depth vs time precision).
  - A new chip based on SAM was submitted last week to test higher bandwidth and lower power.
  - The next circuit will be submitted at the end of the year: 16 channels, 4-5GS/s sampling frequency, larger BW (700MHz ?), larger depth (512 pts/ch), same technology (pure CMOS 0.35µm).
- A new 130-nm pico-second TDC using ultra-fast analog memories is under design.