

# Ps workshop, Kansas City, September 17th 2016








# STATUS OF DEVELOPMENTS ON THE SAMPIC WAVEFORM TDC

# <u>D. Breton<sup>2</sup>, E. Delagnes<sup>1</sup></u>, H. Grabas<sup>1,3</sup>, O. Lemaire<sup>2</sup>, J. Maalmi<sup>2</sup>, P. Rusquart<sup>2</sup>, P. Vallerand<sup>2</sup>

- <sup>1</sup> CEA/IRFU Saclay (France)
- <sup>2</sup> CNRS/IN2P3/LAL Orsay (France)
- <sup>3</sup> Now with SCIPP Santa Cruz (USA)

This work has been funded by the P2IO LabEx (ANR-10-LABX-0038) in the framework « Investissements d'Avenir » (ANR-11-IDEX-0003-01) managed by the French National Research Agency (ANR).



### **INTRODUCTION**

- SAMPIC\_V1 was submitted in November 2014 and considered as releasable ~ one year ago.
- Different groups then started using the modules on their test benches or in test beam.
- On our side, we pursued the characterization of the chip.
- We got a lot of feedback concerning the chip, the module and the software. One point raised was that improving the system aspects was more urgent than concentrating on the already nice time resolution.
- During the last 12 months, we have thus mostly worked on improving the digital electronics (SAMPIC & FPGA) and the software.
- I will concentrate here on the aspects concerning the chip evolutions.

### THE « WAVEFORM TDC » CONCEPT (WTDC)

**WTDC**: a TDC which permits taking a picture of the signal. This is done via sampling and digitizing the interesting part of the signal.

Based on the digitized samples, making use of a digital algorithm, fine time information can be extracted.





#### Advantages:

- Time resolution ~ few ps
- No "time walk" effect
- Possibility to extract other signal features: charge, amplitude...

#### Drawbacks:

- dead-time linked to conversion and readout which doesn't permit counting rates as high as with a classical TDC

### THE « WAVEFORM TDC » STRUCTURE

- Overall time information is obtained by combining 3 times:
  - **Coarse** = Timestamp Gray Counter (few ns step)
  - **Medium** = trigger position marked on the DLL cells
  - **Fine** = samples of the waveform (a digital algorithm gives a precision of a few ps)
- Discriminator is used only for triggering, not for timing => no jitter added on measurement, low power
- Digitized waveform available to extract other parameters (Q, amplitude, ...)
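A minimal sketch of how the three times could be combined, assuming that the DLL spans exactly one timestamp clock period with 64 cells (for example a 100 MHz clock at 6.4 GS/s) and that the fine time is expressed relative to the trigger cell. The function name, parameters and conventions are illustrative assumptions, not the actual SAMPIC data format.

```python
def event_time_ns(coarse_count, trigger_cell, fine_ns,
                  clock_mhz=100.0, n_cells=64):
    """Combine coarse, medium and fine times into a single time in ns.

    coarse_count : timestamp counter value, already converted from Gray code
    trigger_cell : index of the DLL cell marked by the trigger (medium time)
    fine_ns      : time extracted from the waveform by the digital algorithm,
                   relative to the trigger cell (assumed convention)
    """
    clock_period_ns = 1e3 / clock_mhz          # coarse step
    cell_step_ns = clock_period_ns / n_cells   # one DLL step (assumed)
    coarse_ns = coarse_count * clock_period_ns
    medium_ns = trigger_cell * cell_step_ns
    return coarse_ns + medium_ns + fine_ns


# Example: 3 clock periods, trigger in cell 32, 12 ps from the algorithm
t = event_time_ns(coarse_count=3, trigger_cell=32, fine_ns=0.012)
```

With these assumed settings the coarse step is 10 ns and the medium step is about 156 ps, so only the last term needs ps-level precision.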



### **GLOBAL ARCHITECTURE**



- One common 12-bit Gray counter (FClk up to 160 MHz) for coarse timestamping.
- One common servo-controlled DLL (from 1.6 to 10.2 GHz) used for medium precision timing & analog sampling.
- 16 independent WTDC channels, each with:
  - 1 discriminator for self triggering
  - Registers to store the timestamps
  - 64-cell deep SCA analog memory
  - One 11-bit ADC/cell (total: 64 x 16 = 1024 on-chip ADCs)
- One common 1.3 GHz oscillator + counter used as timebase for all the Wilkinson A to D converters.
- Read-out interface: 12-bit LVDS bus running at 160 MHz (2 Gbits/s).
- SPI link for slow control.
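The coarse timestamp comes from a Gray counter, so the stored value has to be converted back to plain binary before it can be used in arithmetic. The sketch below shows the standard conversion in both directions; the function names are illustrative and nothing here is taken from the SAMPIC firmware or software.

```python
def binary_to_gray(value: int) -> int:
    """Encode a non-negative integer in reflected binary (Gray) code."""
    return value ^ (value >> 1)


def gray_to_binary(gray: int) -> int:
    """Decode a Gray-coded value by XOR-ing all right shifts of the code."""
    binary = 0
    while gray:
        binary ^= gray
        gray >>= 1
    return binary


# Round trip over the full range of a 12-bit counter
assert all(gray_to_binary(binary_to_gray(n)) == n for n in range(1 << 12))
```

Only one bit changes between consecutive Gray codes, which is why this kind of counter can be latched asynchronously by a trigger without large decoding errors.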

### ANALOG MEMORY (SCA) IN EACH CHANNEL

- 64-cell deep, trade-off between:
  - Time precision / stability (=> short)
  - Bandwidth uniformity (=> short)
  - Time available for trigger latency (=> long)
- No input buffer, single ended
- 3-switch cell structure to reduce leakages and ghosts.
   Switch 3 also isolates from input bus during conversion
- ~ 1 V usable range, > 1.5 GHz BW



 Trigger position marked on DLL cells => medium precision timing and used for Optional Region of Interest Readout (only few samples read)





### The "3-switch" memory cell

Layout size: $20 \times 10\ \mu m^2$

#### Design constraints:

- Settling time at 10<sup>-3</sup> within 800ps (8 cells @10GS/s)
- Bandwidth > 1.5 GHz
- Non linearity < 1%
- Dynamic range ~ 1V
- 3 switch memory cell developed for reducing the leakage currents and the ghost effect (residue of event N-1 on event N)

Charge injection when opening switch 2 => major source of Integral Non Linearity => the size of switch 2 is a trade-off between:

- R<sub>DSon</sub> / Bandwidth ($\rightarrow$ wide W<sub>MOS</sub>)
- Injected charge / INL ($\rightarrow$ narrow W<sub>MOS</sub>)





Simulation results: INL ~ 0.2% max for Vin from 50 mV to 1 V

### SAMPIC LAYOUT



- Technology: AMS CMOS 0.18µm
- Surface: 8 mm<sup>2</sup>
- Package: QFP 128 pins, step of 0.4mm





### **PROTOTYPING & ACQUISITION SETUP**

- 32-channel module integrating 2 mezzanines
- 1 SAMPIC/mezzanine
- USB, Gbit Ethernet UDP (new: special secured version developed by Jihane with no data loss)



- Acquisition software (& soon C libraries)
- => full characterization of the chip & module
- Timing extraction (dCFD, interpolation...)
- Special display for WTDC mode
- Data saving on disk.
- Used by all SAMPIC users.





### SAMPIC PERFORMANCES

- Wilkinson ADC works as expected with 1.3 GHz clock
- Dynamic range of 1V with a 0.5mV LSB when coding over 11 bits
- Gain dispersion between cells ~ 1% rms
- Non linearity <1.4 % peak to peak







- After correction of each cell (linear fit):
  - noise = 0.95 mV rms (at all sampling frequencies)
  - equivalent to ~10 bits rms of dynamic range
- Discriminator noise ~ 2 mV rms
- Power consumption: 10 mW/channel
- 3 dB bandwidth: 1.6 GHz
- Counting rate > 2 Mevts/s (full chip, full waveform), up to 10 Mevts/s with ROI
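The LSB and dynamic range figures quoted above can be cross-checked with a few lines of arithmetic. This is only a sketch using the numbers from this slide (1 V range, 11 bits, 0.95 mV rms noise); the function names are illustrative.

```python
import math


def adc_lsb_mv(range_v=1.0, n_bits=11):
    """LSB size in mV for a given input range and number of bits."""
    return range_v * 1e3 / 2 ** n_bits


def dynamic_range_bits(range_v=1.0, noise_mv_rms=0.95):
    """Dynamic range in bits rms, defined here as log2(range / rms noise)."""
    return math.log2(range_v * 1e3 / noise_mv_rms)


lsb = adc_lsb_mv()                # about 0.49 mV, quoted as 0.5 mV
dr_bits = dynamic_range_bits()    # slightly above 10 bits rms
```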

### Time Difference Resolution (TDR)





### $\Delta T$ RESOLUTION VS DELAY





- TDR < 5 ps rms after time correction.
- TDR is constant for  $\Delta t > 10$ ns

- TDR is ~ unchanged when using 2 chips from 2 mezzanines (slope here comes from slower risetime of 800 ps)
  - => measurements are uncorrelated
  - => channel single pulse timing resolution is < 3.5 ps rms (5 ps/√2)
- From these 2 types of measurements, we could extract the jitter from the motherboard clock source: ~ 2.2 ps rms
  - => SAMPIC's own jitter < 2.5 ps rms
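As a worked step, and assuming that the two channels contribute equal and uncorrelated jitter, the single pulse figure follows from the measured time difference resolution:

$$\sigma_{\Delta t}^2 = \sigma_{1}^2 + \sigma_{2}^2 = 2\,\sigma_{\mathrm{ch}}^2 \quad\Rightarrow\quad \sigma_{\mathrm{ch}} = \frac{\sigma_{\Delta t}}{\sqrt{2}} < \frac{5\ \mathrm{ps}}{\sqrt{2}} \approx 3.5\ \mathrm{ps\ rms}$$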


### **TRICKS FOR UNDERSTANDING RESOLUTION**

- This is how we measure the contributions to the resolution: we run at 6.4 GS/s, send two 500 mV pulses separated by 2.5 ns to two channels:
  - 1. of the same mezzanine
  - 2. of two different mezzanines

Same chip

- From this we can extract that the jitter contribution is:
  - ~ 1.5 ps rms from the DLL
  - ~ 1.8 ps rms from the clock distribution on the motherboard
  - ~ 2.4 ps rms from the clock distribution on the mezzanine
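If these three contributions are independent, they would add in quadrature. The slide does not state how they combine, so the sketch below is an assumption used only to show the order of magnitude of the total; the function name is illustrative.

```python
import math


def quadrature_sum(*contributions_ps):
    """Combine independent rms jitter contributions (all in ps rms)."""
    return math.sqrt(sum(c * c for c in contributions_ps))


# DLL, motherboard clock distribution, mezzanine clock distribution
total_ps = quadrature_sum(1.5, 1.8, 2.4)   # about 3.35 ps rms
```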





Different chips

#### TIMING RESOLUTION VS RATE

1ns FWHM, 400ps risetime, 0.7V signals sent to 2 channels of SAMPIC

- 7.1ns delay by cable, 6.4 GS/s, 11-bit mode, 64 samples, both INLs corrected
- Rate is progressively increased.



The measured delay and its resolution are stable for channel rates up to 2 MHz

### TIMING RESOLUTION (DIGITAL CFD) VS ADC NUMBER OF BITS

- In order to minimize dead-time, ADC number of bits can be reduced: factor 2 for 10 bits (800 ns), 4 for 9 bits (400 ns), 8 for 8 bits (200 ns), 16 for 7 bits (100 ns).
- Looking at the effect of the ADC number of bits on time resolution...
- Signal amplitude is the key element in this case: time resolution degrades for small signals since quantization noise becomes dominant
  - There is **very small loss** in performance between 11, 10 and 9-bit modes.
  - Where quantization noise dominates, other methods than dCFD can be used ...
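The conversion times quoted above are consistent with a Wilkinson ramp that lasts 2^n periods of the 1.3 GHz counter clock. The sketch below assumes exactly that relation and ignores any fixed overhead; the function name is illustrative.

```python
def wilkinson_conversion_time_ns(n_bits, clock_ghz=1.3):
    """Full-scale conversion time for an n-bit Wilkinson ADC, in ns."""
    return 2 ** n_bits / clock_ghz


for bits in (11, 10, 9, 8, 7):
    print(bits, "bits:", round(wilkinson_conversion_time_ns(bits)), "ns")
```

Each bit removed halves the ramp duration, which is where the factors 2, 4, 8 and 16 come from.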



No degradation on timing for pulses above 100mV for 8 bits & 50mV for 9 bits

### TIMING RESOLUTION VS AMPLITUDE & RISETIME 1-NS FWHM - 15 NS DELAY, DIGITAL CFD ALGORITHM



- 2 zones: sampling jitter or S/N limited zones.
- TDR < 8 ps rms for pulse amplitudes > 100mV
- TDR < 20 ps rms for pulse amplitudes > 40 mV
- Can be improved by using more samples (if feasible and uncorrelated) since dCFD uses only 2 samples
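To illustrate why dCFD relies on only 2 samples, here is a minimal sketch of a digital constant fraction discriminator for positive pulses: it finds the peak, sets the threshold at a fraction of it, and interpolates linearly between the two samples that bracket the threshold on the leading edge. This is a generic textbook version, not the algorithm implemented in the SAMPIC software; names and defaults are illustrative.

```python
def dcfd_time(samples, step_ns, fraction=0.5, baseline=0.0):
    """Time (ns, counted from sample 0) at which the leading edge crosses
    fraction * peak amplitude, by linear interpolation between 2 samples."""
    values = [s - baseline for s in samples]
    peak_index = max(range(len(values)), key=values.__getitem__)
    threshold = fraction * values[peak_index]

    # Walk back from the peak to the last sample still below threshold
    i = peak_index
    while i > 0 and values[i - 1] >= threshold:
        i -= 1
    if i == 0:
        raise ValueError("leading edge is not contained in the waveform")

    low, high = values[i - 1], values[i]
    return (i - 1 + (threshold - low) / (high - low)) * step_ns


# Toy pulse sampled at 6.4 GS/s (156.25 ps step)
t_cross = dcfd_time([0.0, 0.0, 0.2, 0.6, 1.0, 0.8, 0.4], step_ns=0.15625)
```

Because only the two bracketing samples enter the result, their noise translates directly into time jitter, which is the S/N limited zone mentioned above.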

### TAKING DATA WITH DETECTORS

- SAMPIC modules are already used with different detectors on **test benches or test beams.** Unfortunately, very little public data available until now ...
- Tested with **PMTs, MCP-PMTs, APDs, SiPMs, fast Silicon Detectors, Diamonds**: performances are equivalent to those with high-end oscilloscopes
- Different R&Ds ongoing with the **TOF-PET** community (CERN, ...)
- SAMPIC has been used for test beams of TOTEM and SHIP at CERN
- It is also used for fast mesh-APD characterization and test beams
- **TOTEM** is currently developing a CMS-compatible motherboard housing SAMPIC mezzanines
- **SHIP** is testing SAMPIC for its fast timing detector. SAMPIC option is described in the technical proposal for the fast timing detector and calorimeter (two-gain version)
- SAMPIC is in use at Giessen for **PANDA EndCap DIRC** characterization.
- It will soon be used for ATLAS HGTD

#### NIM PAPER RECENTLY SUBMITTED

#### Nuclear Instruments and Methods in Physics Research A 835 (2016) 51-60



#### Measurements of timing resolution of ultra-fast silicon detectors with the SAMPIC waveform digitizer



D. Breton<sup>a</sup>, V. De Cacqueray<sup>b,1</sup>, E. Delagnes<sup>b</sup>, H. Grabas<sup>c</sup>, J. Maalmi<sup>a</sup>, N. Minafra<sup>d,e,2</sup>, C. Royon<sup>f</sup>, M. Saimpert<sup>b,\*</sup>

<sup>a</sup> CNRS/IN2P3/LAL Orsay, Université Paris-Saclay, F-91898 Orsay, France

<sup>b</sup> IRFU, CEA, Université Paris-Saclay, F-91191 Gif-sur-Yvette, France

<sup>c</sup> Santa Cruz Institute for Particle Physics UC Santa Cruz, CA 95064, USA

<sup>d</sup> Dipartimento Interateneo di Fisica di Bari, Bari, Italy

<sup>e</sup> CERN, Geneva, Switzerland

<sup>f</sup> University of Kansas, Lawrence, USA

#### ARTICLE INFO

Article history: Received 8 April 2016; received in revised form 1 August 2016; accepted 7 August 2016; available online 10 August 2016.

Keywords: ASIC; Time-of-flight; Time to digital converter; Waveform sampling; Time resolution; Silicon detector

#### ABSTRACT

The SAMpler for PICosecond time (SAMPIC) chip has been designed by a collaboration including CEA/ IRFU/SEDI, Saclay and CNRS/LAL/SERDI, Orsay. It benefits from both the quick response of a time to digital converter and the versatility of a waveform digitizer to perform accurate timing measurements. Thanks to the sampled signals, smart algorithms making best use of the pulse shape can be used to improve time resolution. A software framework has been developed to analyse the SAMPIC output data and extract timing information by using either a constant fraction discriminator or a fast cross-correlation algorithm. SAMPIC timing capabilities together with the software framework have been tested using pulses generated by a signal generator or by a silicon detector illuminated by a pulsed infrared laser. Under these ideal experimental conditions, the SAMPIC chip has proven to be capable of timing resolutions down to 4 ps with synthesized signals and 40 ps with silicon detector signals.

© 2016 Elsevier B.V. All rights reserved.

#### http://arxiv.org/abs/1604.02385

### **RECENT DEVELOPMENTS**

- Intermediate version of the chip submitted in November 2015
  - Improved DLL and buffer design (especially for 10GS/s sampling)
  - Nb of bits for coarse timestamp => 16 bits
  - Improved "central trigger" (multiplicity of 2 & OR) with possibility of common deadtime
  - Posttrig now runs over the full sampling window
  - All DACs necessary for controlling the chip have been integrated
  - ADC resolution internally selectable between 7 and 11 bits
  - Integrated TOT measurement
  - "Ping-Pong" (toggling) mode: channels work in pairs. As soon as one channel has fired, the second one takes over. If the same signal is sent to both channels, this permits reducing the instantaneous dead-time to a few ns.
  - **Translator input block** to deal with any digital signal (unipolar or differential)
- New version of the module motherboard with new features
- New versions of the daughterboards adapted to the new chips
- Constant improvements of Firmware and DAQ software
  - Embedded firmware CFD extraction is being studied
- 64-channel board and 256-channel (512?) mini-crate under study.

### **TRANSLATOR INPUT BLOCK**

- Goals: performing a precise timing of the leading edge of digital unipolar or differential signal and measuring its width (TOT) without adding dead-time
- Digital input signal (LVDS down to SLVS or any unipolar) is translated to analog and adapted to the dynamic range of the sampling cell (has to be low power!)
- Fixed amplitude => only a few samples (ROI) and fast conversion (  $\leq 8$  bits)
- Gives the possibility of an autonomous time calibration for integrated systems



### **TOT MEASUREMENT**

- This addresses the need for measuring the width of signals longer than the sampling window (which corresponds roughly to one main clock period: 6 ns @ 10 GS/s to 40 ns @ 1.6 GS/s)
- Solution: addition of a 65<sup>th</sup> cell which specifically measures the « Time Over Threshold » of the input signal thanks to a dedicated ramp generator. The result is simply converted by the ADC in parallel with all the waveform samples.



Measurement ranges between 4 and 300 ns.
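The window lengths quoted above follow from the 64-cell depth of the SCA divided by the sampling frequency. The sketch below shows that arithmetic and a simple check of whether a pulse is too wide for the waveform alone; the helper names are illustrative assumptions.

```python
def sampling_window_ns(sampling_gsps, n_cells=64):
    """Duration covered by the analog memory, in ns."""
    return n_cells / sampling_gsps


def exceeds_window(pulse_width_ns, sampling_gsps, n_cells=64):
    """True when the pulse is wider than the sampling window, i.e. when
    its width cannot be read from the waveform samples alone."""
    return pulse_width_ns > sampling_window_ns(sampling_gsps, n_cells)


window_slow = sampling_window_ns(1.6)    # 40 ns
window_fast = sampling_window_ns(10.2)   # about 6.3 ns
```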

### **NEW DAUGHTERBOARDS DEVELOPMENT**



- New mezzanine cards have been developed for housing the new versions of the chip (including the digital differential option)
  - 1. Analog/digital input with MCX
  - 2. Analog/digital input with flat cable
  - 3. Differential digital input with flat differential cable
- Adaptors have also been developed





### PRELIMINARY RESULTS OF NEW BLOCKS



### **TRIGGER OPTIONS**

- One very low power signal discriminator/channel
- One 10-bit DAC/channel to set the threshold (which can be external)
- Several trigger modes programmable for each channel:



## PROGRAMMABLE DELAY WITH STEP CORRELATED TO SAMPLING FREQUENCY

We need asynchronous delays relative to sampling frequency (gate, posttrig, ...)

- Constraint: small layout footprint (4 blocks per channel), low power
- ⇒ Delay Locked Loop based on a ramp monostable. This permits delivering a simple servo-controlled ramp voltage to the whole chip
- ⇒ Slave blocks with programmable slope



### WILKINSON ADC WITH NEW AUTO-CONVERSION MODE



- When triggered, each channel launches its auto-conversion.
- When the ramp starts, the value of the continuously running counter is sampled in a dedicated channel register (ramp offset).
- When the ramp crosses the cell voltage => the current value of the counter is stored in the cell register.
- As soon as all discriminators of the channel have fired, Analog to Digital conversion of the channel is over => optimization of dead time
- During readout, the ramp offset is read before the channel waveform samples.

In "auto-conversion" mode, the ramp offset will be subtracted from the value of the waveform samples.
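The auto-conversion sequence can be sketched as a small behavioural model. It assumes a free-running counter with the same width as the ADC, a perfectly linear ramp and wrap-around subtraction of the ramp offset; these are simplifying assumptions for illustration, not a description of the actual circuit.

```python
def auto_convert(cell_voltages_mv, counter_at_ramp_start,
                 n_bits=11, range_mv=1000.0):
    """Behavioural model of one channel conversion.

    Returns (ramp_offset, raw_codes, corrected_codes, counts_needed).
    """
    modulo = 2 ** n_bits
    lsb_mv = range_mv / modulo

    # Counter value latched in the channel register when the ramp starts
    ramp_offset = counter_at_ramp_start % modulo

    raw_codes = []
    for voltage in cell_voltages_mv:
        counts_to_crossing = int(voltage / lsb_mv)   # ramp reaches the cell voltage
        raw_codes.append((counter_at_ramp_start + counts_to_crossing) % modulo)

    # Offset subtraction, modulo the counter range
    corrected = [(code - ramp_offset) % modulo for code in raw_codes]

    # Conversion is over once the highest cell voltage has been crossed
    counts_needed = max(int(v / lsb_mv) for v in cell_voltages_mv)
    return ramp_offset, raw_codes, corrected, counts_needed


offset, raw, codes, counts = auto_convert([100.0, 250.0, 500.0], 2000)
```

In this model a channel holding only small signals finishes its conversion early, which is the dead time optimization mentioned above.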

### **NEW FEATURES IN NEXT VERSION**

- A new version of SAMPIC will be submitted in November.
- Pin to pin compatible with the already existing mezzanines: "only" firmware and software modifications.
- **Redesign of central trigger** to reduce its time response
- **Redesign of translator input block** to improve the shape of the signal
- Channel chaining option: user-defined sets of channels can be chained in time. This permits either increasing the sampling depth of a single channel or studying correlated events on different channels.
- Saturation was not clean in the former versions. It will now happen in a clean way, whatever the conversion mode, thus permitting an easier (wider) dynamic range definition.
- Auto-calibration (ADC and Time INL): a dedicated (DAC + buffer) and a high frequency signal source are implemented in the chip in order to perform both calibrations in standalone.
- Wide range DLL: 3 different sizes of starving transistors can be selected in the main DLL in order to optimize its INL and jitter depending on the chosen sampling frequency

### CONCLUSION

SAMPIC has been considerably enriched since November 2014, becoming an actual integrated system, set up through 26 16-bit registers:

- An intermediate version permitted testing many new functionalities:
  - Corrected 10GS/s with new buffers
  - All DACs and current sources integrated
  - New central trigger
  - New posttrig, sampling frequency dependent
  - TOT measurement
  - Auto-conversion
  - Ping-Pong
  - Translator input block
- A new version will be submitted in November, including:
  - Channel chaining
  - Autonomous calibrations (ADC and time INL, TOT)
  - Clean saturation
  - Wide range DLL
- Users feedback was extremely valuable to define these functionalities.
- Now we should be able to reconcentrate on the ps !
- Next possible step: increasing the output data flow ...





- Why the hedgehog ?
- SAMPIC (Sampling Analog Memory for PICosecond timing)



### SAMPIC: PERFORMANCE SUMMARY

| Parameter                                             | Value                               | Unit     |
|-------------------------------------------------------|-------------------------------------|----------|
| Technology                                            | AMS CMOS 0.18µm                     |          |
| Number of channels                                    | 16                                  |          |
| Power consumption (max)                               | 180 (1.8V supply)                   | mW       |
| Discriminator noise                                   | 2                                   | mV rms   |
| SCA depth                                             | 64                                  | Cells    |
| Sampling speed                                        | 1 to 8.4 (10.2 for 8 channels only) | GSPS     |
| Bandwidth                                             | 1.6                                 | GHz      |
| Range (unipolar)                                      | ~ 1                                 | V        |
| ADC resolution                                        | 7 to 11 (trade-off time/resolution) | bits     |
| SCA noise                                             | < 1                                 | mV rms   |
| Dynamic range                                         | > 10                                | bits rms |
| Conversion time                                       | 0.1 (7 bits) to 1.6 (11 bits)       | μs       |
| Readout time / ch @ 2 Gbit/s (full waveform)          | 450                                 | ns       |
| Single Pulse Time precision before correction         | < 15                                | ps rms   |
| Single Pulse Time precision after time INL correction | < 3.5                               | ps rms   |

### **BACKUP SLIDES**

# OUR FORMER DEVELOPMENTS OF ANALOG MEMORIES FOR WAVEFORM DIGITIZING

Figure: HAMAC (1998-2002, DMILL), MATACQ (2000-2003, CMOS 0.8µm)

- We have been designing analog memories since 1992 => first prototype (PIPELINE V1) of the SCA for the ATLAS LARG calorimeter. 80,000 HAMAC chips (2002) are on duty at the LHC.
- Since 2002, 3 new generations of fast samplers: ARS, MATACQ, SAM => more than **30,000 chips in use**.
- Our favourite structure is a sampling matrix.
- A few ps time resolution was demonstrated at system level (up to 64 channels) with SAMLONG, but deadtime can be a limitation.



### THE SAMPIC PROJECT

- Generic R&D funded by "P2IO Labex" grant
- Initially intended as a common prototype ASIC for high precision time of flight measurement (5 ps rms) in ATLAS AFP and SuperB FTOF



- Goals for the first prototype (SAMPIC0, received in June 2013):
  - Validation of the Waveform TDC structure
  - Evaluation of AMS 0.18μm technology for mixed design
  - Design of a multichannel chip usable in a real environment
     => connected to detector with a real readout and DAQ system
- Core of a future "dead-time free" chip

### WHY AMS 0.18µm?

- Based on IBM 0.18µm : IBM quality & documentation
- Good Standard Cells Library
- Good lifetime foreseen (HV module, automotive)
- 1.8V power supply: nice for analog design/ high dynamic range
- Reasonable leakages
- Good noise properties ( already checked with IdefX chips for CdTe)
- Reasonable radiation hardness
- Less complex (and less expensive) than IBM 0.13μm
- AMS high quality Design Kit
- Easy access (CMP, Europractice, AMS)
- Very low cost

### SAMPIC0: XTALK MEASUREMENT

- 800mV, 1ns FWHM, 300ps risetime and falltime injected on **channel 7(blue)**
- Signal measured on the other channels
- Xtalk looks like the derivative of the injected signal and decreases with the distance to the injection channel
- Xtalk signal is bipolar with ~ equal positive and negative lobe
- Similar plot, but shifted if injection in another channel (red)





### **READOUT PHILOSOPHY**

- Readout driven by Read and Rck signals => controlled by FPGA
- Data is read **channel by channel** as soon as it is available
- Rotating **priority mechanism** to avoid reading always the same channel at high rate
- Optional Region Of Interest readout to reduce the dead time (nb of cells read can be chosen dynamically)
- Readout of converted data through a 12-bit parallel LVDS bus including:
  - Channel Identifier, Timestamps, Trigger Cell Index
  - The cells (all or a selected set) of a given channel sent sequentially
  - Standard readout at 2 Gbits/s
  - => Rate > 2 Mevts/s (full waveform)
  - Channel is not in deadtime during readout, only during conversion (data register is really a buffer stage)
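A rotating priority scheme of the kind mentioned above can be sketched as a round-robin search that starts just after the channel served last. This is a generic illustration; the function name and the exact arbitration rule are assumptions, not the logic implemented in the chip.

```python
def next_channel(ready, last_served, n_channels=16):
    """Return the next channel to read, or None if no data is ready.

    ready       : sequence of booleans, one per channel
    last_served : index of the channel read most recently
    """
    for offset in range(1, n_channels + 1):
        channel = (last_served + offset) % n_channels
        if ready[channel]:
            return channel
    return None


ready = [False] * 16
ready[3] = ready[7] = True
first = next_channel(ready, last_served=3)    # channel 7 goes before channel 3
second = next_channel(ready, last_served=7)   # then channel 3 gets its turn
```

The channel that was just read gets the lowest priority, so a single busy channel cannot monopolize the bus.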



### **CALIBRATION PHILOSOPHY**

- SCAs-based chips exhibit reproducible non-idealities which can be easily corrected after calibration:
  - The goal is to find the set with the **best performance/complexity ratio**.
  - But also to find the right set for the **highest level of performance**.
- SAMPIC actually offers very good performance with only two types of simple calibrations :
  - Amplitude: cell pedestal and gain (linear or parabolic fit) => DC ramp
  - Time: INL (one offset per cell) => use of a simple sinewave (see backup)
  - This leads to a limited volume of standard calibration data (4 to 6 Bytes/cell/sampling frequency => 5 to 8 kBytes/chip/sampling frequency)
     => can be stored in the on-board EEPROM (1Mbit).
- These simple corrections could even be applied in the FPGA.
- Highest level calibrations permit debugging the chip and pushing the performance to its limit (still unknown).
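As an illustration of the amplitude calibration, the sketch below fits a straight line to the codes recorded by one cell for several DC levels and then applies the resulting pedestal and gain. It covers only the linear case, and all names are illustrative; it is not the code of the SAMPIC acquisition software.

```python
def fit_cell(levels_mv, codes):
    """Least-squares line through (level, code) points for one cell.

    Returns (pedestal in ADC counts, gain in mV per ADC count).
    """
    n = len(levels_mv)
    mean_x = sum(levels_mv) / n
    mean_y = sum(codes) / n
    sxx = sum((x - mean_x) ** 2 for x in levels_mv)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(levels_mv, codes))
    slope = sxy / sxx                       # ADC counts per mV
    pedestal = mean_y - slope * mean_x      # code at 0 mV
    return pedestal, 1.0 / slope


def correct_amplitude(codes, pedestals, gains_mv_per_count):
    """Apply the per-cell pedestal and gain to one waveform (result in mV)."""
    return [(c - p) * g for c, p, g in zip(codes, pedestals, gains_mv_per_count)]


# DC ramp seen by one cell with pedestal 100 counts and 2 counts per mV
pedestal, gain = fit_cell([0, 250, 500, 750], [100, 600, 1100, 1600])
```

Two numbers per cell are enough for this correction, which is in line with the small calibration volume quoted above.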

### **TIMING NON-LINEARITIES**

- Dispersion of single delays => time DNL
- **Cumulative effect** => **time INL**. Gets worse with delay line length.
- Systematic & fixed effect => non equidistant samples => Time Base Distortion

If we can measure it => we can correct it !

But calibration and even more correction have to remain "simple".



### TIME INL CALIBRATION AND CORRECTION



Method we introduced in 2009 and used since for our analog memories, assuming that a sinewave is nearly linear in its zero crossing region: much more precise than statistical distribution

- Search for zero-crossing segments of a free running asynchronous sine wave => length[position]
- Calculate the average amplitude of the zero-crossing segments for each cell.
- Renormalize (divide by the average amplitude over all the cells and multiply by the clock period/number of DLL steps) => time duration of each step = "time DNL"
- Integrate this plot ⇒ Fixed Pattern Jitter = correction to apply to the time of each sample = "time INL"

#### Time INL correction:

- Simple addition on  $T_{\text{sample}}$
- Also permits the calculation of real equidistant samples by interpolation or digital filtering.
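The calibration steps above can be sketched on simulated data. The code below generates waveforms of a free running sine wave sampled with unequal steps, averages the amplitude of the zero-crossing segments for each step, renormalizes to the nominal step and integrates the result. It is a simplified model (no noise, steps taken between consecutive samples of one window, illustrative names), not the actual calibration code.

```python
import math
import random


def estimate_time_steps(events, nominal_step_ns):
    """Estimate the duration of each sampling step from sine wave events."""
    n_steps = len(events[0]) - 1
    sums = [0.0] * n_steps
    counts = [0] * n_steps
    for wave in events:
        for i in range(n_steps):
            a, b = wave[i], wave[i + 1]
            if a * b < 0:                  # this segment crosses zero
                sums[i] += abs(b - a)      # segment "length" in amplitude
                counts[i] += 1
    if 0 in counts:
        raise ValueError("not enough events to cover every step")
    mean_len = [s / c for s, c in zip(sums, counts)]
    average = sum(mean_len) / n_steps
    # Renormalize so that the average step equals the nominal step
    return [nominal_step_ns * m / average for m in mean_len]


def time_inl(steps_ns, nominal_step_ns):
    """Integrate the step deviations: one time correction per sample."""
    inl, total = [0.0], 0.0
    for step in steps_ns:
        total += step - nominal_step_ns
        inl.append(total)
    return inl


def simulate_events(true_steps_ns, n_events, sine_mhz=100.0, seed=1):
    """Sample a unit sine wave with random phase at the true sample times."""
    rng = random.Random(seed)
    omega = 2 * math.pi * sine_mhz * 1e-3      # rad/ns
    times = [0.0]
    for step in true_steps_ns:
        times.append(times[-1] + step)
    events = []
    for _ in range(n_events):
        phase = rng.uniform(0.0, 2 * math.pi)
        events.append([math.sin(omega * t + phase) for t in times])
    return events
```

A possible use is `steps = estimate_time_steps(simulate_events(true_steps, 3000), 0.15625)` followed by `time_inl(steps, 0.15625)`. The sine frequency is kept low enough for the wave to stay nearly linear over one step, which is the assumption the method relies on.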

### $\Delta T$ RESOLUTION VS DELAY



### EXPLORING LARGER DELAYS: TOWARD AN « ABSOLUTE » TIME MEASUREMENT

- Now we use 2 channels of a TEK AFG 3252 arbitrary waveform generator and program their relative delay (10-ps steps)
- Slower than the previous generator (2.5ns risetime min)
- TEK AFG 3252 is specified for an absolute delay precision of a few 10 ps and a jitter of 100 ps
- => Measurements are clearly MUCH better



TDR is < 10 ps rms, even for delays up to 10  $\mu$ s => 1-ppm RESOLUTION

Difference between AFG programmed delay and measured value is < +/-15ps