# A 64-channel ASIC for TOFPET applications

Manuel D Rolo, Ricardo Bugalho, Fernando Gonçalves, Angelo Rivetti, Giovanni Mazza, José C Silva, Rui Silva and João Varela

Abstract: A 64-channel ASIC for TOF-PET imaging is presented. The circuit provides time and energy measurements of events produced by a SiPM coupled to an L(Y)SO fast scintillator. The ASIC is developed in the framework of the EndoTOFPET-US collaboration as an option for the readout of the external 200x200 mm plate detector, which consists of 3x3x15 mm crystals and SiPMs with a 3x3 mm active area. The chip can also be used with non-segmented and/or higher light yield crystals, and with photodetectors of different gain, polarity, or even higher dark count rate. The targeted 200 ps timing resolution for the system and the need for low power consumption have driven the choice of a closed-loop amplifier input stage and a TDC with 50 ps time binning based on analogue interpolation. A power consumption between 5 and 10 mW per channel is expected to guarantee an SNR of at least 20 dB for the single photon, using a SiPM with 320 pF terminal capacitance.

*Index Terms*—Time-of-flight, PET, Medical Imaging, SiPM, IC Readout Electronics.

## I. INTRODUCTION

Time-of-flight information in PET systems allows for unprecedented sensitivity and spatial resolution, as the signal-to-noise ratio, and thus the background rejection, is significantly improved. The very high gain of the silicon photomultiplier (SiPM) and its sensitivity to single photon hits make it a good candidate for highly compact systems. The EndoTOFPET-US project [1] aims at a 200 ps coincidence timing resolution, which is enough to confine the positron annihilation coordinate along the line-of-response (LOR) with a FWHM position uncertainty of $\approx 30\ \mathrm{mm}$, on a dual-head PET detector. Achieving this fine resolution calls for fast front-end electronics, capable of extracting a very precise time stamp for each event.
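As a quick check of the figure quoted above, the FWHM position uncertainty along the LOR follows from $\Delta x = c\,\Delta t / 2$, where $\Delta t$ is the coincidence timing resolution. The short Python sketch below evaluates it for the 200 ps target; the function name is illustrative and not part of any project software.

```python
# Speed of light in vacuum, expressed in millimetres per picosecond.
C_MM_PER_PS = 0.299792458


def lor_position_fwhm_mm(ctr_fwhm_ps: float) -> float:
    """Position uncertainty (FWHM, mm) along the LOR for a given
    coincidence timing resolution (FWHM, ps): dx = c * dt / 2."""
    return C_MM_PER_PS * ctr_fwhm_ps / 2.0


print(f"{lor_position_fwhm_mm(200.0):.1f} mm")  # prints "30.0 mm"
```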

Scintillation light statistics, which include intrinsic timing characteristics of the crystal and the travel path of the photons, along with the signal transit time spread in the SiPM, may become a source of jitter that could ultimately compromise the targeted time resolution. In fact, the signal shape fluctuation at the output of the photodetector reflects the statistical time distribution of each photon building up the signal. Since the arrival time of these photons is weakly correlated to the time of the decay, the readout system must be able to trigger on the first photo-electron. This ability requires low-noise front-end electronics with enough bandwidth such that the time walk across the dynamic range becomes negligible.

Manuscript received November 16, 2012. The research leading to these results has received funding from the European Union Seventh Framework Programme [FP7/2007-2013] under Grant Agreement n<sup>o</sup> 256984

M. D. Rolo, R. Bugalho, J. C. Silva, R. Silva and J. Varela are with LIP - Lab. Instrumentação e Física Experimental de Partículas, Lisboa, Portugal

F. Gonçalves is with INESC-ID - Inst. de Engenharia de Sistemas e Computadores, Lisboa, Portugal

A. Rivetti and G. Mazza are with INFN - Istituto Nazionale di Fisica Nucleare, Torino, Italia

R. Bugalho and J. Varela are also with IST UTL - Instituto Superior Técnico, Lisboa, Portugal

M. D. Rolo is also with Università degli Studi di Torino, Torino, Italia. E-mail: mrolo@lip.pt

Fig. 1. Readout chain of the external PET plate is based on LYSO crystals and SiPMs - customized readout ASICs developed

On the other hand, the design of compact PET detectors poses strict limits on power consumption. This constraint has motivated the choice of a low-power input stage and a very low-power time-to-digital converter with a time binning of 50 ps.

Dedicated ASICs have been developed as options for the readout chain of the external PET plate (Fig. 1).

## II. READOUT CHANNEL ARCHITECTURE

The readout architecture of the TOFPET ASIC is based on a dual-threshold scheme applied to two replicas of the amplified input signal.



Fig. 2. Schematic of the TOFPET channel: a 50 ps resolution time stamp and the ToT data for every valid event are buffered and collected by the chip global controller.

The channel architecture comprises two independent branches with trigger thresholds that can be set separately. This allows a very low threshold to generate a trigger on the first photoelectron (p.e.), which is used for an accurate measurement of the time $t_0$ (Fig. 3). A higher threshold is then used to validate the pulse; if it is not crossed, the event is discarded. If the event is of interest, the time stamp $t_2$ is latched with a second TDC and a valid event flag is issued.

The mechanism uses TDCs based on analogue time interpolation, which can provide good timing accuracy without requiring an implementation in a very deep sub-micron technology. This option results in a very low power consumption compared with a DLL-based converter. The inherently lower speed of these TDCs is compensated by using two TDCs per channel, which independently detect the timing of the rising and the falling edges of the signal. Since only valid events are read out, the amount of data to be processed by the chip back-end is dramatically decreased compared to an approach in which false hits would be discarded based on the calculated value of the time-over-threshold (ToT).
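The dual-threshold validation described above can be illustrated with a small behavioural model operating on a sampled waveform. This is only a sketch under simplifying assumptions (uniform sampling, no interpolation between samples, positive pulses, re-arming once the signal drops below the low threshold); the function and parameter names are hypothetical and it does not reproduce the on-chip control logic.

```python
from typing import List, Sequence, Tuple


def find_valid_events(samples: Sequence[float], dt_ps: float,
                      vth_t: float, vth_e: float) -> List[Tuple[float, float]]:
    """Return (t0, t2) pairs for pulses that cross both thresholds.

    t0: time of the rising crossing of the low (timing) threshold vth_t.
    t2: time of the falling crossing of the high (energy) threshold vth_e.
    Pulses that never reach vth_e are discarded, as dark counts would be.
    """
    events = []
    t0 = None          # pending low-threshold trigger time
    validated = False  # True once the pulse has exceeded vth_e
    armed = True       # re-armed when the signal is below vth_t
    for i, v in enumerate(samples):
        t = i * dt_ps
        if t0 is None:
            if v < vth_t:
                armed = True
            elif armed:
                t0, armed, validated = t, False, v >= vth_e
        elif not validated:
            if v >= vth_e:
                validated = True
            elif v < vth_t:
                t0, armed = None, True   # never validated: discard
        elif v < vth_e:
            events.append((t0, t))       # falling edge across vth_e
            t0, validated = None, False
    return events


# A small pulse (discarded) followed by a large one (kept).
wave = [0, 1, 0, 0, 1, 3, 6, 8, 6, 3, 1, 0]
print(find_valid_events(wave, dt_ps=100.0, vth_t=0.5, vth_e=5.0))
# prints [(400.0, 900.0)]
```

In this model the difference between the two returned times plays the role of the time-over-threshold, while invalid pulses produce no output at all, which mirrors the reduction in back-end data volume mentioned in the text.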



Fig. 3. Simulation result including statistical information of the scintillation light generation and a 5 MHz dark count rate (DCR) of a $3 \times 3\ \mathrm{mm}^2$ SiPM

#### A. Front-end

The choice of the very front-end architecture is based on the requirements for low power (mandatory for highly integrated detectors) and low input impedance (due to the use of devices with terminal capacitance up to 320 pF). Given the total capacitance of the device $C_d$, and with $R_{in}C_{in}$ being the time constant of the amplifier input node alone, the time constant $\tau_{in} = R_{in}(C_d + C_{in})$ is expected to set the dominant pole of the amplifier. A high DC input resistance therefore results in a reduced bandwidth and consequently in a degraded rise time of the signal.

Moreover, since the current into (or sourced from) the preamplifier is given by Eq. 1,

$$I_{in}(s) = \frac{1}{1 + s\tau_{in}} I_d(s) \tag{1}$$

then the voltage variation caused by the input resistance is (Eq. 2):

$$R_{in} = \frac{\Delta V_{in}}{\Delta I_{in}} \Leftrightarrow \Delta V_{in} = R_{in} \Delta I_{in} \tag{2}$$

This means that the capacitive coupling between the input channels in highly integrated chips can cause significant crosstalk due to the voltage bounce $\Delta V_{in}$ at the firing input. Additionally, because the SiPM gain changes dramatically with the applied over-voltage, it is very undesirable to add a transient to the HV bias due to this bounce of the input node. The solution is to add a stage capable of conveying the input current from a low-impedance input into a high-impedance output port.
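To give a feeling for the numbers, the sketch below evaluates $\tau_{in}$, the corresponding single-pole bandwidth and 10-90% rise time, and the bounce of Eq. 2. Only the 320 pF terminal capacitance comes from the text; the values of $R_{in}$, $C_{in}$ and $\Delta I_{in}$ are placeholder assumptions chosen for illustration, not design values of the ASIC.

```python
import math


def input_node(r_in_ohm: float, c_d_farad: float, c_in_farad: float):
    """Return (tau_in, f_3dB, 10-90% rise time) of the single-pole input node."""
    tau = r_in_ohm * (c_d_farad + c_in_farad)  # tau_in = R_in (C_d + C_in)
    f_3db = 1.0 / (2.0 * math.pi * tau)        # pole frequency of Eq. 1
    t_rise = math.log(9.0) * tau               # 10-90% rise of a single pole
    return tau, f_3db, t_rise


def input_bounce(r_in_ohm: float, delta_i_amp: float) -> float:
    """Voltage bounce at the input node, Eq. 2: dV_in = R_in * dI_in."""
    return r_in_ohm * delta_i_amp


C_D = 320e-12    # SiPM terminal capacitance quoted in the text
C_IN = 1e-12     # assumed amplifier input capacitance (placeholder)
DELTA_I = 1e-3   # assumed input current step (placeholder)

for r_in in (10.0, 50.0):  # assumed input resistances (placeholders)
    tau, f_3db, t_rise = input_node(r_in, C_D, C_IN)
    bounce = input_bounce(r_in, DELTA_I)
    print(f"R_in = {r_in:4.0f} ohm: tau = {tau * 1e9:.2f} ns, "
          f"f_3dB = {f_3db / 1e6:.1f} MHz, t_rise = {t_rise * 1e9:.2f} ns, "
          f"bounce = {bounce * 1e3:.0f} mV")
```

Both the rise time and the bounce scale linearly with $R_{in}$, which is why a low-impedance current-conveying input is attractive for large-capacitance devices.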

A regulated gate cascode (RGC) input stage [2] is used as a current conveyor, which decouples the high input capacitance from the transimpedance transfer function. A small-signal analysis of an RGC input stage can be found in [3]. The regulation loop is implemented with a differential pair with active load. Although this limits the bandwidth when compared to a solution with a simple common-source stage, it makes it easy to control the input DC baseline over a range of around 500 mV. Such a feature is of value if one wants to accurately adjust the gain of the SiPM. The input DC baseline is trimmed with a 6-bit current-mode DAC (cf. Fig. 4).



Fig. 4. Abstract of the front-end circuit blocks.

Both hole-collection and electron-collection inputs are available, allowing the choice of different photo-devices. The signal is conveyed into two post-amplifiers with a transimpedance gain of up to 4 k$\Omega$. A high-impedance node for the biasing of the post-amplifier is implemented with two back-to-back FETs in the cutoff region. The AC coupling proves to be robust across PVT corners, and the baseline drift with a 40 kHz maximum charge input stimulus is acceptable. One of the branches implements an optional RC stage, in order to provide a suitably shaped version of the signal for the ToT measurement. Two voltage-mode discriminators, with thresholds defined by independent 6-bit DACs, generate the digital signals that are fed to the TDCs.

The total output rms noise voltage is kept below 5 mV, which allows the threshold $V_{th_T}$ to be lowered to the level of 0.5 p.e. Figure 5 shows the result of a transient noise analysis when the input device has a 9 $\mathrm{mm}^2$ active area (total terminal capacitance around 320 pF).



Fig. 5. Schematic-level transient noise analysis for a single photoelectron, when the device terminal capacitance is 320 pF.

The ToT curve versus the input charge is non-linear with this front-end topology, and thus the measurement of the ToT with point sources of known energy is necessary. The obtained characteristic can then be correlated with the curve generated by the internal calibration mechanism, which will allow on-system functional tests of the chip. The calibration circuitry is illustrated in Fig. 6. A 180 pF capacitor, distributed among the 64 channels, is used for the injection of a programmable current generated at chip level. An appropriate derivative function at channel level roughly emulates the LYSO decay time constant.
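Because the ToT response is non-linear, an off-line conversion from ToT to energy needs a measured calibration curve. A minimal sketch of one possible approach, piecewise-linear interpolation between calibration points, is shown below. The calibration pairs are placeholders invented purely to exercise the code; they are not measured data, and the function name is hypothetical.

```python
from bisect import bisect_right
from typing import List, Tuple


def tot_to_energy(tot_ns: float, calib: List[Tuple[float, float]]) -> float:
    """Piecewise-linear interpolation of energy versus ToT.

    calib: at least two (tot_ns, energy_kev) pairs with strictly
    increasing ToT values, e.g. obtained with known point sources.
    """
    tots = [point[0] for point in calib]
    if not tots[0] <= tot_ns <= tots[-1]:
        raise ValueError("ToT outside the calibrated range")
    # Index of the upper point of the segment containing tot_ns.
    i = min(bisect_right(tots, tot_ns), len(calib) - 1)
    (t_lo, e_lo), (t_hi, e_hi) = calib[i - 1], calib[i]
    return e_lo + (e_hi - e_lo) * (tot_ns - t_lo) / (t_hi - t_lo)


# Placeholder calibration points (NOT measured values).
CALIB = [(100.0, 100.0), (200.0, 300.0), (300.0, 700.0)]
print(tot_to_energy(150.0, CALIB))  # prints 200.0
print(tot_to_energy(250.0, CALIB))  # prints 500.0
```

The function refuses to extrapolate outside the calibrated range, since the non-linear characteristic gives no guarantee about the behaviour beyond the measured points.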



Fig. 6. Depiction of the calibration mechanism for the front-end.

#### B. Time-to-digital Conversion

Time and energy information extraction is based on a dual-threshold scheme. A low $V_{th_T}$ (down to 0.5 p.e.) is used for the trigger (on the rising edge), while event validation and ToT information are provided on the falling edge of the signal pulse by a higher $V_{th_E}$. Each time stamp consists of a 10-bit coarse value, latched from the global Gray-encoded coarse time counter, and an 8-bit fine time measurement derived from a TDC with 50 ps time binning. This fine time stamp is a direct measure of the phase of the asynchronous pulse with respect to the master clock, derived by a set of time-to-analogue converters (TACs) and an ADC. The principle of operation of such a time multiplication with de-randomization is described in [4]. An implementation in a deep sub-micron CMOS technology with a power consumption of 1 mW per channel has been reported in [5], [6]. The inherent time needed to perform the charge-to-voltage conversion is masked by a multi-stage buffer, which de-randomizes the incoming event rate.
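The structure of a time stamp can be made concrete with a short sketch that decodes the Gray-encoded coarse counter and combines it with the fine code. The 10-bit and 8-bit widths and the 50 ps binning come from the text, and the 6.25 ns clock period follows from the 160 MHz clock quoted in Section III. How the fine code is combined with the coarse count (its sign and offset) depends on the TAC implementation, so the simple sum used here is an assumption made only for illustration.

```python
T_CLK_PS = 6250.0  # period of the 160 MHz master clock
T_BIN_PS = 50.0    # fine time binning


def gray_to_binary(gray: int) -> int:
    """Decode a reflected-binary Gray code value."""
    binary = 0
    while gray:
        binary ^= gray
        gray >>= 1
    return binary


def timestamp_ps(coarse_gray: int, fine_code: int) -> float:
    """Combine a 10-bit Gray coarse value with an 8-bit fine code.

    ASSUMPTION: the fine code counts 50 ps bins elapsed after the
    latched clock edge; the real convention may differ.
    """
    coarse = gray_to_binary(coarse_gray & 0x3FF)
    return coarse * T_CLK_PS + (fine_code & 0xFF) * T_BIN_PS


print(timestamp_ps(0b110, 10))  # Gray 110 is count 4: prints 25500.0
```

Under these numbers one clock period spans 125 fine bins, and the 10-bit coarse counter rolls over every 1024 clock cycles (6.4 µs), so longer time spans must be tracked elsewhere in the readout chain.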



Fig. 7. The multi-buffer interpolator-based mixed-mode TDC concept.

The TDCs are calibrated by adjusting the charge/discharge current matching on a per-channel basis, equalizing the dynamic range of the conversion time with internal DACs. This process uses a test pulse generated either internally by the chip global controller, or off-chip by the front-end board FPGA. The external LVDS test pulse input is also used for sweeping the trigger phase with respect to the system clock, which is needed for measuring the INL and DNL of the TDC.
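One common way to extract DNL and INL from such a phase sweep is a code-density test: if the trigger phases are spread uniformly over the measured range, every TDC bin should collect the same number of hits, and deviations reveal the bin-width errors. The sketch below assumes that uniformity and a histogram restricted to the codes actually exercised; it illustrates the general technique and is not necessarily the procedure used for this chip.

```python
from itertools import accumulate
from typing import List, Sequence, Tuple


def dnl_inl(hist: Sequence[int]) -> Tuple[List[float], List[float]]:
    """Code-density estimate of DNL and INL, both in LSB.

    hist[k] is the number of hits recorded in TDC code k for a phase
    sweep that is uniform over the covered range.
    """
    total = sum(hist)
    if not hist or total == 0:
        raise ValueError("empty histogram")
    ideal = total / len(hist)                # expected hits per bin
    dnl = [h / ideal - 1.0 for h in hist]    # relative bin-width error
    inl = list(accumulate(dnl))              # running sum of the DNL
    return dnl, inl


dnl, inl = dnl_inl([90, 110, 100, 100])
print([round(x, 3) for x in dnl])
print([round(x, 3) for x in inl])
```

A bin that is narrower than nominal collects fewer hits and shows a negative DNL; the INL accumulates these errors and returns to zero at the last code by construction.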

The operation of the two independent TDCs is controlled by a dedicated on-channel control block. Besides managing the analogue switching circuitry, it implements the multi-buffering scheme, interfaces with the global controller and performs event validation. The signal produced by a dark count is of the same order of magnitude as the signal produced by a single photon hit, so the discriminators are set to trigger on every spurious event. Three alternative mechanisms for dark pulse rejection were implemented, based on synchronous and asynchronous validation of a higher threshold trigger (energy trigger).

For each valid event, the channel register holds a 50 ps resolution measurement for the two time stamps of interest, packed into a 50-bit word. In order to study the linearity of the TACs, an identification tag of the written buffer is added. The methods used to reject the SiPM dark counts also provide a measure of the DCR and identify the situation where an incorrect time measurement is latched due to a quasi-contemporaneous sequence of a dark pulse and a valid event.

Figure 8 shows a transistor-level simulation of the complete channel, where the input is the result of Monte-Carlo evaluation of the LYSO crystal and SiPM light and e-h generation statistics.

## III. BACK-END

Fig. 8. Transistor-level simulation of the mixed-mode TDC: the arrival time fluctuations due to the scintillation and SiPM e-h generation statistics are emulated using Geant4 to generate a test vector suitable to be used by a Spectre simulator.

Fig. 9. Chip architecture.

Fig. 10. The 128-channel system-in-a-package: a pad-free edge allows a second (rotated) chip to be abutted. Golden reference generators, bias cells and calibration circuitry are placed on the periphery of the analogue macroblock (facing channels 0 and 63).

The TOFPET ASIC is made up of 64 channels, bias and calibration blocks and a global controller. Nominal operation mode uses a 160 MHz clock generated off-chip. Up to two LVDS data output links are available (SDR or DDR), for a total bandwidth from 160 to 640 Mbit/s. An output clock for synchronous transmission is available, while a TX training mode can also be used to avoid it. Event data is processed on-chip and output in frames with up to 96 events per frame. A raw data mode (safe-mode) is also available, in which event data is output with no arithmetic processing (thereby taking two "slots"). A 10 MHz SPI configuration interface writes and reads the channel configuration, and controls calibration procedures and test modes. The clock, reset and coarse time vector are internally propagated by the global controller to each channel, along with the configuration settings.
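The quoted bandwidth range follows directly from the 160 MHz clock and the link options. The small sketch below enumerates the four configurations, assuming each link transfers one bit per clock cycle in SDR mode and two in DDR mode; the function name is illustrative.

```python
def link_bandwidth_mbps(n_links: int, ddr: bool, f_clk_mhz: float = 160.0) -> float:
    """Raw output bandwidth in Mbit/s for 1 or 2 LVDS links.

    ASSUMPTION: one bit per link per clock cycle in SDR mode,
    two bits per link per clock cycle in DDR mode.
    """
    if n_links not in (1, 2):
        raise ValueError("the chip provides up to two output links")
    bits_per_cycle = 2 if ddr else 1
    return n_links * bits_per_cycle * f_clk_mhz


for n_links in (1, 2):
    for ddr in (False, True):
        mode = "DDR" if ddr else "SDR"
        print(f"{n_links} link(s), {mode}: "
              f"{link_bandwidth_mbps(n_links, ddr):.0f} Mbit/s")
```

The two extremes, one SDR link and two DDR links, reproduce the 160 and 640 Mbit/s limits given in the text.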

One edge is pad-free, to allow two twin chips to be abutted in a 128-channel BGA package. Figure 10 illustrates this arrangement, where the top and bottom edges of the SiP accommodate the 128 channel inputs. The power network, test probes, bias input and LVDS data, clock and configuration IOs are distributed along the left and right pad ring segments.

The compact 128-channel 7x7 mm SiP is packaged into a 17x17x1.70 mm BGA case. Besides its small size, the BGA solution minimizes the inductance of the input signal traces, which is known to deteriorate the timing performance of a SiPM readout chip and to increase crosstalk due to inductive coupling between neighboring channels.

## IV. CONCLUSION

We describe the architecture and design concepts of a 64-channel readout IC for SiPMs. It performs timing measurements with 50 ps binning and energy calculation based on a time-over-threshold method. A dual-threshold strategy is used for hit validation and dark count rejection, while the chip trigger can be set down to 0.5 photoelectrons. The ASIC is optimized for TOF-PET applications with segmented scintillators, but its versatility enables its use in different applications and with a wide range of silicon detectors. The low power consumption of the highly compact readout IC/SiP presented is a key advantage for its integration in dense systems with very stringent cooling requirements.

The 25 $\mathrm{mm}^2$ 64-channel ASIC is designed in a 0.13 $\mu\mathrm{m}$ CMOS technology, and first silicon test results are expected during the 1<sup>st</sup> quarter of 2013.

## ACKNOWLEDGMENTS

The research leading to these results has received funding from the European Union Seventh Framework Programme [FP7/2007-2013] under Grant Agreement no. 256984.

Ricardo Bugalho is supported by FCT grant SFRH/BD/66008/2009.

The authors wish to thank Richard Wheadon and Luca Toscano for precious help and technical discussions, and Francesco Pennazio for the scintillation statistical data used as input vector for many simulations.

## REFERENCES

- [1] Erika Garutti, EndoTOFPET-US a novel multimodal tool for Endoscopy and positron emission tomography, 2012 IEEE NSS/MIC Conf. Record.
- [2] E. Sackinger, W. Guggenbuhl, A High-Swing, High-Impedance MOS Cascode Circuit, IEEE JSSC, vol. 25, no. 1, pp. 289-298, February 1990
- [3] M D Rolo et al., A low-noise CMOS front-end for TOF-PET, 2011 JINST 6 P09003

- [4] Andrew E. Stevens, Richard P. Van Berg, Jan Van Der Spiegel and Hugh H. Williams, A Time-to-Voltage Converter and Analog Memory for Colliding Beam Detectors, IEEE JSSC vol 24, no 6, 1989
- [5] A. Rivetti et al., A Pixel Front-End ASIC in 0.13μm CMOS for the NA62 Experiment with on Pixel 100 ps Time-to-Digital Converter, 2009 IEEE NSS/MIC Conf. Record.
- [6] A. Rivetti et al., Experimental results from a pixel front-end for the NA62 experiment with on pixel constant fraction discriminator and 100 ps Time to Digital Converter, 2010 IEEE NSS/MIC Conf. Record.