

The 13th International Conference on Position Sensitive Detectors



St. Catherine's College, Oxford, September 3-8, 2023

# A 4-Gbps serializer circuit in a 180 nm technology for monolithic pixel sensor prototypes developed for the CEPC vertex detector

Xiaoting Li<sup>1,2</sup>, Wei Wei<sup>1,2,\*</sup>, Ying Zhang<sup>1,2</sup>, Tianya Wu<sup>1,2,3</sup> and Ping Yang<sup>4</sup>

<sup>1</sup>Institute of High Energy Physics Chinese Academy of Sciences, <sup>2</sup>State Key Laboratory of Particle Detection and Electronics, <sup>3</sup>School of Physical Sciences, University of Chinese Academy of Sciences, <sup>4</sup>Central China Normal University

lixt@ihep.ac.cn weiw@ihep.ac.cn



## Introduction

- Monolithic CMOS pixel sensors (CPS) are among the promising candidates for the Circular Electron Positron Collider (CEPC) vertex detector because they offer a good trade-off among granularity, readout speed, material budget and power consumption. A full-scale TaichuPix chip, comprising a matrix of 512 × 1024 pixels with a pixel size of 25 × 25 μm<sup>2</sup> (total chip area about 4.06 cm<sup>2</sup>), has been developed to provide a spatial resolution better than 5 μm.
- The chip requires a CMOS image sensor technology, and the current developments are based on a 180-nm process, which limits the high-speed interface design. To keep the material budget low, the TaichuPix chip requires a power density below 200 mW/cm<sup>2</sup> (a total current of about 451 mA), and the serializer current should be less than 100 mA.
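The quoted current budget follows from the power density limit and the chip area. The short sketch below reproduces that arithmetic; the 1.8-V supply voltage is an assumption (it is the supply quoted for the tests on this poster, but it is not stated alongside the budget itself), and the function name is only illustrative.

```python
# Power budget check for the TaichuPix chip, using the figures quoted above.
POWER_DENSITY_MW_PER_CM2 = 200.0  # upper limit on the power density
CHIP_AREA_CM2 = 4.06              # total chip area
SUPPLY_V = 1.8                    # ASSUMPTION: nominal supply of the 180-nm design


def chip_current_budget_ma(density_mw_cm2, area_cm2, supply_v):
    """Return the total current budget in mA implied by a power density limit."""
    total_power_mw = density_mw_cm2 * area_cm2
    return total_power_mw / supply_v


budget_ma = chip_current_budget_ma(POWER_DENSITY_MW_PER_CM2, CHIP_AREA_CM2, SUPPLY_V)
serializer_share = 100.0 / budget_ma  # the 100-mA serializer limit as a fraction
print(f"chip budget: {budget_ma:.0f} mA, serializer share: {serializer_share:.0%}")
```

Under this assumption the serializer is allowed a little over a fifth of the total chip current.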

*Fig.1. The block diagram of the full-scale TaichuPix chip on an engineering run* 

#### Overall designs

- The 20:1 serializer (Ser20t1) is composed of a ring-oscillator-based phase-locked loop (RO-PLL) and a 20-to-1 multiplexer with 20 LVDS data input pairs (RXs). The RO-PLL delivers a 2-GHz clock to the MUX20t1 module after a duty cycle correction (DCC) circuit. The high-speed clocks, as well as the reference and test clocks, are transmitted differentially to suppress noise. The simulated total current at 4 Gbps is about 71 mA.
- The 40:1 serializer (Ser40t1) has a similar structure, but it uses an LC-based PLL with a narrower frequency range and better phase noise performance, and a 40-to-1 multiplexer without external data inputs because of the limited IO resources. The Ser40t1 can be configured to work in full-speed or half-speed mode (4 or 2 Gbps). The simulated total current at 4 Gbps is about 82 mA.

#### RO-PLL

- The RO-VCO consists of three pseudo-differential delay cells and a buffer, achieving a free running frequency range (FTR) from 0.34 to 3.12 GHz at the typical corner.
- The 1-MHz-offset phase noise (PN-1MHz) at 2 GHz is about -103 dBc/Hz.
- The loop bandwidth (LBW) can be configured by 3 bits, over a range from 0.5 to 2.9 MHz.
- The feedback chain includes a divide-by-4 CML divider and a divide-by-25 D-flip-flop divider.
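As a quick consistency check, the two dividers in the feedback chain give a total division ratio of 100, so a 2-GHz VCO clock is divided down to 20 MHz. The sketch below works this out; treating the 2-GHz clock as driving two serial bits per cycle (one per clock edge) is an inference from the 2-GHz clock and the 4-Gbps data rate, not something the poster states explicitly.

```python
# Feedback divider arithmetic for the PLL (all frequencies in Hz).
CML_DIVIDER = 4    # first divider in the feedback chain
DFF_DIVIDER = 25   # second divider in the feedback chain
BITS_PER_CLOCK_CYCLE = 2  # ASSUMPTION: one bit per clock edge in the last 2:1 stage


def feedback_clock_hz(vco_hz):
    """Frequency at the end of the feedback chain for a given VCO frequency."""
    return vco_hz // (CML_DIVIDER * DFF_DIVIDER)


def serial_rate_bps(vco_hz):
    """Serial data rate if every VCO cycle carries BITS_PER_CLOCK_CYCLE bits."""
    return vco_hz * BITS_PER_CLOCK_CYCLE


vco_hz = 2_000_000_000
print(feedback_clock_hz(vco_hz), serial_rate_bps(vco_hz))
```

The 20-MHz result matches the frequency of the test clock (TestCK) used in the jitter measurements.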

- The data frame consists of 32 bits at 120 Mbps each, leading to a total raw data rate of 3.84 Gbps. In one of the previous small-scale prototypes (a 32-to-1 serializer), the highest serial data rate was measured to be 3.36 Gbps, with a peak-to-peak jitter of about 150 ps and a large current consumption. Achieving a 4-Gbps data rate in the 180-nm technology is therefore challenging.
- In addition, the previous serializer with a 32-bit data input is not suitable for an 8B10B encoder. Therefore, two 4-Gbps serializer prototypes (20:1 and 40:1) have been designed and optimized to meet these requirements. To limit funding and time costs, they use the same 180-nm process node as the TaichuPix.
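The numbers in these two points can be checked with a few lines of arithmetic. The sketch below assumes that the mismatch with the 8B10B encoder comes from the 10-bit width of 8B10B code words (a general property of that code, not a statement taken from the poster); the helper names are illustrative.

```python
# Data-rate and word-width checks for the serializer requirements.
def raw_rate_mbps(bits_per_frame, bit_rate_mbps):
    """Total raw data rate when each frame bit arrives at bit_rate_mbps."""
    return bits_per_frame * bit_rate_mbps


def fits_8b10b(input_width_bits):
    """True if the parallel input width is a whole number of 10-bit code words."""
    return input_width_bits % 10 == 0


def unit_interval_ps(rate_gbps):
    """Duration of one serial bit in picoseconds."""
    return 1000.0 / rate_gbps


print(raw_rate_mbps(32, 120))                      # raw rate in Mbps
print({w: fits_8b10b(w) for w in (20, 32, 40)})    # which widths hold whole code words
print(150.0 / unit_interval_ps(3.36))              # p-p jitter as a fraction of the bit period
```

With these assumptions, the 150-ps peak-to-peak jitter of the earlier prototype occupies about half of a bit period at 3.36 Gbps, which illustrates why a lower-jitter design was needed for 4 Gbps.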

## **Circuit design and performance**



*Fig.2. Schemes of the serializer designs*



#### MUX20t1

The multiplexer consists of three stages. The first stage comprises four 5:1 units based on a shift-register chain. PCK is a 200-MHz sampling clock for all input data. LOAD is a differential 200-MHz clock that loads the 20-bit data concurrently into the internal DFFs at its high level and shifts out the high-order data at its low level. To save power, the four units share one pair of LOAD signals.

Both the 4:2 and 2:1 stages use the same 2:1 unit (A) based on a binary-tree structure, operating at 1-GHz and 2-GHz clocks respectively.
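The three-stage structure described above can be illustrated with a small behavioral model. This is only a sketch: the assignment of input bits to the four 5:1 units and the interleaving order of the 2:1 units are assumptions chosen so that the bits leave in index order, and the real circuit may order them differently.

```python
# Behavioral sketch of a 20:1 multiplexer built from 5:1, 4:2 and 2:1 stages.
def mux2(a, b):
    """Interleave two equal-length streams as a0, b0, a1, b1, ..."""
    out = []
    for x, y in zip(a, b):
        out += [x, y]
    return out


def serialize_20to1(word):
    """Serialize one 20-bit word (list of bits); bit 0 is sent first."""
    assert len(word) == 20
    # Stage 1: four 5:1 units. ASSUMPTION: unit k holds bits k, k+4, ..., k+16.
    units = [word[k::4] for k in range(4)]
    # Stage 2 (4:2): two 2:1 units, each merging two of the stage-1 streams.
    stream_a = mux2(units[0], units[2])
    stream_b = mux2(units[1], units[3])
    # Stage 3 (2:1): final merge into the serial stream.
    return mux2(stream_a, stream_b)


def stage_rates_mbps(word_rate_mhz=200):
    """Per-stream bit rate after each stage for a 200-MHz word clock."""
    after_5to1 = word_rate_mhz * 5
    after_4to2 = after_5to1 * 2
    after_2to1 = after_4to2 * 2
    return after_5to1, after_4to2, after_2to1


print(serialize_20to1(list(range(20))))
print(stage_rates_mbps())
```

Using bit indices as input makes the ordering visible, and the per-stage rates of 1, 2 and 4 Gbps line up with the 200-MHz, 1-GHz and 2-GHz clocks named in the text if each 2:1 unit selects one input per clock phase.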



Fig.4. Microphotograph of the prototype design  $(3 \times 3 \text{ mm}^2)$ 

#### MUX40t1

- Unlike the MUX20t1, the MUX40t1 includes four stages: eight 5:1 units, an 8:4, a 4:1 and a 2:1 unit.
- The last 2:1 unit uses a different version (B), which replaces the S2D of version A with a DCC module to correct the clock duty cycle after transmission over a long chain (distance).

#### LC-PLL

- The LC-PLL is similar to the RO-PLL; the main difference is the VCO, which is based on an LC tank (a three-terminal inductor and two symmetric NMOS varactors).
- Two stages of RC low-pass filters are used to suppress noise from the bias circuit.
- FTR: 1.8~2.3 GHz
- PN-1MHz: -118 dBc/Hz @ 2 GHz
- LBW: 0.22~1.3 MHz
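For an ideal LC tank the oscillation frequency is f = 1/(2π√(LC)), so the quoted tuning range translates directly into the capacitance ratio the varactors must cover. The sketch below shows this relationship; the inductance value is purely illustrative (the poster gives no component values), and parasitics and losses are ignored.

```python
import math


def lc_resonance_hz(inductance_h, capacitance_f):
    """Resonance frequency of an ideal LC tank."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


def tank_capacitance_f(freq_hz, inductance_h):
    """Tank capacitance needed to resonate at freq_hz with a fixed inductor."""
    return 1.0 / ((2.0 * math.pi * freq_hz) ** 2 * inductance_h)


def required_cap_ratio(f_min_hz, f_max_hz):
    """C_max / C_min needed to tune from f_min to f_max with a fixed inductor."""
    return (f_max_hz / f_min_hz) ** 2


L_EXAMPLE_H = 1e-9  # ASSUMPTION: illustrative 1-nH inductor, not a design value
print(required_cap_ratio(1.8e9, 2.3e9))
print(tank_capacitance_f(2.0e9, L_EXAMPLE_H))
```

Covering 1.8~2.3 GHz requires a total tank capacitance ratio of roughly 1.6, which is modest compared with the wide range of the ring oscillator and is consistent with the narrower frequency range traded for better phase noise.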



## **Preliminary tests and results**

**Tests of the RO-PLL**

- Frequency locking range (FLR): 0.32~2.95 GHz, which is slightly lower than the post-layout simulation.
- Three methods were used to characterize the RMS jitter of the clock signal: TIE tests, random jitter (Rj), and σ values calculated by the SDA II tool of the 16-GHz oscilloscope.
- The best random jitter of the 20-MHz test clock (TestCK) is about 1.1 ps at the lowest LBW of about 0.5 MHz.

Fig.6. Tests of the frequency locking range (up) and RMS jitter (down) of the RO-PLL @1.8V

*Fig.7. FLR tests of the LC-PLL* @2.6V

**Tests of the Ser40t1**

- The preliminary tests showed that the LC-PLL worked at a power supply higher than 2 V.
- The problem might lie in the bias circuits, which were also used in the CML driver, resulting in an abnormally small driving current and leading to no effective signal output at the 1.8-V power supply.

The measured Rj is about **0.7 ps** operating at 2 GHz.

The preliminary tests verified that both serializers operated correctly at 4 Gbps (PRBS 2<sup>7</sup> data pattern). Debugging and performance evaluation still need more tests.
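For readers unfamiliar with the test pattern, a PRBS of order 7 can be generated with a 7-bit linear-feedback shift register. The sketch below assumes the commonly used polynomial x<sup>7</sup> + x<sup>6</sup> + 1; the poster does not state which polynomial or seed was used in the measurements.

```python
# Minimal PRBS7 generator. ASSUMPTION: polynomial x^7 + x^6 + 1, non-zero seed.
def prbs7(nbits, seed=0x7F):
    """Return nbits of a PRBS7 sequence as a list of 0/1 values."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero")
    out = []
    for _ in range(nbits):
        # Feedback is the XOR of register bits 7 and 6 (bit indices 6 and 5).
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        out.append(new_bit)
        state = ((state << 1) | new_bit) & 0x7F
    return out


sequence = prbs7(254)
print(len(sequence), sum(sequence[:127]))
```

The sequence repeats every 127 bits and contains 64 ones per period, giving a nearly balanced pattern with runs of up to seven identical bits.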

Fig.8. Tests of the transient serial output (up) and eye-diagram (down) of the Ser20t1 @2.6V

## **Conclusion and outlook**

Two 4-Gbps serializer prototype circuits have been designed for fast readout of the CEPC vertex detector. The preliminary tests showed low jitter of the RO-PLL, and the LC-PLL featured better jitter performance, setting aside the bias problem for now. Both serializer circuits functioned correctly. When the power supply was raised to 2.6 V, the total current of the CML driver increased to about 27 mA (the design value is about 31 mA), and the serial output signal became large enough to be resolved. More tests, including failure analysis, will be carried out to locate the issues, which will be fixed in the next design iteration.

### Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (No. 12005245, No. 12075100 and No. 11775244). It was also partially funded by the National Key Research and Development Program of China under Grant No. 2018YFA0404302 from the Ministry of Science and Technology, the Scientific Instrument Developing Project of the Chinese Academy of Sciences (No. ZDKYYQ20200007), the Youth Innovation Promotion Association of the Chinese Academy of Sciences (No. Y201905), and the State Key Laboratory of Particle Detection and Electronics (No. SKLPDE-ZZ-201915 and No. SKLPDE-ZZ-202315).