# **lpGBT** design, status and plans

Szymon KULIS, CERN on behalf of the lpGBT design team

ACES 2018 - Sixth Common ATLAS CMS Electronics Workshop for LHC Upgrades 24-26 April 2018 CERN, Geneva, Switzerland

# lpGBT Design Team

#### Design team:

- CERN: David Porret, Jose Fonseca, Ken Wyllie, Paulo Moreira, Pedro Leitao, Rui Francisco, Sophie Baron, Szymon Kulis, Daniel Hernandez
- AGH UST: Marek Idzik, Miroslaw Firlej, Jakub Moroń, Tomasz Fiutowski, Krzysztof Swientek
- KU Leuven: Bram Faes, Jeffrey Prinzie, Paul Leroux
- UNL FCT: João Carvalho, Nuno Paulino
- SMU Physics: Datao Gong, Di Guo, Dongxu Yang, Jingbo Ye, Quan Sun, Wei Zhou
- SMU Engineering: Tao Zhang, Ping Gui

#### IP Cores:

- Czech Technical University Prague: Miroslav Havranek, Tomas Benka
- CERN: Stefano Michelis, Iraklis Kremastiotis, Alessandro Caratelli

# lpGBT: High Speed SerDes

- Data transceiver with fixed and "deterministic" latency both for up and down links.
- Down link:
  - 2.56 Gb/s, FEC12
  - E-links outputs:
    - Up to 16
    - Data rates : 80/160/320 Mb/s
  - E-link clocks outputs:
    - Up to 28
    - Clock frequencies: 40/80/160/320/640/1280 MHz
  - Phase aligned clocks 4 channels
    - 50 ps resolution
    - Clock frequencies: 40/80/160/320/640/1280 MHz
- Up link:
  - 5.12 Gb/s or 10.24 Gb/s, FEC5 or FEC12
  - E-Links inputs:
    - Up to 28
    - Data rates 160 /320 / 640 / 1280 Mb/s
- Power dissipation:
  - Target: ≤ 500 mW @ 5.12 Gb/s
  - Target: ≤ 750 mW @ 10.24 Gb/s
- Compact package
- Radiation tolerance:
  - 200 Mrad
  - SEU robust



#### Agenda

- lpGBT architecture
- Architecture and performance of (selected) blocks
- Design status
- Package / PCB

## lpGBT: Block Diagram (simplified)



ACES, April 2018, Geneva

szymon.kulis@cern.ch

## Foreword: Transmission line, Equalization and Pre-Emphasis

- Bandwidth of transmission lines used in HEP is limited because of material budget constraints
- Both pre-emphasis and Equalization are attempts to restore the baseband signal spectrum
- Pre-emphasis (done at the transmitter):
  - Tries to generate a wave shape with an "exaggerated" spectral contents at the frequencies that are most attenuated by the channel [typically the high frequencies]
  - <u>No degradation of the SNR</u>
- Equalization (done at the receiver):
  - Amplifies the high-frequency content of the spectrum more than the low-frequency content. The aim is a combined channel-plus-equalizer response that approaches a channel with no inter-symbol interference
  - <u>Degradation of the SNR</u>
- Because of the SNR degradation resulting from equalization, the best approach is to <u>combine pre-emphasis and equalization</u>:
  - Enhance the transmitted high frequency content rather than do all the high frequency peaking at the receiver side
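
To make the pre-emphasis idea concrete, here is a minimal sketch of a transmitter that overdrives the line on bit transitions only. It is an illustration of the principle, not the lpGBT circuit: the function name, the ±1 signal levels and the boost factor `alpha` are all assumptions made for the example.

```python
def pre_emphasize(bits, alpha=0.25):
    """Map bits to +/-1 levels and add an extra kick on every transition.

    alpha is a hypothetical boost factor: a 0->1 or 1->0 transition is
    driven with amplitude 1 + 2*alpha, a repeated bit with amplitude 1.
    """
    levels = [1.0 if b else -1.0 for b in bits]
    out = []
    prev = levels[0]  # assume the line already sits at the first level
    for x in levels:
        # (x - prev) is zero for repeated bits and +/-2 on a transition,
        # so only the high-frequency part of the pattern is exaggerated.
        out.append(x + alpha * (x - prev))
        prev = x
    return out


print(pre_emphasize([0, 0, 1, 1, 0]))
```

Repeated bits keep the nominal amplitude, while each transition is transmitted with a larger swing, which is the "exaggerated" high-frequency content described above.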





#### lpGBT: Down link: Line receiver



### lpGBT: Down link: Line receiver architecture (@2.56Gbps)



# lpGBT: Down link: Eye Opening Monitor architecture

- Goal: Monitor the opening of the received data eye diagram and tune the equalizer's settings
- Provides an "eye diagram picture" by using a "signal-scan" approach
- The scan is performed across the time (x-axis) and across the amplitude (y-axis), yielding a "signal density" per point
  - The input signal [data] is compared with a reference voltage " $V_{of}$ "
  - The comparator's result is sampled by the rising edge of a clock synchronous to the incoming data
  - The sampled result drives a ripple counter to accumulate statistics
  - The counter is enabled for a well defined period.




## lpGBT: Down link: Eye Opening Monitor

- Y-axis:
  - 5 bit resistive DAC
  - 31 points, step = ~20 mV (covers from  $V_{DD}/2$  up to  $V_{DD}$ )
- X-axis:
  - Phase interpolator uses 5.12 GHz clock to generate in-phase I and quadrature Q clocks at 2.56GHz
  - 64 points, step = ~6.1 ps in typical
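
The scan grid can be checked with a few lines of arithmetic, and the "signal density" accumulation can be sketched as a nested loop. This is a rough model only: the supply voltage `VDD_V = 1.2` is an assumption (the slides quote about 20 mV per step without stating the supply here), and `sample` is a hypothetical callback standing in for the comparator and sampling flip-flop.

```python
BIT_RATE_HZ = 2.56e9   # down-link line rate (from the slides)
X_POINTS = 64          # phase steps across the eye (from the slides)
Y_POINTS = 31          # threshold steps (from the slides)
VDD_V = 1.2            # assumption: nominal supply voltage


def scan_steps():
    """Return (time step in ps, voltage step in mV) of the scan grid."""
    bit_period_ps = 1e12 / BIT_RATE_HZ
    x_step_ps = bit_period_ps / X_POINTS
    y_step_mv = (VDD_V / 2) * 1e3 / Y_POINTS
    return x_step_ps, y_step_mv


def eye_scan(sample, window=1000):
    """Count comparator hits at every (threshold, phase) point.

    `sample(x, y)` is a hypothetical callback returning True when the
    sampled comparator output is high for phase step x and threshold y.
    `window` plays the role of the fixed counter-enable period.
    """
    return [
        [sum(1 for _ in range(window) if sample(x, y)) for x in range(X_POINTS)]
        for y in range(Y_POINTS)
    ]


x_step, y_step = scan_steps()
print(f"x step = {x_step:.2f} ps, y step = {y_step:.1f} mV")
```

With these numbers, 64 phase steps of about 6.1 ps span one bit period at 2.56 Gb/s.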






#### lpGBT: clocks manager




# Low jitter PLL: architecture matters



- Two PLLs working at 2.5 GHz, same circuits (power dissipation, loop dynamics) except the VCO:
  - LC VCO
  - Ring oscillator VCO
- Pre/post-rad jitter (rms):
  - LC: 0.3 / 1.0 ps
  - RO: 5.6 / 22 ps
- 600 Mrad + Annealing:
  - LC: $\Delta f < 5\%$ (Passives set the frequency)
  - RO: $\Delta f < 40\%$ (Actives set the frequency)



#### Low jitter PLL: architecture matters

- Heavy ion testing:
  - LET: 3.2 to 69.2 MeV·cm<sup>2</sup>/mg
- LC oscillator displays a significantly higher sensitivity than the ring oscillator!
  - Contrary to expectations!
- SEU Phase jumps (unlock):
  - Ring oscillator: both polarities
  - LC: Mainly positive
- Two-Photon Absorption (TPA) laser tests point to the VARACTOR as the main culprit!
  - Total cross section of the LC oscillator is 4 × 10<sup>-5</sup> cm<sup>2</sup>, of which 70% is contributed by the varactor area!






# Low jitter PLL: topology matters

- A new design prototyped to test the hypothesis:
  - Smaller varactor area
  - Different frequency tuning topology:
    - Grounded vs floating well!





# lpGBT: clocks manager



- Two modes of operation:
  - PLL
  - CDR
    - Locking with external reference
    - Reference-less locking (using data)


Generated clocks:

- 40 MHz / 80 MHz / 160 MHz / 320 MHz / 640 MHz / 1.28 GHz / 2.56 GHz / 5.12 GHz (in phase!)
- + inverted clocks
- + triplicated clocks (A/B/C)

#### lpGBT: Down link: eTx - eLink Driver



## lpGBT: Down link: eTx - eLink Driver

#### Specs

- Data rate: Up to 1.28 Gb/s
- Clock frequency: Up to 1.28 GHz
- Driving current: 1 to 4 mA in 0.5 mA steps
- Receiving end termination: 100 Ω
- Voltage amplitude : 200 mV to 800 mV (DIFF PP amplitude in 100  $\Omega$ )
- Common mode voltage: 600 mV
- Pre-emphasis:
  - Driving current: 1 to 4 mA in 0.5 mA steps
  - Pulse width:
    - Self timed: 120 ps to 960 ps in steps of 120 ps
    - Clock timed: T<sub>bit</sub> / 2
- Design driven by radiation tolerance considerations:
  - The circuit relies on having the resistors setting the current and not the transistors (Poly resistors are insensitive to TID)
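
The driver settings listed above can be enumerated and cross-checked with a short sketch. It assumes a current-steering output in which the programmed current flows through the 100 Ω termination in one direction or the other, so the differential peak-to-peak amplitude is 2·I·R. That reading is an assumption, chosen because it reproduces the quoted 200 mV to 800 mV range; the function names are illustrative.

```python
TERMINATION_OHM = 100.0  # receiving-end termination (from the slides)


def drive_currents_ma():
    """Programmable driving currents: 1 to 4 mA in 0.5 mA steps."""
    return [1.0 + 0.5 * i for i in range(7)]


def diff_pp_amplitude_mv(current_ma, r_ohm=TERMINATION_OHM):
    """Differential peak-to-peak amplitude, assuming a swing of +/- I*R.

    mA times ohm gives mV directly.
    """
    return 2.0 * current_ma * r_ohm


def self_timed_widths_ps():
    """Self-timed pre-emphasis pulse widths: 120 to 960 ps in 120 ps steps."""
    return list(range(120, 961, 120))


for i_ma in drive_currents_ma():
    print(f"{i_ma:.1f} mA -> {diff_pp_amplitude_mv(i_ma):.0f} mV diff. p-p")
```

Under this assumption there are seven current settings and eight self-timed pulse widths.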



#### lpGBT: Down link: eTx – Pre-Emphasis





# lpGBT: Up link: eRx - eLink Receiver



- Specs
  - Data rate: Up to 1.28 Gb/s
  - Optional termination:  $100 \Omega$
  - Optional bias generator for AC coupled signals (VDD/2)
  - Common mode voltage range: 70 mV to 1.13 V
  - Differential input voltage: 140 mV to 450 mV
  - Passive equalizer:
    - Optimized for cables with bandwidths of 448, 299 and 224 MHz
    - Attenuates low frequency signals



- Simulation conditions:
  - 70 MHz channel
  - Data rate: 1.28 Gb/s
  - Signal amplitude: 200 mV
  - Process: TT, V<sub>DD</sub> = 1.2 V, T = 25 °C

## lpGBT: Up link: Phase Alignment





- The lpGBT is the clock source to the front-end modules:
  - All the clocks generated by the lpGBT are synchronous with the LHC machine clock, so the lpGBT "knows" exactly the frequency of the incoming data. A CDR circuit is therefore not needed for each ePort.
- The phase of the incoming data signals is "unknown" in relation to the internal sampling clock!
- There are up to 28 eLink inputs (potentially) all with random phase offsets
- The solution:
  - "Measure" the phase offset of each eLink input
  - Delay individually each incoming bit stream to phase align it with the internal sampling clock
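
One way to picture the "measure, then delay" idea is to look at where data transitions fall across a set of delay taps and pick the tap farthest from them. The sketch below does exactly that on a circular list of per-tap edge counts. It is a hypothetical illustration, not the lpGBT tracking logic, which these slides do not describe; the function name and the edge-count representation are assumptions.

```python
def best_sampling_tap(edge_counts):
    """Pick the delay tap farthest (circularly) from any observed edge.

    edge_counts[i] is the number of data transitions seen near tap i;
    the taps are assumed to cover exactly one bit period, so tap 0
    follows the last tap.
    """
    n = len(edge_counts)
    edge_taps = [i for i, count in enumerate(edge_counts) if count > 0]
    if not edge_taps:
        return 0  # no transitions observed: every tap is equally good

    def distance_to_edges(tap):
        # Circular distance to the nearest tap where edges were seen.
        return min(min((tap - e) % n, (e - tap) % n) for e in edge_taps)

    return max(range(n), key=distance_to_edges)


# Edges cluster around taps 7, 0 and 1, so the eye centre is near tap 4.
print(best_sampling_tap([5, 3, 0, 0, 0, 0, 0, 1]))
```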

### lpGBT: Up link: Phase aligner - architecture



Building blocks:

- Reference DLL tunes control voltage for delay elements
- "Open loop" delay lines for data channels

Three modes of operation:

- Automatic phase tracking: tracks any phase drifts during operation
- Static phase selection: requires the operator to select the proper phase (reduced power consumption)
- Training with learned static phase: combines the two modes above

#### lpGBT: Data encoding




# lpGBT: Data encoding

- The lpGBT supports the following data rates:
  - Down link: 2.56 Gb/s
  - Up-link: 5.12 / 10.24 Gb/s
- In all cases data is transmitted as a frame composed of:
  - Header
  - The data field
  - A forward error correction field: FEC5 / FEC12
- The data field is scrambled to allow for CDR at no [additional] bandwidth penalty
- Efficiency = # data bits/# frame bits



|                   | Down-link 2.56 Gb/s | Up-link 5.12 Gb/s, FEC5 | Up-link 5.12 Gb/s, FEC12 | Up-link 10.24 Gb/s, FEC5 | Up-link 10.24 Gb/s, FEC12 |
|-------------------|---------------------|-------------------------|--------------------------|--------------------------|---------------------------|
| Frame [bits]      | 64                  | 128                     | 128                      | 256                      | 256                       |
| Header [bits]     | 4                   | 2                       | 2                        | 2                        | 2                         |
| Data [bits]       | 36                  | 116                     | 102                      | 232                      | 204                       |
| FEC [bits]        | 24                  | 10                      | 24                       | 20                       | 48                        |
| Correction [bits] | 12                  | 5                       | 12                       | 10                       | 24                        |
| Efficiency        | 56%                 | 91%                     | 80%                      | 91%                      | 80%                       |
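
The efficiency row follows directly from the definition above (data bits divided by frame bits). The short check below recomputes it from the table values and also confirms that every frame format corresponds to one frame per 40 MHz clock period. The dictionary layout and key names are illustrative choices.

```python
# (line rate in Mb/s, frame bits, data bits, FEC bits, correctable bits)
FRAMES = {
    "down 2.56 Gb/s":        (2560, 64, 36, 24, 12),
    "up 5.12 Gb/s, FEC5":    (5120, 128, 116, 10, 5),
    "up 5.12 Gb/s, FEC12":   (5120, 128, 102, 24, 12),
    "up 10.24 Gb/s, FEC5":   (10240, 256, 232, 20, 10),
    "up 10.24 Gb/s, FEC12":  (10240, 256, 204, 48, 24),
}


def efficiency_percent(frame_bits, data_bits):
    """Efficiency = number of data bits / number of frame bits, in percent."""
    return round(100 * data_bits / frame_bits)


def frame_rate_mhz(rate_mbps, frame_bits):
    """Frames per microsecond, i.e. the frame clock in MHz."""
    return rate_mbps / frame_bits


for name, (rate, frame, data, fec, corr) in FRAMES.items():
    print(f"{name}: {efficiency_percent(frame, data)}% efficient, "
          f"{frame_rate_mhz(rate, frame):.0f} MHz frame rate")
```

In every column of the table the number of correctable bits is half the number of FEC bits.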

#### lpGBT: Up link: Serializer



# lpGBT: Up link: Serializer



- The power per serialization level remains constant:

$$P_L = 10.24 \; \mathrm{GHz} \times P_0$$

- The serializer power is:

$$P = 10 \times 10.24 \; \mathrm{GHz} \times P_0$$

  - Ten levels are needed to go from 40 MHz to 10.24 GHz
- Although this architecture requires "twice" as many FFs (1023) as a simple shift-register-based serializer (512), it only requires 10/257 = 3.9% of the power consumption!

- Since the clock frequency increases with the serialization level, it is possible to optimize the speed-power at each level (Not possible for the simple shift register)
- The lpGBT does not use the last "resampling" stage at 10.24 GHz.
  - At the output of the last MUX the signal is already at 10.24 Gb/s (Double data rate)
  - Low jitter requires thus the last MUX to be fast and the 5.12 GHz clock to have "perfect" duty-cycle
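
The "constant power per level" statement can be illustrated with a toy model of a binary multiplexing tree. The sketch assumes that level k holds 2^k flip-flops clocked at the output frequency divided by 2^k, and that dynamic power scales as flip-flop count times clock frequency times P0. Both are assumptions of the example (the exact per-level counts and clocks are not given in the slides); they were chosen because ten such levels add up to the quoted 1023 flip-flops.

```python
def tree_serializer(levels=10, f_out_ghz=10.24):
    """Return (flip-flop count, clock in GHz) per level, fastest first.

    Assumption: level k holds 2**k flip-flops clocked at f_out / 2**k.
    """
    return [(2 ** k, f_out_ghz / 2 ** k) for k in range(levels)]


def level_powers(stages):
    """Power of each level in units of P0 * GHz (count times frequency)."""
    return [count * freq for count, freq in stages]


stages = tree_serializer()
powers = level_powers(stages)
print("flip-flops:", sum(count for count, _ in stages))
print("power per level (P0*GHz):", powers[0])
print("total power (P0*GHz):", round(sum(powers), 2))
```

Halving the clock each time the flip-flop count doubles keeps every level at 10.24 GHz × P0 in this model, so the total is ten times that figure.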

#### lpGBT: Up link: Line Driver




### lpGBT: Up link: Line Driver Topology



# lpGBT: Up link: Line driver simulations (@5.12Gbps)



#### lpGBT: Analog and other peripherals



# lpGBT: Analog peripherals



- 10 bit ADC
  - Core: fully differential SAR
  - 8 channels (single ended or differential)
  - Voltage amplifier (x1 .. x32)
  - Sampling rate up to ~1 MSps (limited by the control channel)
  - Monitoring of internal signals (like VDD)
- 12 bit voltage DAC
- 8 bit current DAC
  - Can be attached to any analog input
  - Range: 0-1 mA (8 bit)
- Temperature sensor

### lpGBT: Other features/blocks

- 28 multi mode deserializers:
  - Data rates: 160Mbps, 320Mbps, 640Mbps, 1280Mbps
- 16 multi mode serializers:
  - Data rates: 80Mbps, 160Mbps, 320Mbps
- 28 independent frequency programmable eLink clocks
  - Frequency: 40 MHz to 1.28 GHz
  - Two phases: 0 and 180 deg
- 4 independent phase/frequency programmable clocks:
  - Frequency: 40 MHz to 1.28 GHz
  - Phase: 50ps resolution
- 3 independent I2C master channels
- 16 PIO
  - Configurable pull up/down
- Power on reset generator
- Brown out detector
  - Programmable levels : 0.7, 0.75, ... 1.1V
- Watchdog
- SEU calibrator
- Process monitors (ring oscillators)
- Test pattern generators and checkers
- Configuration memory (eFuses)
  - ... ~450 8-bit registers

# **lpGBT:** Status

#### All blocks ready

| Blocks                    | Blocks               |
|---------------------------|----------------------|
| 10 Gb/s line driver       | ePort Rx DLL         |
| 2.5 Gb/s line receiver    | ePort Rx Delay Lines |
| HS LVT ELT Library        | ePort TX             |
| CML Library               | ePort Clk            |
| I/O Library               | IC Channel           |
| Bandgap                   | I2C Master           |
| Power On Reset            | I2C Slave            |
| 10-bit SAR ADC            | Frame Alignment      |
| 8-bit ADC                 | Lock Control, PLL    |
| Temperature Sensor        | Lock Control, CDR    |
| Programmable Current      | Auto Reset           |
| 5/10 Gb/s Serializer      | Power Up FSM         |
| Deserializer              | Watchdog             |
| PLL/CDR                   | Timeout              |
| CML Divider by 2          | Scrambler            |
| Eye Scan                  | FEC Codec            |
| Loopback Multiplexer      | Self Test BERT Tx    |
| Fine Phase Shifter        | Descrambler          |
| Coarse Phase Shifter      | Frame Deinterleaver  |
| ePort Deserializer        | Self Test BERT Rx    |

#### Integration in progress

- Integration of high speed blocks in the full custom macro
- Top level PNR : advanced (almost DRC and LVS clean)
- Whole chip timing, power verification

#### lpGBT layout



#### lpGBT IR drop



#### lpGBT: Package

- Small Footprint BGA package:
  - Size: 9 mm x 9 mm x 1.25 mm
  - Fine Pitch: 0.5 mm
  - Pin count: 289 (17 x 17)
- Designed by STATS ChipPAC
- Routing of high speed signals optimized and simulated
  - Very small loss @ 10GHz
  - models used for line driver simulations
- Status: ready for production





#### lpGBT: Test Board and test firmware



#### PCB

- 2 High Count FMC connectors
  - Access to ALL signals (79x2 differential, 4xHigh speed, 42 CMOS, 10 Analog)
  - Compatible with VC707
- Low jitter PLL for timing reference
- Software configurable LDOs with current monitoring
- Full custom socket already produced
- Status: final routing and simulation of high speed nets



#### Firmware

- IP Bus based
- High speed part already tested
- Slow control interfaces ready (I2C)

### lpGBT Project Schedule

- Q2 2018:
  - Currently working towards the ASIC tapeout
    - MPW: June 2018
  - Package procurement and engineering completed
  - Test system currently under development
- Q4 2018:
  - Prototypes available (~300 ASICs)
- Q4 2018 to Q3 2019:
  - Prototype functional testing
  - Radiation qualification
  - Production testing development
- Q4 2019:
  - Engineering run
- Q3 2020:
  - Engineering ASICs (~40k) available to the users
- Q3 2021:
  - Production completed (~100k ASICs)