VIPIC-Large 3D ASIC

Fermilab: Farah Fahim, Grzegorz Deptuch, Alpana Shenai
AGH-UST: Piotr Maj, Piotr Kmon, Paweł Grybos, Robert Szczygieł
BNL: D. Peter Siddons, Abdul Rumaiz, Anthony Kuczewski, Joseph Mead
ANL: Rebecca Bradford, John Weizeorick
Presentation Outline

- Camera development Goal
- Analog and Digital ASIC pixel
  - Analog pixel functionality
  - Digital pixel functionality
- Digital sub-chip
- Digital Readout
  - Readout modes
  - Data transfer and data integrity
- Digital Floor-planning and assembly
- Conclusions and future work
VIPIC (Vertically Integrated Photon Counting IC): Single Module 3D Camera

Ultimate GOAL: To build a single module 3D camera with minimal gap

- Contains a 6 x 6 array of large ASICs, each with ~37,000 pixels (1.3 Mpixel camera module)
- LTCC hosts FPGA based processing units
- Approximately 1 FPGA per chip, with single pixel gap in the assembly
- High yield requires identification of known good dies before stacking

VIPIC-Large BES detector project - BNL- Fermilab- Argonne
Front End Electronics 2016
Analog & Digital Pixel
VIPIC: Analog Pixel functionality (single repeatable indivisible unit)

- Preamplifier with leakage current compensation (up to 5 nA)
- Shaper (band-pass filter, peaking time 150 ns)
- 2 comparators to create a window discriminator
- 7-bit trimming DACs for threshold offset correction
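The window discriminator and threshold trimming listed above can be illustrated with a minimal Python sketch. It is a behavioural model only, not the circuit: the function names, the mid-scale-centred trim code, and the `lsb` step size are assumptions made for illustration, and amplitudes are in arbitrary units.

```python
def window_discriminator(amplitude, low_threshold, high_threshold):
    """Accept a pulse only when it lies between the two thresholds."""
    above_low = amplitude > low_threshold     # output of the first comparator
    above_high = amplitude > high_threshold   # output of the second comparator
    return above_low and not above_high


def trimmed_threshold(global_threshold, trim_code, lsb):
    """Shift a threshold by a 7-bit trim code (0..127).

    Assumption: code 64 means zero offset and each step moves the
    threshold by `lsb`; the real DAC transfer function may differ.
    """
    if not 0 <= trim_code <= 127:
        raise ValueError("trim code must fit in 7 bits")
    return global_threshold + (trim_code - 64) * lsb


if __name__ == "__main__":
    low = trimmed_threshold(100.0, 66, 0.5)    # 101.0
    high = trimmed_threshold(200.0, 64, 0.5)   # 200.0
    for amplitude in (50.0, 150.0, 250.0):
        print(amplitude, window_discriminator(amplitude, low, high))
```

Only the middle pulse falls inside the window; the other two are rejected by one comparator or the other.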
Preamplifier

Post-layout simulation: input pulse Qin = 2200 electrons, different gain settings
VIPIC: Digital Pixel functionality

- Hit processor (double discriminator logic, ensures hits are counted only once within the correct frame)
- Two 7-bit Gray code counters for dead-time-less readout

- Asynchronous, clockless pseudo-digital block with feedback paths
- Compact custom digital library
- Compact custom layout
- Created an abstract and characterized as a core cell (10x std cell)
Hit processor

- Hit processor is a single stage pipeline
- Hit processor employs a Seitz arbiter to avoid a race condition between a ‘hit’ signal and change of frame ‘frameClk’
- The Hit processor also sends a readRequest to the priority encoder and receives the selectPixel
7b Gray code counters

- Counters are not cycled if the pixel is not ‘hit’, to save power (low occupancy)
- If a ‘hit’ arrives after the frame change but before the pixel is read, it is counted by setting the 1st bit of the counter
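The pairing of two 7-bit Gray code counters for dead-time-less readout can be sketched in Python as follows. This is a behavioural illustration under stated assumptions: the class and method names are hypothetical, the counter is assumed to wrap at 128 (it may saturate in the real design), and the arbitration between `hit` and `frameClk` is not modelled.

```python
def binary_to_gray(value):
    """Convert a binary count to Gray code (adjacent codes differ by one bit)."""
    return value ^ (value >> 1)


def gray_to_binary(gray):
    """Invert binary_to_gray by folding in successive right shifts."""
    value = 0
    while gray:
        value ^= gray
        gray >>= 1
    return value


class PixelCounters:
    """Two 7-bit counters: one counts hits while the other waits to be read."""

    MASK = 0x7F  # 7 bits

    def __init__(self):
        self.counts = [0, 0]
        self.active = 0  # index of the counter collecting the current frame

    def hit(self):
        # Only a hit advances the counter; idle pixels do no switching.
        self.counts[self.active] = (self.counts[self.active] + 1) & self.MASK

    def frame_change(self):
        # Swap roles so counting continues while the old frame is read out.
        self.active ^= 1

    def read_previous_frame(self):
        idle = self.active ^ 1
        value = binary_to_gray(self.counts[idle])
        self.counts[idle] = 0  # cleared after readout, ready for the next swap
        return value


if __name__ == "__main__":
    pixel = PixelCounters()
    for _ in range(3):
        pixel.hit()
    pixel.frame_change()
    pixel.hit()  # belongs to the new frame
    gray = pixel.read_previous_frame()
    print(format(gray, "07b"), gray_to_binary(gray))
```

Because consecutive Gray codes differ in a single bit, a value sampled while the counter is changing is off by at most one count.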
Digital sub-chip
High speed output serializer for off-chip data transfer at 400Mbps
LVDS output - 1 per sub-chip (36 per chip)
The configuration register is a long shift register chain (21,511 bits) that provides the serial interface for programming the ASIC.

Each pixel contains 21 bits, of which 19 need to be sent to the analog pixel across the bonding interface.

An additional 7 bits are used for global programming of the ASIC, such as readout mode selection.

Once the shift register is programmed, its contents are copied to a shadow register.

Powers on to a default state

**Configuration register**

<table>
<thead>
<tr>
<th>No.</th>
<th>Name</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>Gain control b0</td>
<td>Preamplifier Gain Control b0</td>
</tr>
<tr>
<td>1</td>
<td>Gain control b1</td>
<td>Preamplifier Gain Control b1</td>
</tr>
<tr>
<td>2</td>
<td>Gain control b2</td>
<td>Preamplifier Gain Control b2</td>
</tr>
<tr>
<td>3</td>
<td>trim DAC1 b0</td>
<td>Trimming DAC 1 Control b0</td>
</tr>
<tr>
<td>4</td>
<td>trim DAC1 b1</td>
<td>Trimming DAC 1 Control b1</td>
</tr>
<tr>
<td>5</td>
<td>trim DAC1 b2</td>
<td>Trimming DAC 1 Control b2</td>
</tr>
<tr>
<td>6</td>
<td>trim DAC1 b3</td>
<td>Trimming DAC 1 Control b3</td>
</tr>
<tr>
<td>7</td>
<td>trim DAC1 b4</td>
<td>Trimming DAC 1 Control b4</td>
</tr>
<tr>
<td>8</td>
<td>trim DAC1 b5</td>
<td>Trimming DAC 1 Control b5</td>
</tr>
<tr>
<td>9</td>
<td>trim DAC1 b6</td>
<td>Trimming DAC 1 Control b6</td>
</tr>
<tr>
<td>10</td>
<td>trim DAC2 b0</td>
<td>Trimming DAC 2 Control b0</td>
</tr>
<tr>
<td>11</td>
<td>trim DAC2 b1</td>
<td>Trimming DAC 2 Control b1</td>
</tr>
<tr>
<td>12</td>
<td>trim DAC2 b2</td>
<td>Trimming DAC 2 Control b2</td>
</tr>
<tr>
<td>13</td>
<td>trim DAC2 b3</td>
<td>Trimming DAC 2 Control b3</td>
</tr>
<tr>
<td>14</td>
<td>trim DAC2 b4</td>
<td>Trimming DAC 2 Control b4</td>
</tr>
<tr>
<td>15</td>
<td>trim DAC2 b5</td>
<td>Trimming DAC 2 Control b5</td>
</tr>
<tr>
<td>16</td>
<td>trim DAC2 b6</td>
<td>Trimming DAC 2 Control b6</td>
</tr>
<tr>
<td>17</td>
<td>calibrateB</td>
<td>Analog Pixel in Calibration Mode</td>
</tr>
<tr>
<td>18</td>
<td>reset</td>
<td>Digital reset bit</td>
</tr>
<tr>
<td>19</td>
<td>set</td>
<td>Digital set bit</td>
</tr>
<tr>
<td>20</td>
<td>btr</td>
<td>Range select</td>
</tr>
</tbody>
</table>
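The register length follows from the numbers above: a 32 x 32 sub-chip has 1024 pixels, and 1024 x 21 + 7 = 21,511 bits. The Python sketch below packs one pixel's 21 bits using the bit numbers from the configuration register table and assembles a full chain. The function names are hypothetical, and the ordering of pixels within the chain and the position of the 7 global bits (placed last here) are assumptions, since the slides do not state them.

```python
PIXELS_PER_SUBCHIP = 32 * 32
BITS_PER_PIXEL = 21
GLOBAL_BITS = 7
CHAIN_LENGTH = PIXELS_PER_SUBCHIP * BITS_PER_PIXEL + GLOBAL_BITS  # 21,511


def pack_pixel_config(gain, trim_dac1, trim_dac2,
                      calibrate_b=0, reset=0, set_bit=0, btr=0):
    """Return the 21 per-pixel bits as a list, index = bit number in the table."""
    fields = [
        (gain, 3),         # bits 0-2: preamplifier gain control
        (trim_dac1, 7),    # bits 3-9: trimming DAC 1
        (trim_dac2, 7),    # bits 10-16: trimming DAC 2
        (calibrate_b, 1),  # bit 17: calibration mode
        (reset, 1),        # bit 18: digital reset
        (set_bit, 1),      # bit 19: digital set
        (btr, 1),          # bit 20: range select
    ]
    bits = []
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
        bits.extend((value >> i) & 1 for i in range(width))  # b0 first
    return bits


def build_chain(pixel_configs, global_bits):
    """Concatenate per-pixel bits, then the global bits (assumed order)."""
    if len(pixel_configs) != PIXELS_PER_SUBCHIP or len(global_bits) != GLOBAL_BITS:
        raise ValueError("expected 1024 pixel configs and 7 global bits")
    chain = []
    for config in pixel_configs:
        chain.extend(config)
    chain.extend(global_bits)
    return chain


if __name__ == "__main__":
    pixel = pack_pixel_config(gain=5, trim_dac1=64, trim_dac2=64)
    chain = build_chain([pixel] * PIXELS_PER_SUBCHIP, [0] * GLOBAL_BITS)
    print(len(chain))
```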
Priority Encoder

- No global read strobe signal
- readOutControl is generated by the output serializer and uses the same path as the readRequest, in the opposite direction, to enable a pixel
- The pixel with the lowest priority is selected when no valid data exists; its counter value is ‘0’, which differentiates it from an actual hit
- No pull-up or pull-down
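The selection rule above can be modelled with a short Python sketch. It is a software analogy for the behaviour, not the asynchronous circuit: the function names are hypothetical, and treating index 0 as the highest priority is an assumption.

```python
def priority_encode(read_requests):
    """Return the address of the highest-priority pixel with a readRequest.

    Assumption: lower index means higher priority. When no pixel requests
    readout, the lowest-priority pixel (last index) is selected.
    """
    for address, requested in enumerate(read_requests):
        if requested:
            return address
    return len(read_requests) - 1


def read_out(read_requests, counters):
    """Select one pixel, return (address, count) and clear its request."""
    address = priority_encode(read_requests)
    if read_requests[address]:
        value = counters[address]
        read_requests[address] = False
        counters[address] = 0
    else:
        value = 0  # no valid data: a zero count marks the word as empty
    return address, value


if __name__ == "__main__":
    requests = [False, True, False, True]
    counts = [0, 3, 0, 5]
    for _ in range(3):
        print(read_out(requests, counts))
```

The third word in this example carries the lowest-priority address with a count of 0, which a receiver can discard because a genuine hit always has a non-zero count.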
Digital Readout Modes: Changing output data packets
High speed communication: Readout modes

Option to flexibly program the number of bits sent out of the ASIC to optimize bandwidth usage

<table>
<thead>
<tr>
<th>MODE</th>
<th>Description</th>
<th>No. of bits per data packet (time to readout)</th>
<th>Data output rate per VIPIC-L</th>
</tr>
</thead>
<tbody>
<tr>
<td>000</td>
<td>Default: Sparsification mode with 3 bit synchronization word</td>
<td>20 (50ns)</td>
<td>720M word/s</td>
</tr>
<tr>
<td>001</td>
<td>Sparsification mode without synchronization word</td>
<td>17 (42.5ns)</td>
<td>847M word/s</td>
</tr>
<tr>
<td>010</td>
<td>Sparsification mode without synchronization word with reduced counter bits (only bit [1:0] for fast frames)</td>
<td>12 (30ns)</td>
<td>1.2G word/s</td>
</tr>
<tr>
<td>011</td>
<td>Imaging mode full readout</td>
<td>7 (17.5ns)</td>
<td>56k frames/s</td>
</tr>
<tr>
<td>100</td>
<td>Synchronization mode: 40'b000001111100010001011111000001110111010</td>
<td>20 (50ns)</td>
<td>720M word/s</td>
</tr>
<tr>
<td>101</td>
<td>Sparsification mode without synchronization word with reduced counter bits (only bit [3:0] for fast frames)</td>
<td>14 (35ns)</td>
<td>1.03G word/s</td>
</tr>
<tr>
<td>110</td>
<td>Imaging mode full readout with reduced counter bits</td>
<td>5 (12.5ns)</td>
<td>78k frames/s</td>
</tr>
<tr>
<td>111</td>
<td>Synchronization mode: 40'b01010101011110001111010111010100001100010</td>
<td>20 (50ns)</td>
<td>720M word/s</td>
</tr>
</tbody>
</table>
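The timing and rate columns of the table can be reproduced from the 400 MHz serializer clock, the 36 sub-chips per chip, and the 1024 pixels per sub-chip. The Python sketch below shows the arithmetic; the function names are illustrative, and the imaging-mode formula assumes every pixel of a sub-chip is read back to back through its one serializer with no overhead, which is an inference that happens to match the table.

```python
SERIALIZER_CLK_HZ = 400e6      # one bit every 2.5 ns
SUBCHIPS_PER_CHIP = 36         # one LVDS output per sub-chip
PIXELS_PER_SUBCHIP = 32 * 32


def packet_time_ns(bits_per_packet):
    """Time to shift one data packet out of a sub-chip."""
    return bits_per_packet * 1e9 / SERIALIZER_CLK_HZ


def chip_word_rate(bits_per_packet):
    """Words per second summed over all sub-chip outputs of one chip."""
    return SUBCHIPS_PER_CHIP * SERIALIZER_CLK_HZ / bits_per_packet


def imaging_frame_rate(bits_per_pixel):
    """Frames per second when all pixels of a sub-chip are read in turn."""
    return SERIALIZER_CLK_HZ / (bits_per_pixel * PIXELS_PER_SUBCHIP)


if __name__ == "__main__":
    for bits in (20, 17, 14, 12):
        print(bits, packet_time_ns(bits), round(chip_word_rate(bits) / 1e6), "Mword/s")
    for bits in (7, 5):
        print(bits, packet_time_ns(bits), round(imaging_frame_rate(bits) / 1e3), "kframes/s")
```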
Sparsification Mode with 3 bit synchronisation

- Pixel matrix divided into 2 halves with 512 pixels each
- Single stage pipeline, to allow sufficient time for data to settle
- `readOutControl` pulses generated by serializer

[Timing diagram. Signals shown: `locationPointer[0:40]`, `readOutControlBot` (annotated 100ns), `dataOutputReg[0:19]`, `addressBot[0:8]` (marked not valid), `readOutControlTop`, `dataOutputReg[20:39]`, `addressTop[0:8]` (marked not valid)]
Sparsification Mode with reduced counter bits
Managing data transfer and maintaining data integrity
Readout strategy for synchronous readout with user defined slow frame change

frameClk: external slow clock; its rising edge indicates a change of frame.
readoutControl: generated by the output serializer to enable data transfer from a pixel to the serializer register. The signal is interleaved between the two 512-pixel banks: it is broadcast alternately to the top pixel matrix and then to the bottom pixel matrix for pixel selection. Its pulse width is 2.5 ns, corresponding to the 400 MHz serializerClk. The time between readoutControl pulses is set by the readout mode, according to the number of output bits.
selectPixel(n): enables the contents of the counter and lets the priority encoder establish the pixel address. Effectively, this is the readoutControl signal as seen by the pixel, controlled by the priority encoder. The negative edge of readoutControl enables the pixel and the positive edge disables it. The next negative edge of readoutControl selects a new pixel, the next in the priority list established by the priority encoder.
loadSerializer: latches data alternately from the top and bottom pixel matrix. It is issued just before the next pixel is selected, so that the data has sufficient time to settle before being latched.
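The interleaving of readoutControl between the two banks can be illustrated with a small Python schedule generator. This is a sketch under assumptions: the function name is hypothetical, one pulse is assumed per packet time (number of output bits x 2.5 ns), and the top bank is assumed to be served first.

```python
def readout_control_schedule(bits_per_packet, n_pulses, clk_period_ns=2.5):
    """Return (start_time_ns, bank) for successive readoutControl pulses.

    Assumption: pulses are spaced by one packet time and alternate
    between the top and bottom 512-pixel banks, top first.
    """
    packet_ns = bits_per_packet * clk_period_ns
    banks = ("top", "bottom")
    return [(i * packet_ns, banks[i % 2]) for i in range(n_pulses)]


if __name__ == "__main__":
    for start_ns, bank in readout_control_schedule(20, 4):
        print(f"{start_ns:6.1f} ns  readoutControl -> {bank}")
```

Under these assumptions each bank sees a pulse every two packet times, so a selected pixel has roughly one full packet time for its counter and address to settle before loadSerializer latches them.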
Potential Readout errors

Case II: frameClk: change of frame, before the readoutControl pulse

- Problem timing case: data from pixel ‘n’ of the current priority list is not latched and data from pixel ‘a’ from the new priority list is not yet settled
- readoutControl: signal generated by output serializer to enable data transfer from pixel to the shift register
- selectPixel (n): last pixel of current frame
- selectPixel (a): first pixel of new frame
- loadSerializer
  - Corrupted Output latched @ posedge
  - Pixel disabled and counters reset @ posedge of frameClk: Before latching data (data-loss)
  - Pixel latched before data and address is settled (wrong data)
Corrected Readout strategy

Case III: frameClk: change of frame corrected

- Signals shown: readoutControlTop, readoutControlBot
- frameClk: repositioned after the following edge of original readoutControlTop
- readoutControlTop: corrected
  - selectPixel (n): last pixel of current frame
    - Pixel 'n' enabled and priority encoder generates address @ negedge
  - selectPixel (a): first pixel of next frame
  - loadSerializerTop
    - pixel 'n' Output latched @ posedge
    - no pixel selected 'known data pattern' latched @ posedge
    - pixel 'a' Output latched @ posedge
- readoutControlBot: corrected
  - Pulse width readjusted
  - selectPixelBot (n): last pixel of current frame
    - Pixel 'n' enabled and priority encoder generates address @ negedge
  - selectPixelBot (a): first pixel of new frame
  - loadSerializerBot
    - pixel 'n' Output latched @ posedge
    - no pixel selected 'known data pattern' latched @ posedge
    - pixel 'a' Output latched @ posedge
Floor planning and assembly
VIPIC : Assembly Specification

- Pixel size: 65µm x 65µm, with separate digital and analog tiers
- Chip size: 1.248cm x 1.248cm (+ 5µm periphery)
- Array size: 192 x 192 (36,864 pixels)
- Each chip is subdivided into a 6 x 6 array of sub-chips

**SUBCHIP**
- Sub-chip size: 32 x 32 array of pixels (~2.1mm x 2.1mm)
- Each sub-chip has independent I/O and biasing
- Two sub-chips are combined to share certain common signals
- Total no. of PADs for 2 sub-chips: 5 x 8 = 40 (distributed across the back of the ASIC, unlike traditional I/O PADs)

**Communication I/O per ASIC**
- Total no of pads: 40 x 18 = 720

**PADs: (not in the periphery as in a traditional ASIC)**
- Horizontal pitch: 520µm
- Vertical pitch: 416µm
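The assembly numbers above are mutually consistent, which the following Python sketch checks. The constant names are illustrative. Reading the 5 x 8 pad count as 8 columns at the horizontal pitch and 5 rows at the vertical pitch is an assumption; under it, the pad grid spans exactly one double sub-chip.

```python
PIXEL_PITCH_UM = 65
PIXELS_PER_CHIP_SIDE = 192
PIXELS_PER_SUBCHIP_SIDE = 32
PAD_ROWS, PAD_COLUMNS = 5, 8     # assumed orientation of the 5 x 8 pad array
PAD_PITCH_H_UM = 520
PAD_PITCH_V_UM = 416

chip_side_um = PIXELS_PER_CHIP_SIDE * PIXEL_PITCH_UM          # 12,480 um = 1.248 cm
subchip_side_um = PIXELS_PER_SUBCHIP_SIDE * PIXEL_PITCH_UM    # 2,080 um
subchips_per_side = PIXELS_PER_CHIP_SIDE // PIXELS_PER_SUBCHIP_SIDE
subchips_per_chip = subchips_per_side ** 2                    # 36
double_subchips = subchips_per_chip // 2                      # 18
pads_per_double_subchip = PAD_ROWS * PAD_COLUMNS              # 40
total_pads = pads_per_double_subchip * double_subchips        # 720

# Footprint covered by the pad grid of one double sub-chip.
pad_grid_width_um = PAD_COLUMNS * PAD_PITCH_H_UM              # 4,160 um
pad_grid_height_um = PAD_ROWS * PAD_PITCH_V_UM                # 2,080 um

if __name__ == "__main__":
    print("chip side (um):", chip_side_um)
    print("sub-chip side (um):", subchip_side_um)
    print("pads per ASIC:", total_pads)
    print("pad grid (um):", pad_grid_width_um, "x", pad_grid_height_um)
```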
VIPIC: Analog- Digital interconnectivity

- 36 bonding interface connections in each pixel
- 20 signals for setup from Digital to Analog
- 2 strobe signals for selecting individual pixels
- 2 outputs of comparator from Analog to digital
- The bonding interface connects Metal 9 (Analog ASIC pixel) – Metal 9 (Digital ASIC pixel):
  - Fixed location in a pixel (5µm pitch) 13 x 13 array
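A short Python tally of the interconnect figures above: 36 connections in each of 36,864 pixels gives about 1.33 million tier-to-tier bonds per chip, and a 13 x 13 grid at 5µm pitch spans exactly one 65µm pixel. The constant names are illustrative. The three signal groups listed account for 24 of the 36 connections; the sketch only reports the remainder and does not assume what it is used for.

```python
CONNECTIONS_PER_PIXEL = 36
PIXELS_PER_CHIP = 192 * 192

SETUP_SIGNALS = 20        # digital to analog
STROBE_SIGNALS = 2        # pixel selection
COMPARATOR_OUTPUTS = 2    # analog to digital

listed_signals = SETUP_SIGNALS + STROBE_SIGNALS + COMPARATOR_OUTPUTS
unlisted_connections = CONNECTIONS_PER_PIXEL - listed_signals

total_interconnects = CONNECTIONS_PER_PIXEL * PIXELS_PER_CHIP

GRID_SIDE = 13
GRID_PITCH_UM = 5
grid_sites = GRID_SIDE * GRID_SIDE          # candidate bond locations per pixel
grid_span_um = GRID_SIDE * GRID_PITCH_UM    # equals the pixel pitch

if __name__ == "__main__":
    print("listed signals per pixel:", listed_signals)
    print("connections not itemised:", unlisted_connections)
    print("interconnects per chip:", total_interconnects)
    print("grid sites per pixel:", grid_sites, "spanning", grid_span_um, "um")
```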
Digital sub-chip assembly

- Output Serializer: 400MHz serializerClk – independent place and route
- Sub-chip assembly: ~150K gates; 20 PADs + 4 mini PADs + LVDS driver & receiver, 38,000 pins (connections to analog pixels)
- Unordered placement of digitalPixels
- Multiple clock domains (configClk (21511b configuration register) + frameClk (frame changes))
Digital double sub-chip view

I/O resource sharing per double sub-chip (40)

PADs: Horizontal pitch: 520µm; Vertical pitch: 416µm

Communication I/O per ASIC (1.248cm x 1.248cm): 40 x 18 = 720

More than 300K transistors in each sub-chip (>10M transistors per Digital IC)
### Top Level assembly

[Floor-plan figure. Labels shown: Sub-chip 32 x 32 pixel (2080µm x 2080µm), VDD/VSS grid, LVDS Driver, SERIALIZER, LVDS Receiver, I/O pad, Pixel bonding interface (mirror image of the analog pixel), Custom digital blocks placed asymmetrically across the sub-chip]

- **120 million transistors**
- **1.33 million interconnections between analog and digital tier**
- **1.248 cm x 1.248 cm**
- **5 µm periphery around the entire ASIC**
Conclusions and Future work

- A process has been established for making large-area 3D cameras using multi-tier ROICs with complex in-pixel processing
- Placing the analog and digital functionality on separate tiers reduces noise coupling through the substrate
- Low-temperature direct bonding to the sensor also reduces noise (due to the reduction in input capacitance)
- First 3D prototype (64 x 64 pixel array) was demonstrated in 2012
- Analog and digital ASICs submitted for manufacture in March 2016
- Camera module to be delivered in early 2017
Questions?