

# 3D Integration: New Opportunities for Speed, Power and Performance

Robert Patti, CTO

rpatti@tezzaron.com

## Why Do We Scale?



Process nodes (figure axis): \>180nm, 130nm, 90nm, 65nm, 45nm, 28nm, 22nm, 16nm

## The Effect of 2.5/3D on Devices



## 3D Stacking Approaches

#### Chip Level

- Ziptronix
- Xan3D
- Vertical Circuits

Amkor: 4S CSP (MCP)

**Irvine Sensors: Stacked Flash**

Samsung: Stacked Flash



#### Device Level

- Stanford
- Besang

#### **Matrix: Vertical TFT**





#### Wafer Level

- Infineon/IBM
- RPI
- ZyCube

#### **Tezzaron**





## Wafer Level Stacking Approaches



# Market Drivers: 3D

| Driver                  | Functionality                          | Technical Parameter #1 | Technical Parameter #2 | Value Indicator      |
|-------------------------|----------------------------------------|------------------------|------------------------|----------------------|
| Stacked NAND Flash      | Cell phones, hard drives, flash drives | Memory density         |                        | High packing density |
| Microprocessor + Memory | Workstations                           | Latency, bandwidth     | Power                  | Execution time       |
| Memory                  | Multiple                               | Density                | Latency                | Varies               |
| Image Sensor            | Cell phones, cameras, automotive       | Quantum efficiency     | Number of pixels       | Image quality        |



## There is a lot of 3D today

- 16Gb NAND flash (2Gx8 chips), Wide Bus DRAM
- Intel: CPU + memory
- OKI: **CMOS Sensor**
- 4 die 65nm interposer (figure labels: 560µ, TSV connections)
- Raytheon/Ziptronix: PIN Detector Device
- IBM: RF Silicon Circuit Board / TSV Logic & Analog
- Toshiba: 3D NAND

## Through-Silicon Via (TSV)

- Via First
- Via Last
- Via at Front end (FEOL)
- Via at Mid line (MOL?)
- Via at Back end (BEOL)



**RPI** 



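## **Span of 3D Integration**

| Domain    | Interconnect density | Notes                                                      |
|-----------|----------------------|------------------------------------------------------------|
| Packaging | 1s/sqmm              | Peripheral I/O: Flash, DRAM, CMOS Sensors                  |
| 3D-ICs    | 100-1,000,000/sqmm   | 1000-10M interconnects/device (figures: IBM/Samsung, IBM)  |
| Wafer Fab | 100,000,000s/sqmm    | Transistor to transistor; the ultimate goal                |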

## TSV Pitch ≠ Area ÷ Number of TSVs

- TSV pitch issue example
  - 1024-bit buses require a lot of space with larger TSVs
  - They connect to the heart and densest area of the processing elements
  - The 45nm bus pitch is ~100nm; TSV pitch is >100x greater
  - The large TSV pitch means time-of-flight (TOF) errors and at least 3 repeater stages


## 3D Interconnect Characteristics

|                                | SuperContact™ I<br>200mm<br>Via First, FEOL | SuperContact™ III<br>200mm<br>Via First, FEOL | SuperContact™ IV<br>200mm<br>Via First, FEOL | Interposer<br>TSV            | Bond Points         | Die to<br>Wafer |
|--------------------------------|---------------------------------------------|-----------------------------------------------|----------------------------------------------|------------------------------|---------------------|-----------------|
| Size<br>L × W × D<br>Material  | 1.2 µ × 1.2 µ<br>× 6.0 µ<br>W in Bulk       | 0.85 µ × 0.85 µ<br>× 10 µ<br>W in Bulk        | 0.60 µ × 0.60 µ<br>× 2 µ<br>W in SOI         | 10 µ × 10 µ<br>× 100 µ<br>Cu | 1.7 µ × 1.7 µ<br>Cu | 3 µ × 3 µ<br>Cu |
| Minimum<br>Pitch               | <2.5 µ                                      | 1.75 µ                                        | 1.2 µ                                        | 30/120 µ                     | 2.4 µ               | 5 µ             |
| Feedthrough<br>Capacitance     | 2-3fF                                       | 3fF                                           | 0.2fF                                        | 250fF                        | <<                  | <25fF           |
| Series<br>Resistance           | <1.5 Ω                                      | <3 Ω                                          | <1.75 Ω                                      | <0.5 Ω                       | ~                   | <               |

Small fine grain TSVs are fundamental to 3D enablement
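To make the pitch argument concrete, the sketch below estimates the footprint of a 1024-bit vertical bus at each minimum pitch from the table. It is only an illustration under stated assumptions: the TSVs are placed in a square grid (a hypothetical layout, not one given in the slides), the "<2.5 µ" entry is taken at its upper bound, and the finer 30 µ value is used for the interposer. The helper name `tsv_array_side_um` is made up for this example.

```python
import math


def tsv_array_side_um(num_tsvs: int, pitch_um: float) -> float:
    """Side length (um) of the smallest square grid holding num_tsvs at pitch_um."""
    per_side = math.ceil(math.sqrt(num_tsvs))
    return per_side * pitch_um


BUS_BITS = 1024
WIRE_PITCH_UM = 0.1  # ~100nm bus pitch at 45nm, as quoted in the slides

# Minimum pitches from the table, in micrometres (assumptions noted inline)
pitches_um = {
    "SuperContact I": 2.5,      # table says "<2.5"; upper bound assumed
    "SuperContact III": 1.75,
    "SuperContact IV": 1.2,
    "Interposer TSV": 30.0,     # table says "30/120"; finer value assumed
    "Bond points": 2.4,
    "Die to wafer": 5.0,
}

# Width of the same bus routed as ordinary on-chip wires
bus_width_um = BUS_BITS * WIRE_PITCH_UM

footprints = {name: tsv_array_side_um(BUS_BITS, p) for name, p in pitches_um.items()}

print(f"On-chip bus width: {bus_width_um:.1f} um")
for name, side in footprints.items():
    ratio = pitches_um[name] / WIRE_PITCH_UM
    print(f"{name:18s} {side:7.1f} um per side, pitch {ratio:.0f}x the wire pitch")
```

Under these assumptions the interposer-class TSV array is close to a millimetre per side, while the fine-grain contacts stay in the tens of micrometres, which is the gap the slide is pointing at.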

## A Closer Look at Wafer-Level Stacking



## Next, Stack a Second Wafer & Thin:



## **Stacking Process Sequential Picture**



#### **High Precision Alignment**

Misalignment = 0.3µm



## Then, Stack a Third Wafer:



## Finally, Flip, Thin & Pad Out:




This is the completed stack!











# Honeywell 0.6um SOI TSV





### 120K TSVs













## DRAM wants 2 different processes!

| Circuits                                        | Requirements                                                                | Process                                              |
|-------------------------------------------------|-----------------------------------------------------------------------------|------------------------------------------------------|
| Bit cells                                       | Low leakage<br>- slow refresh<br>- low power<br>- low GIDL                  | High Vt devices<br>Vneg well<br>Thick oxide          |
| Sense amps<br>Word line drivers<br>Device I/O   | High speed<br>- better sensitivity<br>- better bandwidth<br>- lower voltage | Low Vt devices<br>Copper interconnect<br>Thin oxides |

## "Dis-Integrated" 3D Memory

Memory Layers from DRAM fab

Controller Layer from high speed logic fab



# **Octopus Stack**



DRAM Control/Logic



2 Layer Stacked Device (SEM)



**DRAM Memory Cells** 

## Gen4 "Dis-Integrated" 3D Memory

- DRAM layers: 4Xnm node
- 2 million vertical connections per layer per die
- Controller layer (40nm node) contains: sense amps, CAMs, row/column decodes and test engines
- I/O layer (65nm node) contains: I/O, interface logic and R&R control CPU
- Better yielding than 2D equivalent!

## 3D DRAMs

### Octopus I

- 1-4Gb
- 16 Ports x 128bits (each way)
- @1GHz
  - CWL=0 CRL=2 SDR format
  - 5ns closed page access to first data (aligned)
  - 12ns full cycle memory time
  - >2Tb/s data transfer rate
- Max clk=1.6GHz
- Internally ECC protected, Dynamic self-repair, Post attach repair
- 115C die full function operating temperature

### Octopus II

- 4-64Gb
- 64-256 Ports x 64bits (each way)
- @1GHz
  - 5-7ns closed page access to first data (aligned)
  - 12ns full cycle memory time
  - >16Tb/s data transfer rate
  - 4096 banks
  - 2+2pJ/bit
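The quoted transfer rates can be sanity-checked from the port counts and widths. The sketch below assumes one transfer per bit per clock (the SDR format is stated only for Octopus I and is assumed here for Octopus II as well), counts a single direction, and uses the largest Octopus II port count. The function name `peak_bandwidth_tbps` is introduced for this illustration only.

```python
def peak_bandwidth_tbps(ports: int, bits_per_port: int, clock_ghz: float) -> float:
    """Peak one-direction transfer rate in Tb/s, assuming one transfer per clock."""
    gigabits_per_second = ports * bits_per_port * clock_ghz
    return gigabits_per_second / 1000.0


# Octopus I: 16 ports x 128 bits at 1GHz
octopus_1 = peak_bandwidth_tbps(ports=16, bits_per_port=128, clock_ghz=1.0)

# Octopus II: up to 256 ports x 64 bits at 1GHz (upper end of the 64-256 range)
octopus_2 = peak_bandwidth_tbps(ports=256, bits_per_port=64, clock_ghz=1.0)

print(f"Octopus I : {octopus_1:.3f} Tb/s")
print(f"Octopus II: {octopus_2:.3f} Tb/s")
```

With these assumptions the results come out just above 2Tb/s and 16Tb/s, in line with the ">2Tb/s" and ">16Tb/s" figures on the slides.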













## The Industry Issue



Need 50x bandwidth improvement.

Need 10x better cost model than embedded memory.

- To continue to increase CPU performance, exponential bandwidth growth is required.
- More than 200 CPU cycles of delay to memory results in cycle-for-cycle CPU stalls.
  - 16 to 64 Mbytes per thread are required to hide CPU memory system accesses.
- No current extension of existing IC technology can address the requirements.
- Memory I/O power is running away.

## **Main Memory Power Cliff**

- DDR3: ~40mW per pin
  - 1024 data pins → 40W
  - 4096 data pins → 160W
- Die on Wafer: ~24µW per pin
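The power figures follow from multiplying the per-pin power by the pin count. The short sketch below reproduces that arithmetic and applies the same pin counts to the die-on-wafer figure; the die-on-wafer totals are derived here and are not stated on the slide. Names such as `io_power_watts` are illustrative.

```python
DDR3_W_PER_PIN = 40e-3           # ~40mW per pin, from the slide
DIE_ON_WAFER_W_PER_PIN = 24e-6   # ~24uW per pin, from the slide


def io_power_watts(data_pins: int, watts_per_pin: float) -> float:
    """Total I/O power for a bus, assuming every pin draws the same power."""
    return data_pins * watts_per_pin


# Map pin count -> (DDR3 watts, die-on-wafer watts)
results = {
    pins: (
        io_power_watts(pins, DDR3_W_PER_PIN),
        io_power_watts(pins, DIE_ON_WAFER_W_PER_PIN),
    )
    for pins in (1024, 4096)
}

for pins, (ddr3_w, dow_w) in results.items():
    print(f"{pins} pins: DDR3 {ddr3_w:.1f} W, die on wafer {dow_w * 1e3:.1f} mW")

# Per-pin improvement factor of die-on-wafer over DDR3
ratio = DDR3_W_PER_PIN / DIE_ON_WAFER_W_PER_PIN
print(f"Per-pin power ratio: about {ratio:.0f}x")
```

The DDR3 totals round to the 40W and 160W shown on the slide, while the same bus widths on die-on-wafer connections stay well under a watt.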







## The "Killer" App: Split-Die

Embedded Performance with far



## Die to Wafer With BCB Template





RPI Effort under Dr. James Lu

- KGD (known good die)
- 2µm alignment / 5µm pitch limit
- Cu-Cu thermocompression bonding
- Multilayer capability







# Hyper-Integration 5-9 layer stacks



Transistor count (figure labels): 5.5B, 3.1B, 3B

## **Current Memory Split-Die Projects**



## R8051/Memory

5X performance at 1/10<sup>th</sup> the power

























RF, Imaging, Processing, Analog









#### **Tezzaron Dummy Chip C2C Assembly**



Memory die



C2C sample



X-ray inspection indicated no significant solder voids



X-section of good micro bump



CSCAN showed no underfill voids (UF: Namics 8443-14)

#### Near End-of-Line




# New Apps – New Architectures







#### 3D Place & Route



#### 3D LVS using QuartzLVS from Magma

#### Key features

- LVS each of the 2D designs as well as the 3D interconnections between them in a single run
- Driven by a 3D "tech file" that specifies the number and order of layers, interconnect material, etc
- TSV aware LVS extraction
- Full debug environment to analyze any LVS mismatch



Screenshot: Quartz DRC / LVS / DFM debug environment, showing schematic and layout net details for cell SEDFCNQD1.

```
# 3D LVS Tech file
WAFER: 1

LAYOUT TOP BLOCK: lvslayer1_1
SCHEMATIC TOP BLOCK: lvslayer1
GDSII FILE: lvslayer1_1.gds
SCHEMATIC NETLIST: lvslayer1.sp
INTERFACE UP METAL: 1;0
INTERFACE UP TEXT: 1;101
...
INTERFACE:
LAYOUT TOP BLOCK: lvstop
SCHEMATIC TOP BLOCK: lvstop
GDSII FILE: lvstop_ALL.gds
SCHEMATIC NETLIST: lvstop.sp
BOND OUT METAL: 5;0
BOND OUT TEXT: 5;101
```
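As a rough illustration of how such a listing is organized, the sketch below reads the `KEY: value` lines into one dictionary per section. It is not QuartzLVS code and does not reflect the tool's real parser: the grouping rule (a new section starts at each `WAFER:` or `INTERFACE:` line) is an assumption inferred from the excerpt above, and `parse_tech_file` is a hypothetical helper.

```python
from typing import Dict, List, Optional

# Assumed section-opening keys, inferred from the excerpt (not from tool docs)
SECTION_KEYS = ("WAFER", "INTERFACE")

SAMPLE = """\
# 3D LVS Tech file
WAFER: 1

LAYOUT TOP BLOCK: lvslayer1_1
SCHEMATIC TOP BLOCK: lvslayer1
GDSII FILE: lvslayer1_1.gds
SCHEMATIC NETLIST: lvslayer1.sp
INTERFACE UP METAL: 1;0
INTERFACE UP TEXT: 1;101
...
INTERFACE:
LAYOUT TOP BLOCK: lvstop
SCHEMATIC TOP BLOCK: lvstop
GDSII FILE: lvstop_ALL.gds
SCHEMATIC NETLIST: lvstop.sp
BOND OUT METAL: 5;0
BOND OUT TEXT: 5;101
"""


def parse_tech_file(text: str) -> List[Dict[str, str]]:
    """Group KEY: value lines into sections; skips blanks, comments and '...'."""
    sections: List[Dict[str, str]] = []
    current: Optional[Dict[str, str]] = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or line == "...":
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key in SECTION_KEYS:
            # Start a new section, remembering its kind and identifier
            current = {"section": key, "id": value}
            sections.append(current)
            continue
        if current is None:
            # Settings before any section header go into an unnamed section
            current = {"section": "", "id": ""}
            sections.append(current)
        current[key] = value
    return sections


sections = parse_tech_file(SAMPLE)
for sec in sections:
    print(sec["section"], sec["id"], "->", sec.get("GDSII FILE"))
```

Each wafer section names its own layout, netlist and interface layers, and the final `INTERFACE` section describes the top-level block and bond-out layers, which is what lets a single run check both the 2D designs and the connections between them.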

### **Challenges**

- Tools
  - Partitioning tools
  - 3D P&R
- Access
- Heat
- Testing
  - IEEE 1500
  - IEEE 1149
- Standards
  - Die level
    - JEDEC JC-11 Wide bus memory
  - Foundry interface





### 3D Key to Enable Next Gen



- Simplified Technology
- Real Reuse
- Optimized Process
- IP isolation

#### Tezzaron/Novati 3D Technologies



- "Volume" 2.5D and 3D Manufacturing in 2013
- Interposers
- Future interposers with
  - High K Caps
  - Photonics
  - Passives
  - Power transistors
- Wholly owned Tezzaron subsidiary
- Cu-Cu, DBI®, Oxide, IM 3D assembly

## 2.5/3D Design Enablement

- Complete 3D PDK 8<sup>th</sup> Release
  - GF 130nm
  - Synopsys, Hspice, Cadence, MicroMagic 3D physical editor
  - Calibre 3D DRC/LVS
  - Artisan standard cell libraries
- MOSIS, CMP, and CMC MPW support
  - 130nm, coming soon 65nm
  - Silicon Workbench
- Honeywell 150nm SOI
- NEOL TSV insertion
- 40nm → 28nm 3D logic
- Silicon interposers, active, photonics
- Organic interposers
- >100 devices in process
- >500 users



## **Summary**

- 3D has numerous and vast opportunities!!
  - New design approaches
  - New ways of thinking
  - Best of class integration of
    - Memory
    - Logic
    - RF
    - MEMS

Application areas (figure labels): Sensors, Computing, MEMS, Communications

