Zynq® UltraScale+™ MPSoC and Road to Versal

CERN – June 12th, 2019
Xilinx Product Families: A Broad Portfolio

SoC Portfolio

Cost-Optimized → Mid-Range

ZYNQ
Zynq-7000
SoC
System Optimized for Power & Cost

Mid-Range → High-End

ZYNQ
UltraSCALE+
Heterogeneous MPSoC
Right Engines for the Right Tasks

FPGA & 3D IC Portfolio

Cost-Optimized

SPARTAN
Lowest Power & Cost

Mid-Range

KINTEX
Price/Performance/Watt

High-End

VIRTEX
Performance & Capacity
Agenda

- Zynq-7000 Momentum
- Zynq® UltraScale+™ MPSoC Product
- Product Tables
- Road to Versal
Zynq-7000 SoC Momentum

Leading Customers

- Cisco
- Samsung
- ZTE
- National Instruments
- Siemens
- Alcatel-Lucent
- Bosch
- Raytheon
- Ericsson

Industry Awards

- 3200+ Design Wins
- 200+ Ecosystem Partners
- 15,000+ Dev Kits Sold
- 60+ Types of Dev Kits Available

Other Awards:

- ACE Awards
- Electron d'Or
- Zynq Choice
- Elektra 2011 Winner
- EDN Innovation Award
SoC - Zynq®-7000

Single-Core
766MHz
Artix-7 FPGA Fabric

Dual-Core
800MHz
Artix-7 FPGA Fabric

Dual Core
1GHz
Kintex-7 FPGA Fabric

Integrated Memory Mapped Peripherals
  • e.g. USB2.0, GigE

Integrated Analog
  • Dual multi-channel 12-bit ADC
  • Up to 1Msps
  • Temp & Voltage sensors

Extensive IP Portfolio
  • Standardized AXI4 interfaces
  • Enables peripheral expansion
  • Includes software drivers

High Bandwidth Memory
  • L1/L2 CPU Caches
  • Dedicated On-Chip Memory (OCM)
  • DDR3, DDR2, LPDDR2 w/ ECC

Tightly Coupled Domains
  • 3000+ PS/PL interconnects
  • Low Latency
  • Up to 100Gb/s of bandwidth
The First Multiprocessing SoC (MPSoC)

- Heterogeneous Processing Architecture
- 64-bit Processing with Terabyte Address Space
- Domain-Focused Acceleration Engines
Why Heterogeneous?

[Figure: assignment of Non-Critical Tasks and Critical Tasks to processors. Labels: Linux on a General Purpose Processor; RTOS for Real-Time Processing; Network Interface; Motor Control (Real-Time Response); Interrupt; Compute-Intensive.]
## Application Processing System: ARM Cortex-A53

<table>
<thead>
<tr>
<th>Feature</th>
<th>Benefit</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>ARMv8-A architecture, Multicore Cortex-A53 up to 1.5 GHz</strong></td>
<td>• 64-bit increases compute capability while maintaining 32-bit compatibility<br>• ARM’s most power-efficient A5x APU &amp; most widely used 64-bit processor<br>• 1 terabyte physical address space<br>• 2.7X performance/watt (DMIPS) vs. predecessor (processor comparison only)</td>
</tr>
<tr>
<td><strong>NEON Technology</strong></td>
<td>SIMD engine accelerates multimedia, signal &amp; image processing algorithms</td>
</tr>
<tr>
<td><strong>Floating-Point Unit (FPU)</strong></td>
<td>• Hardware support for FP operations in half-, single- and double-precision<br>• IEEE754-2008 compliant (current Floating Point standard)</td>
</tr>
<tr>
<td><strong>Hardware Virtualization</strong></td>
<td>Enables multiple SW environments &amp; apps simultaneous access to system resources</td>
</tr>
</tbody>
</table>
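
The "1 terabyte physical address space" figure can be related to an address width with a little arithmetic. The sketch below assumes a binary terabyte (2^40 bytes) and compares it with a 32-bit address space; the function name `address_bits` is illustrative and not taken from any Xilinx or ARM tool.

```python
# Assumption: "1 terabyte" is read as a binary terabyte, 2**40 bytes.
ONE_TERABYTE = 1 << 40
FOUR_GIGABYTES = 1 << 32  # reach of a 32-bit physical address


def address_bits(space_bytes: int) -> int:
    """Return the number of address bits needed to index `space_bytes` bytes."""
    if space_bytes < 1:
        raise ValueError("address space must contain at least one byte")
    return (space_bytes - 1).bit_length()


pa_bits = address_bits(ONE_TERABYTE)
growth = ONE_TERABYTE // FOUR_GIGABYTES

print(f"physical address bits: {pa_bits}")
print(f"address space vs. 32-bit: {growth}x")
```

Under that assumption the address is 40 bits wide, which is 256 times the reach of a 32-bit address.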

### Application Processing Unit

[Figure: Application Processing Unit diagram, with a chart comparing Cortex-A9 and Cortex-A53 performance and power. Annotations: 2.7X DMIPS; Negligible Power Increase.]
## Real-Time Processing System: ARM Cortex-R5

<table>
<thead>
<tr>
<th>Feature</th>
<th>Benefit</th>
</tr>
</thead>
<tbody>
<tr>
<td>ARMv7-R Architecture, up to 600MHz</td>
<td>• Flagship ARM series for deterministic processing in critical real-time operation</td>
</tr>
<tr>
<td></td>
<td>• Offloads APU to perform compute-intensive tasks, reducing overall system power</td>
</tr>
<tr>
<td></td>
<td>• Supports Real-Time Operating Systems (RTOS) or Bare Metal</td>
</tr>
<tr>
<td>Dual-Core for Multi-Mode Operation</td>
<td>• Split-Mode with each real-time core operating autonomously</td>
</tr>
<tr>
<td></td>
<td>• Lock-Step Mode for fault tolerance and fault detection, doubles TCM to 256KB</td>
</tr>
<tr>
<td>128KB Memory with ECC</td>
<td>• Tightly coupled with processor for deterministic and low-latency response</td>
</tr>
<tr>
<td></td>
<td>• Ideal for critical code structures such as interrupt service routines</td>
</tr>
<tr>
<td>Safety Certifiable</td>
<td>• Industry-proven to meet safety-critical standards</td>
</tr>
<tr>
<td></td>
<td>• e.g., IEC 61508 (industrial) and ISO 26262 (automotive)</td>
</tr>
</tbody>
</table>
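
The difference between split mode and lock-step mode can be illustrated with a coarse software analogy. The sketch below is not how the Cortex-R5 hardware works (real lock-step compares the two cores continuously in hardware); it only shows the idea of redundant execution with a comparison, plus the 128KB / 256KB TCM figures from the table. All names are hypothetical.

```python
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")

TCM_PER_CORE_KB = 128  # figure taken from the table above


class LockstepMismatch(RuntimeError):
    """Raised when the two redundant executions disagree."""


def tcm_available_kb(lock_step: bool) -> int:
    """TCM visible to one software context: 128KB in split mode, 256KB in lock-step."""
    return 2 * TCM_PER_CORE_KB if lock_step else TCM_PER_CORE_KB


def run_lock_step(task: Callable[..., T], *args) -> T:
    """Run `task` twice and compare the results (software analogy of lock-step)."""
    primary = task(*args)
    shadow = task(*args)
    if primary != shadow:
        raise LockstepMismatch(f"results differ: {primary!r} != {shadow!r}")
    return primary


def run_split(task_a: Callable[[], T], task_b: Callable[[], T]) -> Tuple[T, T]:
    """Split mode: two independent tasks, one per core, with no cross-checking."""
    return task_a(), task_b()


if __name__ == "__main__":
    print(run_lock_step(lambda x: x * 2, 21))
    print(run_split(lambda: "control loop", lambda: "comms stack"))
    print(tcm_available_kb(lock_step=True))
```

The trade-off mirrors the table: split mode gives two autonomous workers, while lock-step gives up that parallelism in exchange for fault detection.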

### Diagram

- **Real-Time Processing Unit**
  - ARM Cortex™-R5
  - Vector Floating Point Unit
  - Memory Protection Unit
  - 128 KB TCM w/ECC
  - 32 KB I-Cache w/ECC
  - 32 KB D-Cache w/ECC
  - GIC

- **Split Mode**
  - (Autonomous Operation)

- **Single OS**
  - Real-Time Processor
  - Real-Time Processor
  - 256KB TCM
Different Classes of Graphics Processing Units

- High Performance Graphics
  - Gaming, 3D Vision, & 4K Display

- General Purpose GPU
  - Data Center Acceleration and High Performance Computing

- Power Optimized Graphics
  - Embedded Graphics

- Hardware Acceleration

- Power-Optimized GPU for Embedded Graphics
- Programmable Logic for Accelerated Compute

OpenCL

Massive Parallelism
## ARM-Based Graphics Processor

<table>
<thead>
<tr>
<th>Feature</th>
<th>Benefit</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>ARM Mali™-400 MP2 up to 667MHz</strong></td>
<td>• Most power-optimized ARM GPU with Full HD support (1080p)</td>
</tr>
<tr>
<td></td>
<td>• Ideal for 2D vector graphics and 3D graphics (e.g., HMI, waveform processing)</td>
</tr>
<tr>
<td></td>
<td>• Supports open standards, e.g., OpenGL ES 1.1 &amp; 2.0</td>
</tr>
<tr>
<td><strong>Native Embedded Linux Support</strong></td>
<td>Out-of-the-box drivers and libraries for graphics support</td>
</tr>
<tr>
<td><strong>Dual Pixel Processors</strong></td>
<td>• Up to 1.3 GPixel/s fill rate for smoother transition and frame rate</td>
</tr>
<tr>
<td></td>
<td>• Up to 20 GFLOPS shader rate for complex 3D scenes</td>
</tr>
<tr>
<td><strong>Optimized Memory Interface</strong></td>
<td>Tightly coupled w/memory controller for efficient communication with DisplayPort controller</td>
</tr>
</tbody>
</table>
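
To put the 1.3 GPixel/s fill rate in perspective, the rough calculation below estimates how many times each pixel of a Full HD frame could be drawn per frame. The 60 fps target is an assumption chosen for illustration, and the result is a theoretical ceiling that ignores memory bandwidth and shader cost.

```python
FILL_RATE_PIXELS_PER_S = 1.3e9   # peak fill rate quoted in the table
FULL_HD = (1920, 1080)           # Full HD resolution quoted in the table


def overdraw_budget(fps: float, resolution=FULL_HD) -> float:
    """Average number of times each pixel can be drawn per frame at `fps`."""
    width, height = resolution
    pixels_per_second_needed = width * height * fps
    return FILL_RATE_PIXELS_PER_S / pixels_per_second_needed


if __name__ == "__main__":
    # Assumption: a 60 fps display refresh target.
    print(f"overdraw budget at 60 fps: {overdraw_budget(60):.1f}x")
```

At the assumed 60 fps this works out to a budget of roughly ten layers per pixel.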

### Full HD (1920x1080) GLmark2 Benchmark

[Figure: ARM Mali™-400 MP2 block diagram (Geometry Processor, Pixel Processor, Memory Management Unit, 64 KB L2 Cache) and a GLmark2 chart comparing GPU and APU on Performance (fps) and Power (mW). Annotations: 50x; Similar Power.]
## Integrated H.264 / H.265 Video Codec Engine

<table>
<thead>
<tr>
<th>Feature</th>
<th>Benefit</th>
</tr>
</thead>
<tbody>
<tr>
<td>Integrated Video Codec Unit</td>
<td>• Up to 4K UHD (60 fps) or 8Kx4K (15 fps)</td>
</tr>
<tr>
<td></td>
<td>• Up to 8 simultaneous streams</td>
</tr>
<tr>
<td></td>
<td>• Flexible memory topology to enable scalable system performance</td>
</tr>
<tr>
<td>Power Management, Performance Monitoring</td>
<td>• Clock gating (dynamic savings), power gating (static/dynamic savings)</td>
</tr>
<tr>
<td></td>
<td>• Measure task execution time, bandwidth, and latency for fast design optimization</td>
</tr>
</tbody>
</table>
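
The two headline limits, 4K UHD at 60 fps and 8Kx4K at 15 fps, can be compared as raw pixel rates. The sketch below assumes 3840x2160 for 4K UHD and 7680x4320 for 8Kx4K (the slide does not state exact resolutions), and it assumes for illustration that the budget could be split evenly across the 8 simultaneous streams.

```python
# Assumed resolutions; the slide only names the formats.
UHD_4K = (3840, 2160)
UHD_8K = (7680, 4320)
FULL_HD = (1920, 1080)


def pixel_rate(resolution, fps: int) -> int:
    """Pixels per second for a stream of the given resolution and frame rate."""
    width, height = resolution
    return width * height * fps


rate_4k60 = pixel_rate(UHD_4K, 60)
rate_8k15 = pixel_rate(UHD_8K, 15)
per_stream = rate_4k60 // 8  # hypothetical even split across 8 streams

print(rate_4k60, rate_8k15)
print(per_stream, pixel_rate(FULL_HD, 30))
```

Under these assumptions both limits describe the same pixel throughput, and an eighth of it matches one 1080p stream at 30 fps.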

![Diagram showing integrated video codec engine components: Camera, Video Stream Across Ethernet, Memory Controller, Encoder, Decoder, DisplayPort, Ethernet.]
# Memory Solution within Processing System

<table>
<thead>
<tr>
<th>Feature</th>
<th>Benefit</th>
</tr>
</thead>
<tbody>
<tr>
<td>Dedicated DDR Memory Controller</td>
<td>Integrated in processing system for lower power usage and reduced latency</td>
</tr>
<tr>
<td>6 AXI Ports For Shared System Access</td>
<td>Multi-ported controller enables PS and PL shared access to common memory</td>
</tr>
<tr>
<td>32/64-bit Configurable Widths w/ECC</td>
<td>Supports varying data widths from processing engines</td>
</tr>
<tr>
<td>256KB On-Chip Memory (OCM) w/ECC</td>
<td>• Low latency memory decreases cost for additional external memory</td>
</tr>
<tr>
<td></td>
<td>• Shareable by Cortex-A53s, Cortex-R5s, and programmable logic</td>
</tr>
<tr>
<td>Tightly Coupled Memory (TCM)</td>
<td>Low-latency, deterministic memory access for Cortex-R5s in functional safety applications</td>
</tr>
</tbody>
</table>

## Supported Interfaces in Processing System

<table>
<thead>
<tr>
<th>Interface</th>
<th>Max Bandwidth (Mb/s)</th>
</tr>
</thead>
<tbody>
<tr>
<td>DDR4</td>
<td>2400*</td>
</tr>
<tr>
<td>LPDDR4</td>
<td>2400</td>
</tr>
<tr>
<td>DDR3</td>
<td>2133</td>
</tr>
<tr>
<td>DDR3L</td>
<td>1866</td>
</tr>
<tr>
<td>LPDDR3</td>
<td>1800</td>
</tr>
</tbody>
</table>

*DDR4 up to 2,667Mb/s in Programmable Logic
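
The figures in the table can be turned into a theoretical peak bandwidth for a given bus width. The sketch below assumes the table values are per-pin data rates in Mb/s, uses the 32/64-bit widths listed earlier, and ignores ECC bits, refresh, and protocol overhead, so real throughput will be lower.

```python
# Data rates in Mb/s, copied from the table above.
# Assumption: these are per-pin rates.
PS_DATA_RATE_MBPS = {
    "DDR4": 2400,
    "LPDDR4": 2400,
    "DDR3": 2133,
    "DDR3L": 1866,
    "LPDDR3": 1800,
}


def peak_bandwidth_gbytes(interface: str, bus_width_bits: int) -> float:
    """Theoretical peak bandwidth in GB/s (decimal) for one memory interface."""
    if bus_width_bits not in (32, 64):
        raise ValueError("the PS controller is described as 32- or 64-bit wide")
    rate_mbps = PS_DATA_RATE_MBPS[interface]
    return rate_mbps * bus_width_bits / 8 / 1000


if __name__ == "__main__":
    for name in PS_DATA_RATE_MBPS:
        print(f"{name:7s} x64: {peak_bandwidth_gbytes(name, 64):.2f} GB/s")
```

For example, DDR4 at 2400 Mb/s on a 64-bit bus gives a ceiling of 19.2 GB/s.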

## Massive Interconnect Bandwidth for Hardware Acceleration & Coherency

<table>
<thead>
<tr>
<th>Feature</th>
<th>Benefit</th>
</tr>
</thead>
<tbody>
<tr>
<td>High Performance AXI4 ports</td>
<td>• Twelve 128-bit ports based on the open interface standard</td>
</tr>
<tr>
<td></td>
<td>• 6,000 interconnects between processing system &amp; fabric to avoid multi-chip I/O limitations</td>
</tr>
<tr>
<td>Accelerated Coherency Port (ACP)</td>
<td>Single port for direct access into APU snoop control unit and L2 cache</td>
</tr>
<tr>
<td>Coherent AXI Interfaces</td>
<td>Two ports for coherent memory access between a DMA device and the A53 (CCI)</td>
</tr>
<tr>
<td>AXI Coherency Extensions (ACE)</td>
<td>Single bi-directional port for coherent memory access between a coherent master &amp; A53 (CCI)</td>
</tr>
</tbody>
</table>

![Diagram of AXI4 Ports (6,000 Interconnects) and Coherency Extensions](image-url)
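
A back-of-the-envelope estimate shows how the twelve 128-bit AXI4 ports add up. The sketch below assumes one transfer per port per clock in a single direction and a hypothetical 300 MHz interface clock; the slide gives no clock figure, so the resulting number is illustrative only.

```python
AXI_PORTS = 12          # from the table above
PORT_WIDTH_BITS = 128   # from the table above


def bytes_per_cycle(ports: int = AXI_PORTS, width_bits: int = PORT_WIDTH_BITS) -> int:
    """Bytes moved per clock if every port transfers one full-width beat."""
    return ports * width_bits // 8


def aggregate_gbytes_per_s(clock_hz: float) -> float:
    """Theoretical aggregate bandwidth in GB/s at the given interface clock."""
    return bytes_per_cycle() * clock_hz / 1e9


if __name__ == "__main__":
    assumed_clock_hz = 300e6  # hypothetical clock, not taken from the slide
    print(f"{AXI_PORTS * PORT_WIDTH_BITS} bits per cycle")
    print(f"{aggregate_gbytes_per_s(assumed_clock_hz):.1f} GB/s at 300 MHz")
```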
### Integrated Peripherals within Processing System

<table>
<thead>
<tr>
<th>Feature</th>
<th>Benefit</th>
</tr>
</thead>
<tbody>
<tr>
<td>Integrated in the Processing System</td>
<td>• Eliminate external components to lower BOM cost</td>
</tr>
<tr>
<td></td>
<td>• Lower latency access to processing elements at less power</td>
</tr>
<tr>
<td></td>
<td>• Instant-on operation (e.g., PCIe)</td>
</tr>
<tr>
<td>Integrated 6G Transceivers</td>
<td>• Direct access to processing elements for immediate operation</td>
</tr>
<tr>
<td>Display Controller</td>
<td>• DisplayPort for up to 4K x 2K @ 30fps</td>
</tr>
<tr>
<td></td>
<td>• Enables Alpha blending of graphics and video</td>
</tr>
<tr>
<td>Xilinx Peripheral Protection Units</td>
<td>• Ensures secure data access on all peripherals</td>
</tr>
</tbody>
</table>

#### General Protocol Support
- CAN
- SPI
- UART
- SD/eMMC
- USB

#### High-Speed Protocol Support
- USB 3.0
- USB 2.0
- DisplayPort
- Gigabit Ethernet
### Platform & Power Management

<table>
<thead>
<tr>
<th>Feature</th>
<th>Benefit</th>
</tr>
</thead>
<tbody>
<tr>
<td>Multiple Power Domains</td>
<td>• Granular architecture enabling block-level power management</td>
</tr>
<tr>
<td></td>
<td>• Eliminate static power of unused blocks</td>
</tr>
<tr>
<td>Fault-Tolerant Platform Management Unit (PMU)</td>
<td>• Automatic memory BIST, logic BIST, and clock management on start-up (ROM)</td>
</tr>
<tr>
<td></td>
<td>• Enables extensible runtime power management (RAM)</td>
</tr>
<tr>
<td></td>
<td>• Systematic power coordination between processing elements for reliable shutdown &amp; resume</td>
</tr>
</tbody>
</table>
## Flexible Device Boot and Device Configuration
### Configuration & Security Unit (CSU)

<table>
<thead>
<tr>
<th>Feature</th>
<th>Benefit</th>
</tr>
</thead>
<tbody>
<tr>
<td>Flexible Boot Architecture</td>
<td>• Primary boot from Quad SPI Flash, NAND Flash, SD 3.0, or eMMC</td>
</tr>
<tr>
<td>Dedicated Subsystem</td>
<td>• Fault tolerant device boot: secure and non-secure</td>
</tr>
<tr>
<td></td>
<td>• Dedicated decryption (AES-256) &amp; authentication (4096-bit RSA key, SHA3 hash functions) engines</td>
</tr>
<tr>
<td></td>
<td>• Native fault-tolerant multi-boot, including golden image support</td>
</tr>
<tr>
<td></td>
<td>• Independent anti-tamper protection</td>
</tr>
<tr>
<td>Post Boot Security Engines</td>
<td>• Access to hardened encryption engines post boot</td>
</tr>
</tbody>
</table>
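
The idea behind fault-tolerant multi-boot with a golden image can be sketched as a simple selection loop. This is a conceptual illustration only, not the CSU's actual behaviour: a SHA3-384 digest check stands in for the real RSA authentication, the choice of SHA3-384 is an assumption (the slide only says "SHA3 hash functions"), and every name below is hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class BootImage:
    name: str
    payload: bytes
    expected_digest: str  # hex digest recorded when the image was built


def make_image(name: str, payload: bytes) -> BootImage:
    """Build an image whose recorded digest matches its payload."""
    return BootImage(name, payload, hashlib.sha3_384(payload).hexdigest())


def is_intact(image: BootImage) -> bool:
    """Integrity check standing in for full authentication."""
    return hashlib.sha3_384(image.payload).hexdigest() == image.expected_digest


def select_boot_image(candidates: Sequence[BootImage], golden: BootImage) -> BootImage:
    """Return the first intact candidate, falling back to the golden image."""
    for image in candidates:
        if is_intact(image):
            return image
    if is_intact(golden):
        return golden
    raise RuntimeError("no valid boot image found")


if __name__ == "__main__":
    golden = make_image("golden", b"known-good firmware")
    good_digest = hashlib.sha3_384(b"update v2").hexdigest()
    corrupted = BootImage("update", b"update v2 (corrupted)", good_digest)
    print(select_boot_image([corrupted], golden).name)
```

A corrupted update is skipped and the golden image is chosen instead, which is the recovery behaviour the table describes.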

### Example of Boot Sequence (e.g., Real-Time Processor First)

[Figure: boot timeline. Rows: Platform Management; Config/Security Unit; Real-Time Processor; Application Processor; Programmable Logic. Stage labels: Release CSU; Load FSBL; FSBL; ATF; U-Boot; OS; RTOS / Bare Metal; Bitstream; Tamper Monitoring; Power Monitoring. Horizontal axis: Time.]
## Dedicated Engines for Security, Safety, Reliability

<table>
<thead>
<tr>
<th>Feature</th>
<th>Benefit</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Device-Level Secure Processing</strong></td>
<td>• Information Assurance, Anti-Tamper, Trust</td>
</tr>
<tr>
<td></td>
<td>• Multi-layered Authentication for Secure System Boot</td>
</tr>
<tr>
<td></td>
<td>• Key &amp; Vault Management</td>
</tr>
<tr>
<td><strong>Safety Capabilities and Supported Standards</strong></td>
<td>• IEC61508 &amp; ISO26262 Functional Safety Standards</td>
</tr>
<tr>
<td></td>
<td>• Redundancy, Diversity and Lock-step</td>
</tr>
<tr>
<td></td>
<td>• Layered Partitioning: Core / Infrastructure / Peripherals</td>
</tr>
<tr>
<td><strong>High Reliability Features</strong></td>
<td>• Error Detection &amp; Mitigation</td>
</tr>
<tr>
<td></td>
<td>• Subsystem Isolation &amp; Protection</td>
</tr>
</tbody>
</table>
## Integrated Direct-RF Data Converters in Zynq UltraScale+ RFSoC

<table>
<thead>
<tr>
<th>Feature</th>
<th>Benefit</th>
</tr>
</thead>
<tbody>
<tr>
<td>4GSPS or 2GSPS ADCs with 12-bit Resolution</td>
<td>• 4GHz of direct-RF bandwidth</td>
</tr>
<tr>
<td>6.4GSPS DACs with 14-bit Resolution</td>
<td></td>
</tr>
<tr>
<td>RF-Sampling with Full DSP Subsystem</td>
<td>• RF-design in programmable digital domain, reducing external analog components</td>
</tr>
<tr>
<td></td>
<td>• Full Digital Down-Conversion (DDC) and Up-Conversion (DUC)</td>
</tr>
<tr>
<td></td>
<td>• Optionally bypass subsystem to programmable logic for custom mixing &amp; filtering</td>
</tr>
<tr>
<td>Based on 16nm FinFET+</td>
<td>• Optimal performance-per-watt and at least two process nodes ahead of latest generation discrete components</td>
</tr>
<tr>
<td>Dedicated Communication-Grade PLLs</td>
<td>• Leverage lower frequency external clock to drive high speed converters</td>
</tr>
<tr>
<td>Multi-Band Support</td>
<td>• Enable flexible carrier aggregation through a single RF signal chain</td>
</tr>
</tbody>
</table>
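
Direct RF sampling relies on the relationship between the input frequency and the converter's sample rate. The sketch below computes the Nyquist zone and the apparent (aliased) frequency for a 4 GSPS ADC; it assumes the "4GHz of direct-RF bandwidth" refers to analog input bandwidth, so that signals above the 2 GHz Nyquist frequency can be captured by undersampling. Function names are illustrative.

```python
ADC_SAMPLE_RATE_HZ = 4e9  # 4 GSPS ADC from the table above


def nyquist_zone(f_in_hz: float, fs_hz: float = ADC_SAMPLE_RATE_HZ) -> int:
    """1-based Nyquist zone: zone 1 spans 0..fs/2, zone 2 spans fs/2..fs, etc."""
    return int(f_in_hz // (fs_hz / 2)) + 1


def alias_frequency(f_in_hz: float, fs_hz: float = ADC_SAMPLE_RATE_HZ) -> float:
    """Frequency at which the input appears in the first Nyquist zone."""
    folded = f_in_hz % fs_hz
    return folded if folded <= fs_hz / 2 else fs_hz - folded


if __name__ == "__main__":
    for f_in in (1.0e9, 3.5e9):
        zone = nyquist_zone(f_in)
        alias = alias_frequency(f_in)
        print(f"{f_in / 1e9:.1f} GHz -> zone {zone}, appears at {alias / 1e9:.1f} GHz")
```

A 3.5 GHz carrier sampled at 4 GSPS lands in the second zone and appears at 0.5 GHz, where the digital down-conversion stage can process it.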

---

![Diagram](Diagram.png)

- Full DSP Subsystem on 16nm
- RF Sampling
- Multi-Band Support
- Programmable Logic
- DSP-Based Mixing & Filtering
- 4GSPS ADCs
- 6.4GSPS DACs
- Optionally bypass subsystem
- Internal PLL
- Low Frequency External Clock
- Band 1
- Band 2
More than Just Silicon

- Run-Time Software
- System Software
- Reference Designs
- Design Tools
- Emulation & Development Kits
- Rapid Development
Run Time Software

Begin Application Development with Validated OSs
Determine the Right Processor for your Application

OS Ecosystem

- Linux: Mentor, Wind River
- Xen Hypervisor
- Android
- Bare Metal
- eSol eT-kernel
- FreeRTOS
- GreenHills – INTEGRITY
- LynxOS7, LynxSecure
- Mentor Hypervisor, Nucleus
- Micrium - uC/OS-II & III
- QNX
- Sciopta
- Sysgo – PikeOS
- Wind River VxWorks7/Rocket

Processor columns: APU, RPU, MicroBlaze

Legend: Hypervisor Support; Open Source / No Fee; Safety Certifiable; Cross-Processor OS Support; Commercial

[Figure: software stack diagram. Labels: High-Level OSs; Stand-alone Drivers & Libraries; RTOSs; OS, Apps, Kernel; APU; MicroBlaze™ Processor; RPU; Hypervisors.]
## System Software

### Out-of-the-Box Firmware, Drivers, Frameworks

<table>
<thead>
<tr>
<th>Feature</th>
<th>Details</th>
</tr>
</thead>
<tbody>
<tr>
<td>First Stage Boot Loader (FSBL), U-Boot</td>
<td>Out-of-the-box boot loaders</td>
</tr>
<tr>
<td>Security Firmware</td>
<td>Decryption, authentication, and secure boot</td>
</tr>
<tr>
<td>ARM® Trusted Firmware</td>
<td>Open-source firmware to boot secure OS, leverage ARMv8-A virtualization features</td>
</tr>
<tr>
<td>Software Test Libraries</td>
<td>Leverages BIST for RPU, provided for functional safety</td>
</tr>
<tr>
<td>Power Management Framework</td>
<td>Standard APIs for power management</td>
</tr>
<tr>
<td>Inter-Processor Framework (OpenAMP)</td>
<td>Framework for inter-OS &amp; inter-processor management &amp; communication (APU &amp; RPU)</td>
</tr>
</tbody>
</table>

**Processing System**

- APU
- Memory
- GPU
- Inter-Processor Framework
- Security Firmware
- RPU
- CSU
- PMU
- System Control
- Peripherals

**Built in Self-Test & Boot Software Test Libraries**
### Embedded Software Development Tools

<table>
<thead>
<tr>
<th>Feature</th>
<th>Benefit</th>
</tr>
</thead>
<tbody>
<tr>
<td>Eclipse-Based IDE</td>
<td>Familiar SW development environment – Xilinx Software Development Kit (SDK)</td>
</tr>
<tr>
<td>Linaro GCC Tool Chain</td>
<td>Industry standard compiler tool chain for Embedded Linux &amp; Bare Metal</td>
</tr>
<tr>
<td>Multi-Core Debug</td>
<td>Debug &amp; cross triggering for Cortex-A53s, Cortex-R5s, and MicroBlaze™ Processor</td>
</tr>
<tr>
<td>Performance Profiling &amp; Analysis</td>
<td>Analyze interfaces across processing and programmable logic domains</td>
</tr>
<tr>
<td>Ecosystem Development Tools</td>
<td>• Broad support for 3rd party dev tools &amp; debug, e.g., ARM DS-5, Lauterbach Trace-32</td>
</tr>
<tr>
<td></td>
<td>• Designers use their preferred development &amp; debug environment</td>
</tr>
</tbody>
</table>

Xilinx Software Development Kit (SDK) for SW Dev and Project, Build, & Tool Chain Management
Design Challenge with SoCs
Multiple Tools, Multiple Disciplines

<table>
<thead>
<tr>
<th>Task</th>
<th>Challenges</th>
</tr>
</thead>
<tbody>
<tr>
<td>SW/HW Partitioning</td>
<td>• Demands system-level expertise</td>
</tr>
<tr>
<td></td>
<td>• Manual process with little tool support</td>
</tr>
<tr>
<td>SW/HW Connectivity</td>
<td>• Error-prone manual hardware IP integration</td>
</tr>
<tr>
<td></td>
<td>• Complexity of data mover IP design</td>
</tr>
<tr>
<td>Design Flow Management</td>
<td>• Iteration when partition doesn’t meet requirement</td>
</tr>
<tr>
<td></td>
<td>• Modification with multiple, inter-dependent groups</td>
</tr>
</tbody>
</table>

[Figure: traditional design flow. Starting from the “Original Source”, define the HW/SW partition. Software track (ASSP or CPU): Algorithm Development; OS / Driver Dev; Application SW Dev. Hardware track (FPGA): RTL Development; IP Integration; Verification. Decision: Met Requirements? (Yes / No).]
## SDx Dev Environment

**ASSP-Like Environment for HW and SW Design**

<table>
<thead>
<tr>
<th>Feature</th>
<th>Benefit</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Pure SW Dev Environment</strong></td>
<td>• From C/C++/OpenCL to fully functional SoC without any HW code</td>
</tr>
<tr>
<td></td>
<td>• Familiar, Eclipse-based, SW-centric design environment</td>
</tr>
<tr>
<td><strong>System Level Profiling</strong></td>
<td>• PS/PL performance, area, data transfer estimates in minutes</td>
</tr>
<tr>
<td></td>
<td>• Quickly identify optimal total system architecture</td>
</tr>
<tr>
<td><strong>Single-Click SW/HW Partitioning</strong></td>
<td>• Quickly explore different system connectivity in C/C++</td>
</tr>
<tr>
<td></td>
<td>• Find data mover and PS-PL interface for optimal dataflow</td>
</tr>
<tr>
<td><strong>System Optimizing Compiler</strong></td>
<td>• Automated function acceleration in logic fabric</td>
</tr>
<tr>
<td></td>
<td>• Generates both ARM software and FPGA bitstream</td>
</tr>
<tr>
<td></td>
<td>• Examine throughput, latency and area tradeoffs</td>
</tr>
</tbody>
</table>

---

**Diagram:**

- Pure SW Dev Environment (Eclipse-Based)
- System-Level Profiling
- Single-Click SW-HW Partitioning
- System Optimizing Compiler
  - ARM Code
  - Connectivity Optimization
  - Function Acceleration

---

“Original Source”

SDSoC Environment
Zynq UltraScale+ Block Diagram

Processing System

Application Processing Unit
- ARM Cortex™-A53
- NEON™ Floating Point Unit
- Memory Management Unit
- Embedded Trace Macrocell
- 32 KB I-Cache w/Parity
- 32 KB D-Cache w/ECC

Real-Time Processing Unit
- ARM Cortex™-R5
- Vector Floating Point Unit
- Memory Protection Unit
- 128 KB TCM w/ECC
- 32 KB I-Cache w/ECC
- 32 KB D-Cache w/ECC

Platform Management Unit
- System Management
- Power Management
- Functional Safety

Memory
- DDR4/3/3L, LPDDR4/3 ECC Support
- 256 KB OCM with ECC
- 1 MB L2 w/ECC

Graphics Processing Unit
- ARM Mali™-400 MP2
- Memory Management Unit
- 64 KB L2 Cache

Configuration and Security Unit
- AES Decryption, Authentication, Secure Boot

System Functions
- Multichannel DMA
- Timers, WDT, Resets, Clocking, & Debug
- Voltage/Temp Monitor
- TrustZone

High-Speed Connectivity (Up to 6Gb/s)
- DisplayPort
- USB 3.0
- SATA 3.1
- PCIe 1.0 / 2.0

General Connectivity
- GigE
- USB 2.0
- CAN
- UART
- SPI
- Quad SPI NOR
- NAND
- SD/eMMC

Programmable Logic

Storage & Signal Processing
- Block RAM
- UltraRAM
- DSP

General-purpose I/O
- High-Performance I/O
- High Density (Low Power) I/O

High-Speed Connectivity
- 16G Transceivers
- 33G Transceivers
- Interlaken
- 100G EMAC
- PCIe® Gen4

Video Codec
- H.265/H.264
- AMS
Extending Scalability Across the Zynq® Portfolio

- **High-End**
  - EG Devices
    - Quad-Core ARM Cortex-A53
    - Dual-Core ARM Cortex-R5
    - ARM Mali™-400 MP2
    - 16nm FinFET+ Logic
  - EV Devices
    - Quad-Core ARM Cortex-A53
    - Dual-Core ARM Cortex-R5
    - H.264/H.265 Video Codec
    - 16nm FinFET+ Logic

- **Mid-Range**
  - Zynq-7000
    - Dual-core ARM Cortex-A9
    - 28nm Kintex-7 FPGA
  - Zynq-7000S
    - Dual-core ARM Cortex-A9
    - 28nm Artix-7 FPGA

- **Cost-Optimized**
  - Zynq-7000S
    - Single-Core ARM® Cortex™-A9
    - 28nm Artix-7 FPGA
  - Zynq-7000
    - Dual-core ARM Cortex-A9
    - 28nm Kintex-7 FPGA

System Integration
Introducing the World’s First ACAP

- Heterogeneous Acceleration
  - For Any Application
  - For Any Developer
Adaptable Architecture Connected Via NoC

> 7nm Technology

> Scalar Engines
  >> Arm® Cortex™-A72 APU
  >> Arm Cortex-R5 RPU

> Adaptable Engines
  >> CLBs
  >> Internal Memory

> Intelligent Engines
  >> AI Engine
  >> DSP Engine

> Connectivity
  >> PCIe w/CCIX
  >> Ethernet
  >> DDR Memory Controllers
  >> Transceivers
  >> I/O

> Platform Resources
  >> Network-On-Chip
  >> Platform Management Controller
Versal™ Prime Series – Device View

[Figure: device floorplan. Labeled blocks: Network-on-Chip (NoC); CCIX & PCIe; CPM; Processing System & PMC; ARM Cortex-A72 Application Processor; ARM Cortex-R5 Real-Time Processor; Platform Management; Programmable Logic; 1GHz DSP & Embedded Memory; Serial Transceivers; Hard Networking IP; Hard DDR Controllers; XPIO.]
Versal™ AI Core Series – Device View

[Figure: device floorplan. Labeled blocks: Network-on-Chip (NoC); CCIX & PCIe; CPM; Processing System & PMC; ARM Cortex-A72 Application Processor; ARM Cortex-R5 Real-Time Processor; Platform Management; AI Engine Array; Programmable Logic; 1GHz DSP & Embedded Memory; Serial Transceivers; Hard Networking IP; Hard DDR Controllers; HDIO; XPIO.]
Technology Leader and Established SoC Vendor

The First Heterogeneous MPSoC

Zynq UltraScale+ Delivers Unprecedented Levels of Integration

Comprehensive Embedded Software and Tools Solution

Next generation of Embedded Processor on 7nm Versal Families
Adaptable.

Intelligent.
# Zynq® UltraScale+™ MPSoCs: CG Devices

## Smarter Control

<table>
<thead>
<tr>
<th>Device Name(1)</th>
<th>ZU2CG</th>
<th>ZU3CG</th>
<th>ZU4CG</th>
<th>ZU5CG</th>
<th>ZU6CG</th>
<th>ZU7CG</th>
<th>ZU9CG</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Application Processor Unit</strong></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Processor Core</td>
<td>Dual-core ARM® Cortex™-A53 MPCore™ up to 1.3GHz</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Memory w/ECC</td>
<td>L1 Cache 32KB I / D per core, L2 Cache 1MB, on-chip Memory 256KB</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td><strong>Real-Time Processor Unit</strong></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Processor Core</td>
<td>Dual-core ARM Cortex-R5 MPCore™ up to 533MHz</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Memory w/ECC</td>
<td>L1 Cache 32KB I / D per core, Tightly Coupled Memory 128KB per core</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td><strong>External Memory</strong></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Dynamic Memory Interface</td>
<td>x32/x64: DDR4, LPDDR4, DDR3, DDR3L, LPDDR3 with ECC</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Static Memory Interfaces</td>
<td>NAND, 2x Quad-SPI</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td><strong>Connectivity</strong></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>High-Speed Connectivity</td>
<td>PCIe® Gen2 x4, 2x USB3.0, SATA 3.1, DisplayPort, 4x Tri-mode Gigabit Ethernet</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>General Connectivity</td>
<td>2xUSB 2.0, 2x SD/SDIO, 2x UART, 2x CAN 2.0B, 2x I2C, 2x SPI, 4x 32b GPIO</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td><strong>Integrated Block Functionality</strong></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Power Management</td>
<td>Full / Low / PL / Battery Power Domains</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Security</td>
<td>RSA, AES, and SHA</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>AMS - System Monitor</td>
<td>10-bit, 1MSPS - Temperature, Voltage, and Current Monitor</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td><strong>PS to PL Interface</strong></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>System Logic Cells (K)</td>
<td>103</td>
<td>154</td>
<td>192</td>
<td>256</td>
<td>469</td>
<td>504</td>
<td>600</td>
</tr>
<tr>
<td>CLB Flip-Flops (K)</td>
<td>94</td>
<td>141</td>
<td>176</td>
<td>234</td>
<td>429</td>
<td>461</td>
<td>548</td>
</tr>
<tr>
<td>CLB LUTs (K)</td>
<td>47</td>
<td>71</td>
<td>88</td>
<td>117</td>
<td>215</td>
<td>230</td>
<td>274</td>
</tr>
<tr>
<td>Max. Distributed RAM (Mb)</td>
<td>1.2</td>
<td>1.8</td>
<td>2.6</td>
<td>3.5</td>
<td>6.9</td>
<td>6.2</td>
<td>8.8</td>
</tr>
<tr>
<td>Total Block RAM (Mb)</td>
<td>5.3</td>
<td>7.6</td>
<td>4.5</td>
<td>5.1</td>
<td>25.1</td>
<td>11.0</td>
<td>32.1</td>
</tr>
<tr>
<td>UltraRAM (Mb)</td>
<td>-</td>
<td>-</td>
<td>14.0</td>
<td>18.0</td>
<td>-</td>
<td>27.0</td>
<td>-</td>
</tr>
<tr>
<td>Clocking</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Clock Management Tiles (CMTs)</td>
<td>3</td>
<td>3</td>
<td>4</td>
<td>4</td>
<td>4</td>
<td>8</td>
<td>4</td>
</tr>
<tr>
<td>DSP Slices</td>
<td>240</td>
<td>360</td>
<td>728</td>
<td>1,056</td>
<td>1,973</td>
<td>1,728</td>
<td>2,520</td>
</tr>
<tr>
<td>PCI Express® Gen 3x16 / Gen4x8</td>
<td>-</td>
<td>-</td>
<td>2</td>
<td>2</td>
<td>-</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>150G Interlaken</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>100G Ethernet MAC/PCS w/RS-FEC</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>AMS - System Monitor</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
</tr>
<tr>
<td><strong>Speed Grades</strong></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Extended(2)</td>
<td>-1 -L2</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Industrial</td>
<td>-1 -L2</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

**Notes:**

1. For full part number details, see the Ordering Information section in DS891, Zynq UltraScale+ MPSoC Overview.
2. 2L (1) = 0°C to 110°C. For more details, see the Ordering Information section in DS891, Zynq UltraScale+ MPSoC Overview.
# Zynq® UltraScale+™ MPSoCs: EG Devices

<table>
<thead>
<tr>
<th>Device Name(1)</th>
<th>ZU2EG</th>
<th>ZU3EG</th>
<th>ZU4EG</th>
<th>ZU5EG</th>
<th>ZU7EG</th>
<th>ZU6EG</th>
<th>ZU9EG</th>
<th>ZU15EG</th>
<th>ZU11EG</th>
<th>ZU17EG</th>
<th>ZU19EG</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Application Processor</strong></td>
<td>Processor Core</td>
<td>Quad-core ARM® Cortex™-A53 MPCore™ up to 1.5GHz</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>Memory w/ECC</td>
<td>L1 Cache 32KB I / D per core, L2 Cache 1MB, on-chip Memory 256KB</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td><strong>Real-Time Processor</strong></td>
<td>Processor Core</td>
<td>Dual-core ARM Cortex-R5 MPCore™ up to 600MHz</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>Memory w/ECC</td>
<td>L1 Cache 32KB I / D per core, Tightly Coupled Memory 128KB per core</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td><strong>Graphical &amp; Video Acceleration</strong></td>
<td>Graphics Processing Unit</td>
<td>Mali™-400 MP2 up to 667MHz</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>Memory</td>
<td>L2 Cache 64KB</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td><strong>External Memory</strong></td>
<td>Dynamic Memory Interface</td>
<td>x32/x64: DDR4, LPDDR4, DDR3, DDR3L, LPDDR3 with ECC</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>Static Memory Interfaces</td>
<td>NAND, 2x Quad-SPI</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td><strong>Connectivity</strong></td>
<td>High-Speed Connectivity</td>
<td>PCIe® Gen2 x4, 2x USB3.0, SATA 3.1, DisplayPort, 4x Tri-mode Gigabit Ethernet</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>General Connectivity</td>
<td>2xUSB 2.0, 2x SD/SDIO, 2x UART, 2x CAN 2.0B, 2x I2C, 2x SPI, 4x 32b GPIO</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td><strong>Integrated Block Functionality</strong></td>
<td>Power Management</td>
<td>Full / Low / PL / Battery Power Domains</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>Security</td>
<td>RSA, AES, and SHA</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>AMS - System Monitor</td>
<td>10-bit, 1MSPS - Temperature, Voltage, and Current Monitor</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td><strong>PS to PL Interface</strong></td>
<td>12 x 32/64/128b AXI Ports</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td><strong>Programmable Logic (PL)</strong></td>
<td>System Logic Cells (K)</td>
<td>103 154 192 256 504</td>
<td>469 600 747 653 926 1,143</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>CLB Flip-Flops (K)</td>
<td>94 141 176 234 461</td>
<td>429 548 682 597 847 1,045</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>CLB LUTs (K)</td>
<td>47 71 88 117 215</td>
<td>215 274 341 299 423 523</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>Max. Distributed RAM (Mb)</td>
<td>1.2 1.8 2.6 3.5 6.2</td>
<td>6.9 8.8 11.3 9.1 8.0 9.8</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>Total Block RAM (Mb)</td>
<td>5.3 7.6 4.5 5.1 11.0</td>
<td>25.1 32.1 26.2 21.1 28.0 34.6</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>UltraRAM (Mb)</td>
<td>- - 14.0 18.0 27.0</td>
<td>- - - 31.5 22.5 28.7 36.0</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>Clock Management Tiles (CMTs)</td>
<td>3 3 4 4 8</td>
<td>4 4 4 8 11 11</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>DSP Slices</td>
<td>240 360 728 1,056 1,728</td>
<td>1,973 2,520 3,528 2,928 1,590 1,968</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td><strong>Integrated IP</strong></td>
<td>PCI Express® Gen 3x16 / Gen4x8</td>
<td>- - 2 2 2</td>
<td>- - - 4 4 5</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>150G Interlaken</td>
<td>- - - - -</td>
<td>- - - 2 2 4</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>100G Ethernet MAC/PCS w/RS-FEC</td>
<td>- - - - -</td>
<td>- - - 1 2 4</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>AMS - System Monitor</td>
<td>1 1 1 1 1</td>
<td>1 1 1 1 1 1</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td><strong>Speed Grades</strong></td>
<td>Extended(2)</td>
<td>-1 -2L -3</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>Industrial</td>
<td>-1 -1L -2</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

Notes:
1. For full part number details, see the Ordering Information section in [DS891](#), Zynq UltraScale+ MPSoC Overview.
2. -2LE (Tj = 0°C to 110°C). For more details, see the Ordering Information section in [DS891](#), Zynq UltraScale+ MPSoC Overview.
Zynq® UltraScale+™ MPSoCs: EV Devices

<table>
<thead>
<tr>
<th>Device Name</th>
<th>ZU4EV</th>
<th>ZU5EV</th>
<th>ZU7EV</th>
</tr>
</thead>
<tbody>
<tr>
<td>Processor Core</td>
<td>Quad-core ARM® Cortex™-A53 MPCore™ up to 1.5GHz</td>
<td></td>
<td></td>
</tr>
<tr>
<td>Memory w/ECC</td>
<td>L1 Cache 32KB I / D per core, L2 Cache 1MB, on-chip Memory 256KB</td>
<td></td>
<td></td>
</tr>
<tr>
<td>Processor Core</td>
<td>Dual-core ARM Cortex-R5 MPCore™ up to 600MHz</td>
<td></td>
<td></td>
</tr>
<tr>
<td>Memory w/ECC</td>
<td>L1 Cache 32KB I / D per core, Tightly Coupled Memory 128KB per core</td>
<td></td>
<td></td>
</tr>
<tr>
<td>Graphics Processing Unit</td>
<td>Mali™-400 MP2 up to 667MHz</td>
<td></td>
<td></td>
</tr>
<tr>
<td>Memory</td>
<td>L2 Cache 64KB</td>
<td></td>
<td></td>
</tr>
<tr>
<td>Dynamic Memory Interface</td>
<td>x32/x64: DDR4, LPDDR4, DDR3, DDR3L, LPDDR3 with ECC</td>
<td></td>
<td></td>
</tr>
<tr>
<td>Static Memory Interfaces</td>
<td>NAND, 2x Quad-SPI</td>
<td></td>
<td></td>
</tr>
<tr>
<td>High-Speed Connectivity</td>
<td>PCIe Gen2 x4, 2x USB3.0, SATA 3.1, DisplayPort, 4x Tri-mode Gigabit Ethernet</td>
<td></td>
<td></td>
</tr>
<tr>
<td>General Connectivity</td>
<td>2x USB 2.0, 2x SD/SDIO, 2x UART, 2x CAN 2.0B, 2x I2C, 2x SPI, 4x 32b GPIO</td>
<td></td>
<td></td>
</tr>
<tr>
<td>Power Management</td>
<td>Full / Low / PL / Battery Power Domains</td>
<td></td>
<td></td>
</tr>
<tr>
<td>Security</td>
<td>RSA, AES, and SHA</td>
<td></td>
<td></td>
</tr>
<tr>
<td>AMS - System Monitor</td>
<td>10-bit, 1MSPS - Temperature, Voltage, and Current Monitor</td>
<td></td>
<td></td>
</tr>
<tr>
<td>PS to PL Interface</td>
<td>12 x 32/64/128b AXI Ports</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>Programmable Logic (PL)</th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<td>System Logic Cells (K)</td>
<td>192</td>
</tr>
<tr>
<td>CLB Flip-Flops (K)</td>
<td>176</td>
</tr>
<tr>
<td>CLB LUTs (K)</td>
<td>88</td>
</tr>
<tr>
<td>Max. Distributed RAM (Mb)</td>
<td>2.6</td>
</tr>
<tr>
<td>Total Block RAM (Mb)</td>
<td>4.5</td>
</tr>
<tr>
<td>UltraRAM (Mb)</td>
<td>14.0</td>
</tr>
<tr>
<td>Clock Management Tiles (CMTs)</td>
<td>4</td>
</tr>
<tr>
<td>DSP Slices</td>
<td>728</td>
</tr>
<tr>
<td>Video Codec Unit (VCU)</td>
<td>1</td>
</tr>
<tr>
<td>PCI Express® Gen 3x16 / Gen4x8</td>
<td>2</td>
</tr>
<tr>
<td>150G Interlaken</td>
<td>-</td>
</tr>
<tr>
<td>100G Ethernet MAC/PCS w/RS-FEC</td>
<td>-</td>
</tr>
<tr>
<td>AMS - System Monitor</td>
<td>1</td>
</tr>
</tbody>
</table>

Notes:
1. For full part number details, see the Ordering Information section in DS891, Zynq UltraScale+ MPSoC Overview.
2. -2LE (Tj = 0°C to 110°C). For more details, see the Ordering Information section in DS891, Zynq UltraScale+ MPSoC Overview.
<table>
<thead>
<tr>
<th>Device Name</th>
<th>ZU21DR</th>
<th>ZU25DR</th>
<th>ZU27DR</th>
<th>ZU28DR</th>
<th>ZU29DR</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Application Processor Unit</strong></td>
<td>Processor Core</td>
<td>Quad-core ARM® Cortex™-A53 MPCore™ up to 1.5GHz</td>
<td>Quad-core ARM® Cortex™-A53 MPCore™ up to 1.5GHz</td>
<td>Quad-core ARM® Cortex™-A53 MPCore™ up to 1.5GHz</td>
<td>Quad-core ARM® Cortex™-A53 MPCore™ up to 1.5GHz</td>
</tr>
<tr>
<td></td>
<td>Memory w/ECC</td>
<td>L1 Cache 32KB I / D per core, L2 Cache 1MB, on-chip Memory 256KB</td>
<td>L1 Cache 32KB I / D per core, L2 Cache 1MB, on-chip Memory 256KB</td>
<td>L1 Cache 32KB I / D per core, L2 Cache 1MB, on-chip Memory 256KB</td>
<td>L1 Cache 32KB I / D per core, L2 Cache 1MB, on-chip Memory 256KB</td>
</tr>
<tr>
<td><strong>Real-Time Processor Unit</strong></td>
<td>Processor Core</td>
<td>Dual-core ARM Cortex-R5 MPCore up to 533MHz</td>
<td>Dual-core ARM Cortex-R5 MPCore up to 533MHz</td>
<td>Dual-core ARM Cortex-R5 MPCore up to 533MHz</td>
<td>Dual-core ARM Cortex-R5 MPCore up to 533MHz</td>
</tr>
<tr>
<td></td>
<td>Memory w/ECC</td>
<td>L1 Cache 32KB I / D per core, Tightly Coupled Memory 128KB per core</td>
<td>L1 Cache 32KB I / D per core, Tightly Coupled Memory 128KB per core</td>
<td>L1 Cache 32KB I / D per core, Tightly Coupled Memory 128KB per core</td>
<td>L1 Cache 32KB I / D per core, Tightly Coupled Memory 128KB per core</td>
</tr>
<tr>
<td><strong>External Memory</strong></td>
<td>Dynamic Memory Interface</td>
<td>x32/x64: DDR4, LPDDR4, DDR3, DDR3L, LPDDR3 with ECC</td>
<td>x32/x64: DDR4, LPDDR4, DDR3, DDR3L, LPDDR3 with ECC</td>
<td>x32/x64: DDR4, LPDDR4, DDR3, DDR3L, LPDDR3 with ECC</td>
<td>x32/x64: DDR4, LPDDR4, DDR3, DDR3L, LPDDR3 with ECC</td>
</tr>
<tr>
<td></td>
<td>Static Memory Interfaces</td>
<td>NAND, 2x Quad-SPI</td>
<td>NAND, 2x Quad-SPI</td>
<td>NAND, 2x Quad-SPI</td>
<td>NAND, 2x Quad-SPI</td>
</tr>
<tr>
<td><strong>Connectivity</strong></td>
<td>High-Speed Connectivity</td>
<td>PCIe® Gen2 x4, 2x USB3.0, SATA 3.1, DisplayPort, 4x Tri-mode Gigabit Ethernet</td>
<td>PCIe® Gen2 x4, 2x USB3.0, SATA 3.1, DisplayPort, 4x Tri-mode Gigabit Ethernet</td>
<td>PCIe® Gen2 x4, 2x USB3.0, SATA 3.1, DisplayPort, 4x Tri-mode Gigabit Ethernet</td>
<td>PCIe® Gen2 x4, 2x USB3.0, SATA 3.1, DisplayPort, 4x Tri-mode Gigabit Ethernet</td>
</tr>
<tr>
<td></td>
<td>General Connectivity</td>
<td>2x USB 2.0, 2x SD/SDIO, 2x UART, 2x CAN 2.0B, 2x I2C, 2x SPI, 4x 32b GPIO</td>
<td>2x USB 2.0, 2x SD/SDIO, 2x UART, 2x CAN 2.0B, 2x I2C, 2x SPI, 4x 32b GPIO</td>
<td>2x USB 2.0, 2x SD/SDIO, 2x UART, 2x CAN 2.0B, 2x I2C, 2x SPI, 4x 32b GPIO</td>
<td>2x USB 2.0, 2x SD/SDIO, 2x UART, 2x CAN 2.0B, 2x I2C, 2x SPI, 4x 32b GPIO</td>
</tr>
<tr>
<td><strong>Integrated Block Functionality</strong></td>
<td>Power Management</td>
<td>Full / Low / PL / Battery Power Domains</td>
<td>Full / Low / PL / Battery Power Domains</td>
<td>Full / Low / PL / Battery Power Domains</td>
<td>Full / Low / PL / Battery Power Domains</td>
</tr>
<tr>
<td></td>
<td>Security</td>
<td>RSA, AES, and SHA</td>
<td>RSA, AES, and SHA</td>
<td>RSA, AES, and SHA</td>
<td>RSA, AES, and SHA</td>
</tr>
<tr>
<td></td>
<td>AMS - System Monitor</td>
<td>10-bit, 1MS/s - Temperature, Voltage, and Current Monitor</td>
<td>10-bit, 1MS/s - Temperature, Voltage, and Current Monitor</td>
<td>10-bit, 1MS/s - Temperature, Voltage, and Current Monitor</td>
<td>10-bit, 1MS/s - Temperature, Voltage, and Current Monitor</td>
</tr>
<tr>
<td><strong>PS to PL Interface</strong></td>
<td>PS to PL Interface</td>
<td>12 x 32/64/128b AXI Ports</td>
<td>12 x 32/64/128b AXI Ports</td>
<td>12 x 32/64/128b AXI Ports</td>
<td>12 x 32/64/128b AXI Ports</td>
</tr>
<tr>
<td><strong>RF Data Converter Subsystem</strong></td>
<td>12-bit, 4GS/s ADC</td>
<td>0</td>
<td>8</td>
<td>8</td>
<td>8</td>
</tr>
<tr>
<td></td>
<td>12-bit, 2GS/s ADC</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>16</td>
</tr>
<tr>
<td></td>
<td>14-bit, 6.4GS/s DAC</td>
<td>0</td>
<td>8</td>
<td>8</td>
<td>16</td>
</tr>
<tr>
<td></td>
<td>24-bit, 1.2GS/s DAC</td>
<td>0</td>
<td>8</td>
<td>8</td>
<td>16</td>
</tr>
<tr>
<td><strong>Programmable Logic (PL)</strong></td>
<td>SD-FEC</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td><strong>Functionality</strong></td>
<td>System Logic Cells (K)</td>
<td>930</td>
<td>680</td>
<td>930</td>
<td>930</td>
</tr>
<tr>
<td></td>
<td>CLB LUTs (K)</td>
<td>425</td>
<td>311</td>
<td>425</td>
<td>425</td>
</tr>
<tr>
<td></td>
<td>Max. Distributed RAM (Mb)</td>
<td>13.0</td>
<td>9.6</td>
<td>13.0</td>
<td>13.0</td>
</tr>
<tr>
<td></td>
<td>Total Block RAM (Mb)</td>
<td>38.0</td>
<td>27.8</td>
<td>38.0</td>
<td>38.0</td>
</tr>
<tr>
<td></td>
<td>UltraRAM (Mb)</td>
<td>22.5</td>
<td>13.5</td>
<td>22.5</td>
<td>22.5</td>
</tr>
<tr>
<td><strong>Memory</strong></td>
<td>DSP Slices</td>
<td>4,272</td>
<td>3,168</td>
<td>4,272</td>
<td>4,272</td>
</tr>
<tr>
<td><strong>Integrated IP</strong></td>
<td>PCI Express® Gen 3x16 / Gen4x8</td>
<td>2</td>
<td>1</td>
<td>2</td>
<td>2</td>
</tr>
<tr>
<td></td>
<td>150G Interlaken</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
</tr>
<tr>
<td></td>
<td>100G Ethernet MAC/PCS w/RS-FEC</td>
<td>2</td>
<td>1</td>
<td>2</td>
<td>2</td>
</tr>
<tr>
<td></td>
<td>AMS - System Monitor</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
</tr>
<tr>
<td><strong>Speed Grades</strong></td>
<td>-1E, -1I, -1LI, -2LE, -2I</td>
<td>-1E, -1I, -1LI, -2LE, -2I, -3E</td>
<td>-1E, -1I, -1LI, -2LE, -2I, -3E</td>
<td>-1E, -1I, -1LI, -2LE, -2I, -3E</td>
<td>-1E, -1I, -1LI, -2LE, -2I, -3E</td>
</tr>
<tr>
<td><strong>Package Footprint</strong></td>
<td>Package Dimensions</td>
<td>PSIO, HDIO, HPIO GTR, GTY, ADC, DAC</td>
<td>PSIO, HDIO, HPIO GTR, GTY, ADC, DAC</td>
<td>PSIO, HDIO, HPIO GTR, GTY, ADC, DAC</td>
<td>PSIO, HDIO, HPIO GTR, GTY, ADC, DAC</td>
</tr>
<tr>
<td><strong>D1156</strong></td>
<td>35x35</td>
<td>214, 72, 208</td>
<td>4, 16, 0, 0</td>
<td>214, 48, 104</td>
<td>4, 8, 8, 8</td>
</tr>
<tr>
<td><strong>E1156</strong></td>
<td>35x35</td>
<td>214, 48, 104</td>
<td>4, 8, 8, 8</td>
<td>214, 48, 104</td>
<td>4, 8, 8, 8</td>
</tr>
<tr>
<td><strong>G1517</strong></td>
<td>40x40</td>
<td>214, 48, 299</td>
<td>4, 16, 8, 8</td>
<td>214, 48, 299</td>
<td>4, 16, 8, 8</td>
</tr>
<tr>
<td><strong>F1760</strong></td>
<td>42.5x42.5</td>
<td>214, 96, 312</td>
<td>4, 16, 16, 16</td>
<td>214, 96, 312</td>
<td>4, 16, 16, 16</td>
</tr>
</tbody>
</table>
Appendix

- Performance/Watt Benchmarks
- Integrated Data Converters in Zynq UltraScale+ RFSoCs
- Power Rails
- QEMU
- Programmable Logic Features
- Soft-Error Resilience (SEU)
Zynq UltraScale+ vs. Zynq-7000: 2-5X Perf/Watt

5X Performance/Watt
1080P Full HD ⇔ 4K2K UHD Video Conferencing
20% Less Power 4X Performance

4.8X Performance/Watt
Public-Safety Radio
47% Less Power 2.5X Performance

3.3X Performance/Watt
Automotive Multi-Camera, Driver Assist System
25% Less Power 2.5X Performance
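
The headline ratios above can be checked by dividing each performance gain by the fraction of power that remains. The sketch below assumes performance/watt is simply relative performance over relative power; the function name and the benchmark labels are illustrative and do not come from the slides.

```python
def perf_per_watt_gain(perf_gain: float, power_reduction_pct: float) -> float:
    """Relative performance/watt, given a speed-up and a power saving in percent."""
    relative_power = 1.0 - power_reduction_pct / 100.0
    return perf_gain / relative_power


# (performance gain, power saving in %) as quoted on the slide
benchmarks = {
    "Video conferencing": (4.0, 20.0),
    "Public-safety radio": (2.5, 47.0),
    "Automotive multi-camera": (2.5, 25.0),
}

for name, (perf, saving) in benchmarks.items():
    print(f"{name}: {perf_per_watt_gain(perf, saving):.1f}X")
```

With these rounded inputs the radio case works out to roughly 4.7X, slightly below the quoted 4.8X, which is probably a rounding effect in the slide's figures.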

[Figure: block diagrams for the three benchmarks, each comparing a Zynq-7000 based system with a single Zynq UltraScale+™ MPSoC: (1) Zynq-7000 SoC + H.264 ASSP, (2) Zynq-7000 SoC + 2X ASSP (GPU + Cortex-A9), (3) 2X Zynq-7000 with a soft GPU in programmable logic.]
## Integrated Direct-RF Data Converters in Zynq UltraScale+ RFSoC

<table>
<thead>
<tr>
<th>Feature</th>
<th>Benefit</th>
</tr>
</thead>
<tbody>
<tr>
<td>4GSPS or 2GSPS ADCs with 12-bit resolution; 6.4GSPS DACs with 14-bit resolution</td>
<td>• 4GHz of direct-RF bandwidth</td>
</tr>
<tr>
<td>RF-Sampling with Full DSP Subsystem</td>
<td>• RF-design in programmable digital domain, reducing external analog components • Full Digital Down-Conversion (DDC) and Up-Conversion (DUC) • Optionally bypass subsystem to programmable logic for custom mixing &amp; filtering</td>
</tr>
<tr>
<td>Based on 16nm FinFET+</td>
<td>• Optimal performance-per-watt and at least two process nodes ahead of latest generation discrete components</td>
</tr>
<tr>
<td>Dedicated Communication-Grade PLLs</td>
<td>• Leverage lower frequency external clock to drive high speed converters</td>
</tr>
<tr>
<td>Multi-Band Support</td>
<td>• Enable flexible carrier aggregation through a single RF signal chain</td>
</tr>
</tbody>
</table>
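
To make the digital down-conversion (DDC) step in the table concrete, here is a minimal software sketch: mix the sampled signal with a numerically controlled oscillator, then low-pass and decimate. It is only a model of the arithmetic, not the RFSoC's hardened DDC; the function name, the 1 GHz carrier and the boxcar filter are assumptions chosen for illustration.

```python
import cmath
import math


def digital_down_convert(samples, fs_hz, fc_hz, decimation):
    """Mix real samples to baseband with an NCO, then average and decimate."""
    mixed = [
        x * cmath.exp(-2j * math.pi * fc_hz * n / fs_hz)
        for n, x in enumerate(samples)
    ]
    # Boxcar average per output sample: a crude stand-in for a real FIR/CIC filter chain.
    return [
        sum(mixed[i:i + decimation]) / decimation
        for i in range(0, len(mixed) - decimation + 1, decimation)
    ]


fs = 4.0e9  # ADC sample rate from the table (4 GSPS)
fc = 1.0e9  # hypothetical carrier frequency
tone = [math.cos(2 * math.pi * fc * n / fs) for n in range(64)]
baseband = digital_down_convert(tone, fs, fc, decimation=8)
print([round(abs(z), 3) for z in baseband])
```

A unit-amplitude tone at the carrier frequency lands at DC with magnitude 0.5, because the mixer splits the real cosine into a baseband term and an image term that the averaging filter removes.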

### Diagram:
- **Programmable Logic**
- **4GSPS ADCs**
- **6.4GSPS DACs**
- **RF Sampling**
- **Multi-Band Support**
- **Full DSP Subsystem on 16nm**
- **Band 1**
- **Band 2**
- **Low Frequency External Clock**
- **Internal PLL**

Optionally bypass subsystem
Flexible Power Management Architecture

Up to 8 Core Rails

- Full Power Domain Rails
  - \( V_{\text{CCINT}} \)
  - Core (FP)
  - DDR
  - DDR PLL
  - Core (LP)
  - Auxiliary (PS)
  - ADC
  - Core (PL)
  - Block RAM
  - I/O
  - Auxiliary (PL)
  - Auxiliary I/O
  - ADC

- Low Power Domain Rails
  - \( V_{\text{AUX}} \)
  - Core (FP)
  - DDR PLL
  - Core (LP)
  - Auxiliary (PS)
  - ADC
  - Core (PL)
  - Block RAM
  - I/O
  - Auxiliary (PL)
  - Auxiliary I/O
  - ADC

Up to 12 I/O Rails

- Processing System
  - Application Processing Unit
  - Memory
  - Graphics Processing Unit
  - High-Speed Connectivity
  - Real-Time Processing Unit
  - Security
  - Platform Management Unit
  - System Control
  - General Connectivity

- Programmable Logic
  - General Purpose I/O
  - Transceiver I/O

- Analog Supply
- Auxiliary Analog
- For Termination
- For Calibration
- High-Density I/O
- High-Perform. I/O
- Analog Supply
- Auxiliary Analog
- For Termination
- For Calibration
Xilinx QEMU Zynq UltraScale+ Support

Processing System

Application Processing Unit

- ARM® Cortex™-A53
- GIC
- SCU
- SMMU
- CCI
- 1 MB L2

Real-Time Processing Unit

- ARM Cortex-R5
- Vector Floating Point Unit
- Memory Protection Unit
- GIC
- TTC
- 128 KB TCM
- 32 KB I-Cache

DDR Controller

- DDR4/3/3L, LPDDR4/3 ECC Support
- GIC
- Memory Protection Unit
- DMA
- Timers & Resets
- Clocking
- Debug
- 256 KB OCM
- 1 MB L2

Graphics Processing Unit

- ARM Mali™-400 MP

Connectivity

- DisplayPort
- USB 3.0
- SATA 3.0
- PCIe Gen2
- PS-GTR
- GigE
- CAN
- UART
- I2C
- GPIO
- SPI
- Quad SPI NOR
- NAND
- SD/eMMC

Programmable Logic
Time to Market Advantage with QEMU Through Parallelism

Traditional Flow: HW ⇝ SW

1. HW Development
2. Early SW App Dev
3. Wait for HW
4. App SW Dev (Post-HW)
5. Integration & Test

Virtual Platform Flow: Parallel Dev

1. HW Development
2. QEMU Model Dev
3. Early App SW Dev
4. Parallel SW Dev
5. Early Firmware Completion
6. Integration & Test

Accelerated Time-to-Market
UltraScale Re-Architects the Core

**Highest Utilization at Maximum Performance**

### Next Generation Routing
- Re-designed routing architecture
- 2X routing, agile switching
- Co-Optimized with Vivado

### ASIC-Like Clocking
- Regional, segmented structure
- Flexible clock placement
- Scales w/density to balance skew

### System Logic Cells
- Higher utilization enabled by routing
- Shorter net delays for performance
- Less wire switching for lower power

---

**Effect of routing resources & analytical placement**

**Logic Cells** $O(N^2)$

**Interconnect tracks** $O(N)$
Tuned Process for Optimal Performance/Watt

Optimal Operating Voltage Selection

3D Gate allows for more channel surface area, achieving

- Faster transistor on/off switching speeds for greater performance
- Lower leakage and operating voltage for lower power

<table>
<thead>
<tr>
<th></th>
<th>7 Series (28nm) $V_{NOM}$</th>
<th>UltraScale+ (16nm) $V_{NOM}$</th>
<th>UltraScale+ (16nm) $V_{LOW}$</th>
</tr>
</thead>
<tbody>
<tr>
<td>Voltage</td>
<td>1V</td>
<td>0.85V</td>
<td>0.72V</td>
</tr>
<tr>
<td>Performance</td>
<td>1.0x</td>
<td>1.6x</td>
<td>1.2x</td>
</tr>
<tr>
<td>Power</td>
<td>1.0x</td>
<td>0.8x</td>
<td>0.5x</td>
</tr>
<tr>
<td>Performance/Watt</td>
<td>1.0x</td>
<td>2x</td>
<td>2.4x</td>
</tr>
</tbody>
</table>
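
The performance/watt figures can be reproduced from the other numbers on this slide. This reads 1.6x and 1.2x as relative performance and 0.8x and 0.5x as relative power, both normalized to 7 series at 1V, which is an interpretation of the chart rather than something the slide states outright:

$$
\frac{\text{Performance}}{\text{Watt}} = \frac{\text{relative performance}}{\text{relative power}}:
\qquad
\frac{1.6}{0.8} = 2.0 \;\; (V_{NOM},\ 0.85\text{V}),
\qquad
\frac{1.2}{0.5} = 2.4 \;\; (V_{LOW},\ 0.72\text{V})
$$

Under that reading, $V_{LOW}$ gives up some peak performance in return for the best efficiency.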
### Transceiver Portfolio in Zynq UltraScale+

<table>
<thead>
<tr>
<th>Feature</th>
<th>Benefit</th>
</tr>
</thead>
<tbody>
<tr>
<td>6G (GTR) Transceivers</td>
<td>• Integrated in Processing System for direct access to key processing elements • Full PHY/IP compliance for key protocols: USB, SATA, DisplayPort, PCIe, Ethernet</td>
</tr>
<tr>
<td>16G (GTH) Transceivers</td>
<td>• 16G backplane support, industry-leading auto-adaptive equalization • Enables PCIe Gen4 (16G), JESD204B (12.5G), CPRI (16.3G), Serial Memory (HMC &amp; MoSys) • Fractional PLL for multiple non-integer line rates and fabric clocks (eliminates clock components)</td>
</tr>
<tr>
<td>33G (GTY) Transceivers</td>
<td>• 28Gb/s (CEI-25G-LR) backplane support for Nx100G to 400G systems • Support for Interlaken, OTU4 over CFP4, 802.3bj (28G Ethernet backplane) • Equivalent fractional PLL functionality as GTH transceivers</td>
</tr>
</tbody>
</table>

![Diagram of Zynq UltraScale+ processing and programmable logic](image_url)
# Integrated 100G Networking Cores

<table>
<thead>
<tr>
<th>Feature</th>
<th>Benefit</th>
</tr>
</thead>
<tbody>
<tr>
<td>100G Ethernet &amp; 150G Interlaken Cores</td>
<td>• More headroom for power budget • Lower latency and higher performance • Frees up logic for additional functionality • Simplified flow and easier routing for shorter run-times • No licensing requirements</td>
</tr>
<tr>
<td>Multiple configuration options</td>
<td>Flexible configurations [Lanes × Line Rate] to meet existing and future design requirements</td>
</tr>
</tbody>
</table>

## 100G Ethernet MAC/PCS

- IEEE 1588
- RX MAC
- TX and RX PCS
- Pause Processing
- TX MAC
- Status/Control/Config

## 150G Interlaken Rev 1.2

- IEEE 1588
- RX MAC
- TX and RX PCS
- Pause Processing
- TX MAC
- Status/Control/Config

## Configuration Options

<table>
<thead>
<tr>
<th>Hard IP</th>
<th>Lanes x Line Rate</th>
</tr>
</thead>
<tbody>
<tr>
<td>Interlaken</td>
<td>Up to 6 x 25G</td>
</tr>
<tr>
<td>Ethernet MAC</td>
<td>4 x 25G</td>
</tr>
</tbody>
</table>
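
The [Lanes × Line Rate] notation is just a product. The short sketch below uses the nominal 25G figure from the table to show how the two configurations reach the 150G and 100G headline rates; real line rates carry encoding overhead and are slightly higher, and the dictionary and function names are illustrative.

```python
# (lanes, nominal line rate in Gb/s) taken from the Configuration Options table
HARD_IP_CONFIGS = {
    "Interlaken": (6, 25),
    "Ethernet MAC": (4, 25),
}


def aggregate_gbps(lanes: int, line_rate_gbps: float) -> float:
    """Nominal aggregate throughput of a multi-lane serial link."""
    return lanes * line_rate_gbps


for ip_name, (lanes, rate) in HARD_IP_CONFIGS.items():
    print(f"{ip_name}: {lanes} x {rate}G = {aggregate_gbps(lanes, rate)}G")
```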
## Enabling Massive External Memory Bandwidth

<table>
<thead>
<tr>
<th>Feature</th>
<th>Benefit</th>
</tr>
</thead>
<tbody>
<tr>
<td>DDR4 support (up to 2,666 Mb/s)</td>
<td>40% higher data rates than DDR3</td>
</tr>
<tr>
<td>Can share I/O bank between two controllers</td>
<td>More efficient use of I/O banks for higher aggregate bandwidth</td>
</tr>
<tr>
<td>High performance integrated PHY architecture</td>
<td>Low latency and higher data rate at lower power</td>
</tr>
<tr>
<td>TX Pre-emphasis and RX linear equalization (CTLE)</td>
<td>• Optimizes channel margin for highest bandwidth</td>
</tr>
<tr>
<td></td>
<td>• Eases board design</td>
</tr>
</tbody>
</table>
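
To put the data rates in context, the sketch below converts a per-pin rate into peak interface bandwidth and computes the uplift of DDR4-2666 over two DDR3 speeds. The 64-bit bus width is an assumption for illustration, and the slide does not say which DDR3 rate its 40% figure is measured against.

```python
def peak_bandwidth_gbytes(data_rate_mbps_per_pin: float, bus_width_bits: int) -> float:
    """Theoretical peak bandwidth in GB/s, ignoring refresh and protocol overhead."""
    return data_rate_mbps_per_pin * bus_width_bits / 8 / 1000


def uplift_pct(new_rate: float, old_rate: float) -> float:
    """Percentage increase of new_rate over old_rate."""
    return (new_rate / old_rate - 1.0) * 100.0


BUS_WIDTH_BITS = 64  # assumed interface width

print(f"DDR4-2666: {peak_bandwidth_gbytes(2666, BUS_WIDTH_BITS):.1f} GB/s peak")
print(f"vs DDR3-1866: +{uplift_pct(2666, 1866):.0f}%")
print(f"vs DDR3-2133: +{uplift_pct(2666, 2133):.0f}%")
```

The quoted 40% sits close to the DDR3-1866 comparison (about 43%), while the gain over DDR3-2133 is about 25%.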

### Feature Overview

- **Zynq-7000**: N/A, DDR3-1866 Mb/s, QDRII+, -
- **Zynq UltraScale+**: DDR4-2666Mb/s, DDR3-2133 Mb/s, QDR IV, LPDDR3

### Logic fabric

- Memory Controller 1
- Memory Controller 2

### Integrated PHY

- PHY data rate 1
- PHY data rate 2

### I/Os

- I/O Bank A
- I/O Bank B
- I/O Bank C

### Memory

- Adaptable, Scalable AXI4 Configuration
- up to 2 controllers per I/O bank (flexibility)
- low latency
- pre-emphasis (TX) & linear equalization (RX) for signal integrity

## Parallel & Serial Memory Interface Enhancements

<table>
<thead>
<tr>
<th>Feature</th>
<th>Benefits &amp; Details</th>
</tr>
</thead>
<tbody>
<tr>
<td>Building on Robustness at 20nm</td>
<td>• Production-proven at 20nm, delivering operating margin across varying PVT • New embedded calibration routine for greater intelligence and clock-centering at the eye</td>
</tr>
<tr>
<td>DDR4-2666 in a Mid-Speed Grade Device</td>
<td>Cost-effectively supports next generation DDR4 memory line rates</td>
</tr>
<tr>
<td>DDR4-2400 at Low Operating Voltage ($V_{\text{LOW}}$)</td>
<td>Up to 30% fabric power reduction (vs. 20nm) while leveraging mainstream DDR rates</td>
</tr>
<tr>
<td>High Density DIMM Support (20nm &amp; 16nm)</td>
<td>• Enables support for dense, server-class DIMMs (8x capacity vs. 28nm) • Support for DIMMs based on 4-bit wide DRAM components • Independent calibration points for quad-rank support</td>
</tr>
<tr>
<td>Next Gen Transceivers for Serial Memories</td>
<td>Lower power, higher line rates and higher transceiver count in mid-range devices</td>
</tr>
</tbody>
</table>

**Serial Interfaces**
- 7 series Generation
- UltraScale Generation
- 15G & 30G HMC, MoSys
- Broad Support for Emerging Technologies

**DDR Interfaces**
- 7 series Generation
- UltraScale Generation
- 64GB  
  - 2 x Dual-Rank (16 x 8Gb)
- 8GB  
  - 2 x (8 x 4Gb)  
  - Dual-Rank
- 4GB  
  - (8 x 4Gb)
UltraRAM: New Memory Technology

Distributed RAM (bits to kilobits)

- Wide, shallow FIFOs
- Shift registers
- State machines

Block RAM (megabits)

- Data/coefficient storage
- Deep FIFOs
- Shallow buffering

UltraRAM 10s of megabits with deterministic

- Deep packet buffering
- Video buffering
- State, statistics, counters

External Memory (100s of megabits to gigabits)

Up to 27Mb (Zynq) to replace external memory for cost, power, performance
UltraRAM for Optimal Power, Flexibility, and Predictable Performance

<table>
<thead>
<tr>
<th>Feature</th>
<th>Benefit</th>
</tr>
</thead>
<tbody>
<tr>
<td>Massive Capacity</td>
<td>• From 288Kb (single block) up to 27Mb configurable (Zynq UltraScale+)</td>
</tr>
<tr>
<td></td>
<td>• Capable of replacing certain types of memory, e.g., SRAM, RLDRAM, TCAM</td>
</tr>
<tr>
<td>Predictable Performance</td>
<td>No use of fabric for 100% predictable latency when cascaded within a full column</td>
</tr>
<tr>
<td>Power Efficient</td>
<td>• Optimized for area and power efficiency</td>
</tr>
<tr>
<td></td>
<td>• Eliminates use of logic fabric while reducing routing congestion and power</td>
</tr>
<tr>
<td></td>
<td>• Granular power gating and management</td>
</tr>
<tr>
<td>Flexible</td>
<td>• Multiple forms of cascading (inputs, outputs, address bits) for diverse use models</td>
</tr>
<tr>
<td></td>
<td>• Optional input cascade/pipeline stages</td>
</tr>
</tbody>
</table>
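
The capacity figures in the table relate directly to block count. The sketch below assumes 1 Mb = 1024 Kb and uses illustrative helper names to convert between the two; under that assumption the 27Mb maximum corresponds to 96 blocks of 288Kb.

```python
URAM_BLOCK_KB = 288  # size of a single UltraRAM block, from the table


def uram_blocks(capacity_mb: float) -> int:
    """Number of 288Kb blocks needed for a capacity in Mb (1 Mb = 1024 Kb assumed)."""
    return round(capacity_mb * 1024 / URAM_BLOCK_KB)


def uram_capacity_mb(blocks: int) -> float:
    """Total capacity in Mb of a given number of UltraRAM blocks."""
    return blocks * URAM_BLOCK_KB / 1024


print(uram_blocks(27.0))      # blocks behind the 27Mb figure
print(uram_capacity_mb(96))   # capacity of 96 cascaded blocks
```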

Fully Cascadable Column
Replace Memory Components (e.g., SRAM, RLDRAM, TCAM)
# UltraRAM: Complementing Block RAM

## UltraRAM vs. Block RAM Comparison (Sub-Set)

**Different Capabilities for Different Use Models**

<table>
<thead>
<tr>
<th>Features</th>
<th>7 series Block RAM</th>
<th>UltraRAM</th>
</tr>
</thead>
<tbody>
<tr>
<td>Density per block</td>
<td>36K/18K</td>
<td>288K</td>
</tr>
<tr>
<td>Configurable Port Width</td>
<td>✓</td>
<td>-</td>
</tr>
<tr>
<td>Asynchronous Clocking</td>
<td>✓</td>
<td>-</td>
</tr>
<tr>
<td>Built-in FIFO</td>
<td>✓</td>
<td>-</td>
</tr>
<tr>
<td>ECC</td>
<td>✓</td>
<td>✓</td>
</tr>
<tr>
<td>Unused site gating</td>
<td>✓</td>
<td>✓</td>
</tr>
<tr>
<td>Sleep mode</td>
<td>✓</td>
<td>✓</td>
</tr>
<tr>
<td>Deep-sleep mode (3-clk cycle wake-up time)</td>
<td>-</td>
<td>✓</td>
</tr>
<tr>
<td>Hardened data output cascading</td>
<td>-</td>
<td>Entire Column</td>
</tr>
<tr>
<td>Hardened data input &amp; address cascade</td>
<td>-</td>
<td>✓</td>
</tr>
<tr>
<td>Optional input cascade/pipeline stages</td>
<td>-</td>
<td>✓</td>
</tr>
<tr>
<td>Hardened address decoder</td>
<td>-</td>
<td>✓</td>
</tr>
</tbody>
</table>
UltraScale+ System Monitor (SysMon) for Safety, Reliability and Security

<table>
<thead>
<tr>
<th>Feature</th>
<th>Benefits</th>
</tr>
</thead>
<tbody>
<tr>
<td>Voltage, Current, &amp; Temperature Tracking</td>
<td>• Tracks voltages (PS, PL, external), currents, junction temp (PS/PL) for safe/secure/reliable operation</td>
</tr>
<tr>
<td></td>
<td>• Helps meet requirements for key industry standards (e.g., FIPS 140-2, IEC 61508, &amp; ISO26262)</td>
</tr>
<tr>
<td>Power Management &amp; Sensors</td>
<td>• Reduce power consumption through closed loop power management</td>
</tr>
<tr>
<td></td>
<td>• Leverages accurate power supply sensors (±1% max)</td>
</tr>
<tr>
<td>Plug-and-Play System Management</td>
<td>• Connects directly to existing system/power management infrastructure via I2C, PMBus or JTAG</td>
</tr>
<tr>
<td></td>
<td>• Flexible choice of any I/O bank for external analog inputs</td>
</tr>
<tr>
<td>Board Debug</td>
<td>Easy design instrumentation with wizards and IP catalog</td>
</tr>
</tbody>
</table>
## New Integrated PCIe Gen3x16 and Gen4x8 Block

<table>
<thead>
<tr>
<th>Feature</th>
<th>Benefit</th>
</tr>
</thead>
<tbody>
<tr>
<td>Gen3 x16 (8 Gb/s per lane)</td>
<td>Performance for today's high-end systems, e.g., 100G data center</td>
</tr>
<tr>
<td>Gen4 x8 (16 Gb/s per lane)</td>
<td>Enables next generation system topologies</td>
</tr>
<tr>
<td>Hardened SR-IOV (4 Physical, 252 Virtual Functions)</td>
<td>Expanded virtualization for demanding data center applications</td>
</tr>
<tr>
<td>Increased Number of Tags</td>
<td>• 128 managed tags and 256 user managed tags • Enables more outstanding read requests for greater system performance</td>
</tr>
<tr>
<td>New DMA IP</td>
<td>Complete end-to-end solution</td>
</tr>
</tbody>
</table>
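
Gen3 x16 and Gen4 x8 are offered as alternatives because they provide the same raw link bandwidth. The sketch below works this out per direction, applying the 128b/130b line coding that both generations use; packet and protocol overhead is ignored, and the function name is illustrative.

```python
ENCODING_EFFICIENCY = 128 / 130  # 128b/130b line coding used by PCIe Gen3 and Gen4


def link_bandwidth_gbps(rate_gtps_per_lane: float, lanes: int) -> float:
    """Per-direction link bandwidth in Gb/s after line coding, before packet overhead."""
    return rate_gtps_per_lane * lanes * ENCODING_EFFICIENCY


gen3_x16 = link_bandwidth_gbps(8, 16)
gen4_x8 = link_bandwidth_gbps(16, 8)
print(f"Gen3 x16: {gen3_x16:.1f} Gb/s ({gen3_x16 / 8:.2f} GB/s)")
print(f"Gen4 x8:  {gen4_x8:.1f} Gb/s ({gen4_x8 / 8:.2f} GB/s)")
```

Either configuration leaves headroom above 100 Gb/s, which fits the 100G data center use case named in the table.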

---

### UltraScale+ for Multi-100G Ports

- **FPGA**
  - Gen3 x16 or Gen4 x8
  - Gen3 x16 or Gen4 x8

- **PCIe (End Point)**
  - Gen3 x16 (8 Gb/s per lane)
  - Gen4 x8 (16 Gb/s per lane)

- **AXI Streaming**
  - Multi-Channel DMA

- **100G Application**

- **SR-IOV Interface**
  - AXI Streaming
  - AXI Memory Mapped

- **DDR3/4 Memory Controller**

### Gen2 PCIe (Zynq-7000) vs. UltraScale+ Comparison

<table>
<thead>
<tr>
<th>Features</th>
<th>Gen2 PCIe (Zynq-7000)</th>
<th>UltraScale+</th>
</tr>
</thead>
<tbody>
<tr>
<td>End-to-end CRC</td>
<td>✔</td>
<td>✔</td>
</tr>
<tr>
<td>Advanced Error Reporting</td>
<td>✔</td>
<td>✔</td>
</tr>
<tr>
<td>Tag Management</td>
<td>✔</td>
<td>✔</td>
</tr>
<tr>
<td>4 Physical Functions</td>
<td>✔</td>
<td>✔</td>
</tr>
<tr>
<td>252 Virtual Functions</td>
<td>✔</td>
<td>✔</td>
</tr>
<tr>
<td>Hardened SR-IOV</td>
<td>✔</td>
<td>✔</td>
</tr>
<tr>
<td>Less than 100ms config</td>
<td>✔</td>
<td>✔</td>
</tr>
</tbody>
</table>
## High Density (HD) I/O for Lower Power

<table>
<thead>
<tr>
<th>Feature</th>
<th>Benefit</th>
</tr>
</thead>
<tbody>
<tr>
<td>Power Optimized</td>
<td>Power &amp; area-efficient I/Os compared to High-Range (HR) &amp; High-Performance (HP) I/Os</td>
</tr>
<tr>
<td>Targeted Protocol Support</td>
<td>• 3.3 &amp; 2.5V legacy protocols, 1.2V – 1.8V single-ended support</td>
</tr>
<tr>
<td></td>
<td>• Cost effectively handles legacy interfaces without separate glue logic device</td>
</tr>
<tr>
<td></td>
<td>• Complements High Performance I/Os with little functional overlap</td>
</tr>
</tbody>
</table>

### Optimized for Target Functions

- DDR3
- MIPI D-PHY
- NAND
- DDR4
- 1000Base-X
- LVDS
- RLDRAM3
- DCI
- LVCMOS 1.2/1.5/1.8V
- RGMII
- LVCMOS 2.5/3.3V
- I2C, SPI
- MDIO
- LVDS (input clk only)

### Power Efficient High Density I/Os

- High Performance I/O Bank
- High Density I/O Bank (Less Area)
- Clock Management Tile
- Processing System
Massive DSP Bandwidth for Diverse Applications

<table>
<thead>
<tr>
<th>Feature (20nm &amp; 16nm)</th>
<th>Benefit</th>
</tr>
</thead>
<tbody>
<tr>
<td>27x18 multiplier in a DSP slice; 35x28 support in a DSP tile (2 slices)</td>
<td>• Optimal performance per block • Implement double-precision floating point in 30% less fabric (vs. 7 series)</td>
</tr>
<tr>
<td>Pre-adder squaring</td>
<td>• Perform "sum-of-square-difference" calculations in 50% fewer resources • More efficient motion estimation in video applications</td>
</tr>
<tr>
<td>Extra accumulator feedback path</td>
<td>Implement complex multiply-accumulate in half the resources</td>
</tr>
<tr>
<td>Wide XOR</td>
<td>Implement EFEC, CRC, ECC functionality</td>
</tr>
<tr>
<td>White box modeling</td>
<td>Full visibility with accurate simulation and debug</td>
</tr>
</tbody>
</table>
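
For reference, the "sum-of-square-difference" calculation mentioned above is the block-matching cost commonly used in motion estimation. The sketch below is a plain software model of the arithmetic; the comments mapping each step to a DSP slice stage are an illustrative assumption, not a description of the actual hardware datapath.

```python
def sum_of_squared_differences(block_a, block_b):
    """Accumulate (a - b)^2 over two equally sized pixel blocks."""
    if len(block_a) != len(block_b):
        raise ValueError("blocks must have the same length")
    acc = 0
    for a, b in zip(block_a, block_b):
        diff = a - b        # subtraction: the pre-adder's role
        acc += diff * diff  # squaring in the multiplier, then accumulation
    return acc


reference = [10, 12, 9, 7]
candidate = [8, 12, 11, 4]
print(sum_of_squared_differences(reference, candidate))  # → 17
```

A motion-estimation search evaluates this cost for many candidate blocks and keeps the one with the smallest result.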
# Security and Reliability

<table>
<thead>
<tr>
<th>Feature</th>
<th>Benefit</th>
</tr>
</thead>
<tbody>
<tr>
<td>AES-GCM decryption</td>
<td>NIST approved, faster configuration</td>
</tr>
<tr>
<td>DPA Counter Measures</td>
<td>Prevents the use of power or EM monitoring to extract keys</td>
</tr>
<tr>
<td>RSA-2048 authentication</td>
<td>Ensures design came from right source with no modification</td>
</tr>
<tr>
<td>Permanent Tamper Penalty</td>
<td>Prevents adversary from using security features of the device</td>
</tr>
<tr>
<td>Enhanced SEU Performance at 16nm</td>
<td>Increased reliability, availability, and elimination of silent data corruption</td>
</tr>
</tbody>
</table>
Up to 30X Greater Soft Error Resilience with Process, Architecture, Tools, Methodology, & IP

<table>
<thead>
<tr>
<th>Feature</th>
<th>Benefit</th>
</tr>
</thead>
<tbody>
<tr>
<td>Multi-Generational Leadership</td>
<td>10+ years of SEU leadership and the only vendor with published SEU data</td>
</tr>
<tr>
<td>Silicon &amp; Manufacturing Process</td>
<td>Optimized process with strict material control for order of magnitude greater resilience</td>
</tr>
<tr>
<td>Patented Architectural Innovations</td>
<td>40+ patents (e.g., design/layout techniques, resilient implementation of control structures)</td>
</tr>
<tr>
<td>Error Detection &amp; Correction RAM</td>
<td>100% detection &amp; over 99.9% correction of Configuration RAM, Block RAM, and UltraRAM</td>
</tr>
<tr>
<td>Tools &amp; Methodology</td>
<td>Data-driven methodology with SEU Estimator for pre-design assessment and modeling</td>
</tr>
<tr>
<td>Soft-Error Mitigation (SEM) IP</td>
<td>IP for system-level testing and status reports for serviceability</td>
</tr>
</tbody>
</table>

![Diagram showing 30X Reduced FIT Rate and UltraScale Architecture vs. Previous Generation](image)