





### Flexible High Performance Architectures Based on MicroTCA.4 and RapidIO

Edward Young / Vollrath Dirksen

#### June 2015

edward.young@commagility.com

vollrath@nateurope.com





#### What do we need for HPC?

- Heterogeneous Computing
  - ARM, DSP, FPGA, GPU, etc.
- Suitable Interconnect
  - low latency, efficient, endpoints, speed
  - Plus Ethernet (of course)
- Performance Density and reasonable cost
- System Management
- Suitable software architectures
- What about compatibility with other systems?





#### MicroTCA.4

- Used for front-end data capture in physics experiments
- Already contains the elements needed to build HPC systems
- Add Serial RapidIO
  - Currently Gen2 with all hardware available now
  - Standard option for MCH, dual star network to cards
  - Standardized software protocols for communication
- A range of processing options, e.g.
  - ARM+DSP with FPGA
  - High end FPGA
- Summary



© 2015 CommAgility & N.A.T. GmbH. All trademarks and logos are property of their respective holders.









#### MicroTCA Architectural features


- simple backplane architecture
  - reduces costs and risks, is reusable in future
- all signals at same signal level (MLVDS)
  - no electrical clash
- switched connections
  - non-blocking transfers
  - type of backplane connection depends on kind of switch
- all slots managed and controlled
  - detection of incompatibilities and faults
  - health management and fault isolation
  - hot-swap and hot-plug

#### MicroTCA Architecture infrastructure component: power modules



- NAT-PM-DC420: input DC -48 V, payload 420 W
- NAT-PM-DC840: input DC -48 V, payload 840 W
- NAT-PM-AC600: input AC 110-265 V, payload 600 W
- NAT-PM-AC600D: input AC 110-265 V, payload 600 W (double width)
- NAT-PM-AC1000: input AC 110-265 V, payload 1000 W (double width)
- NAT-RPM-PSC: input AC 110-265 V, payload 600 W (double width)

- Features:
  - monitoring of all 16 power channels
  - load sharing
  - n+1 redundancy
  - load bar
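The load sharing and n+1 redundancy features can be illustrated with a small sizing sketch. The payload ratings are the ones listed for the modules above; the 1000 W chassis load and the helper name `modules_needed` are hypothetical and chosen only for illustration. The sketch assumes ideal, equal load sharing and ignores any chassis limit on the number of power module slots.

```python
import math

# Payload ratings in watts, as listed for each power module.
PAYLOAD_W = {
    "NAT-PM-DC420": 420,
    "NAT-PM-DC840": 840,
    "NAT-PM-AC600": 600,
    "NAT-PM-AC600D": 600,
    "NAT-PM-AC1000": 1000,
    "NAT-RPM-PSC": 600,
}


def modules_needed(load_w: float, payload_w: float, spares: int = 1) -> int:
    """Return the module count for n+spares redundancy.

    n is the smallest number of load-sharing modules whose combined
    payload rating covers load_w; `spares` extra modules are added on top.
    """
    if load_w <= 0 or payload_w <= 0 or spares < 0:
        raise ValueError("load and payload must be positive, spares non-negative")
    n = math.ceil(load_w / payload_w)
    return n + spares


if __name__ == "__main__":
    load_w = 1000.0  # hypothetical chassis payload load, not from the slides
    for name, rating in PAYLOAD_W.items():
        print(f"{name}: {modules_needed(load_w, rating)} modules for n+1")
```

With `spares=1` the result is the n+1 count: the chassis keeps its full payload budget if any single module fails.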



Pictured: DC420, DC840, AC600, AC600D, AC1000, RearPM (2015) for LLRF

#### JTAG Switch Module: NAT-JSM

- MTCA.x compliant JTAG Switch Module
- Any AMC, CU, PM
- **Programming** via:
  - Xilinx connector
  - USB
  - MCH
- Master auto detection










#### Add Serial RapidIO


#### NAT-MCH: Single & Double Base, CLK, Fatpipe (PCIe, XAUI, SRIO), Custom



#### NAT-MCH SRIO-Submodule: Block-Diagram





#### MTCA.4 with 12 AMCs and 2 MCHs: redundant SRIO connections
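The dual star arrangement named in this slide title can be sketched as a small connectivity model: every AMC has one SRIO link to each of the two MCHs, so any pair of cards stays connected while at least one MCH is working. The names `dual_star` and `reachable` and the `AMC1`/`MCH1` labels are hypothetical; the sketch models the topology only, not the SRIO protocol or its routing tables.

```python
def dual_star(num_amcs: int = 12, mchs=("MCH1", "MCH2")) -> set:
    """Return the link set: one link from every AMC to every MCH."""
    return {(f"AMC{i}", mch) for i in range(1, num_amcs + 1) for mch in mchs}


def reachable(links: set, src: str, dst: str, failed=frozenset()) -> bool:
    """True if src and dst share at least one MCH that has not failed."""
    hubs_src = {mch for (amc, mch) in links if amc == src and mch not in failed}
    hubs_dst = {mch for (amc, mch) in links if amc == dst and mch not in failed}
    return bool(hubs_src & hubs_dst)


if __name__ == "__main__":
    links = dual_star()
    print("links:", len(links))
    print("AMC1 -> AMC12, MCH1 failed:",
          reachable(links, "AMC1", "AMC12", failed={"MCH1"}))
    print("AMC1 -> AMC12, both failed:",
          reachable(links, "AMC1", "AMC12", failed={"MCH1", "MCH2"}))
```

Losing one MCH halves the available paths but leaves every card reachable, which is the point of the redundant connections.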




#### ARM+DSP+FPGA: AMC-D24A4





- ARM: 4 x A15 cores @ 1.4GHz
- DSP: 24 x C66x cores @ 1.2/1.25 GHz, plus built in accelerators (e.g. FFT)
- FPGA: Kintex-7 K325T with local PCIe to main SoC for acceleration
- Serial RapidIO: 2 x4 ports at 20 Gbps each to backplane
- Other I/O: optical, GbE, timing, and 10GbE possible
- Mid-size AMC; Mezzanines possible if full-size
- Available now
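The Serial RapidIO figure in the list above can be checked with a short calculation. This is a sketch under stated assumptions: the Serial RapidIO line is read as two x4 ports, each lane is assumed to run at 5 Gbaud (a Gen2 lane rate that gives the quoted 20 Gbps per port), and 8b/10b line encoding is assumed. The function name `srio_port_gbps` is hypothetical.

```python
LANES_PER_PORT = 4     # x4 port
LANE_GBAUD = 5.0       # assumption: Gen2 lane rate matching 20 Gbps per port
PORTS_PER_CARD = 2     # two ports to the backplane


def srio_port_gbps(lanes: int, lane_gbaud: float, payload_only: bool = False) -> float:
    """Bandwidth of one Serial RapidIO port in Gbps.

    By default returns the raw line rate (lanes x baud rate).
    With payload_only=True the 8b/10b overhead is removed
    (8 data bits carried per 10 line bits).
    """
    raw = lanes * lane_gbaud
    return raw * 8 / 10 if payload_only else raw


if __name__ == "__main__":
    raw = srio_port_gbps(LANES_PER_PORT, LANE_GBAUD)
    data = srio_port_gbps(LANES_PER_PORT, LANE_GBAUD, payload_only=True)
    print(f"per port: {raw} Gbps raw, {data} Gbps after 8b/10b")
    print(f"per card: {PORTS_PER_CARD * raw} Gbps raw")
```

Under these assumptions the raw per-card total is the 40 Gbps quoted in the summary; the usable data rate is lower once encoding and packet overhead are taken into account.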





- OpenCL and OpenMP support available
- Open SRIO drivers

#### Linux Platform Support








- Provides tools to diagnose and reprogram FPGA
- Examples provided to exercise card interfaces










- The high-end FPGA card is a roadmap product for CommAgility, currently in discussion with a lead customer for 2016
- An extremely high-performance and flexible card for HPC
- FPGA use in HPC and data centres is going mainstream, but this system can offer much better interconnect between FPGAs
- FPGA tools and C-based synthesis are improving all the time

#### Agility FPGA Card Block Diagram



- Two high-end FPGAs (Ultrascale / Stratix 10): approx. 10M logic cells per board, plus DSP, logic, memory etc.




#### Summary





- Let's consider a full MTCA.4 chassis
  - 19" rack mount, 7U high, 12 processing cards
  - Fully managed, reliable, front to rear cooling
- Filled with DSP/ARM cards
  - ~5.6 TFLOPS and 11.2 TMACS from DSP cores
  - ~210 Dhrystone MIPS from ARM cores
  - ~4M FPGA Logic cells plus 10K DSP slices
- Filled with FPGA cards
  - ~120M FPGA Logic cells plus 50K DSP slices
- All with 40Gbps of efficient, low latency Serial RapidIO connection to each card
- Any combination of the above, plus third-party cards
- Local HPC could be in same chassis as data acquisition
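The chassis totals above follow from multiplying per-card figures by 12 cards. The sketch below reproduces that arithmetic. The card count, DSP core count, clock options and the 10M logic cells per FPGA board come from the slides; the per-cycle throughput of a C66x core (16 single-precision FLOPs and 32 fixed-point MACs) and the Kintex-7 K325T resource counts are assumptions taken from vendor data, not from this presentation. The function names are hypothetical.

```python
CARDS_PER_CHASSIS = 12        # slide: 12 processing cards
DSP_CORES_PER_CARD = 24       # slide: 24 x C66x cores per AMC-D24A4
FPGA_CARD_LOGIC_CELLS = 10_000_000  # slide: approx. 10M logic cells per board

# Assumptions, not stated in the slides:
FLOPS_PER_CYCLE = 16          # single-precision FLOPs per C66x core per cycle
MACS_PER_CYCLE = 32           # 16-bit fixed-point MACs per C66x core per cycle
K325T_LOGIC_CELLS = 326_080   # Kintex-7 K325T logic cells
K325T_DSP_SLICES = 840        # Kintex-7 K325T DSP slices


def dsp_chassis_totals(clock_ghz: float) -> dict:
    """Aggregate figures for a chassis filled with DSP/ARM cards."""
    cores = CARDS_PER_CHASSIS * DSP_CORES_PER_CARD
    return {
        "dsp_cores": cores,
        "tflops": cores * clock_ghz * FLOPS_PER_CYCLE / 1000,
        "tmacs": cores * clock_ghz * MACS_PER_CYCLE / 1000,
        "kintex_logic_cells": CARDS_PER_CHASSIS * K325T_LOGIC_CELLS,
        "kintex_dsp_slices": CARDS_PER_CHASSIS * K325T_DSP_SLICES,
    }


def fpga_chassis_logic_cells() -> int:
    """Logic cells for a chassis filled with the high-end FPGA cards."""
    return CARDS_PER_CHASSIS * FPGA_CARD_LOGIC_CELLS


if __name__ == "__main__":
    for clock in (1.2, 1.25):  # the two DSP clock options on the slide
        t = dsp_chassis_totals(clock)
        print(f"{clock} GHz: {t['tflops']:.2f} TFLOPS, {t['tmacs']:.2f} TMACS")
    print("Kintex logic cells:", dsp_chassis_totals(1.2)["kintex_logic_cells"])
    print("FPGA-card logic cells:", fpga_chassis_logic_cells())
```

Under these assumptions the two clock options give results slightly below and slightly above the rounded ~5.6 TFLOPS and 11.2 TMACS on the slide, and the FPGA totals come out close to the quoted ~4M and ~120M logic cells.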

#### Summary: What do we need for HPC?



- Heterogeneous Computing
  - We've shown ARM, DSP, FPGA. The standards-based approach allows others
- Suitable Interconnect
  - Serial RapidIO
- Performance Density and reasonable cost
  - An efficiently packaged smaller system
- System Management
  - Good management and reliability is inherent in MTCA
- Suitable software architectures
  - Standardised interconnect protocols
  - Supporting OpenCL etc
- What about compatibility with other systems?
  - Links in with MTCA.4 data acquisition systems



#### ONE TECHNOLOGY MULTIPLE SOLUTIONS









#### **Questions?**

#### **Edward Young**

edward.young@commagility.com

CommAgility, Charnwood Building, Holywell Park, Ashby Road, Leicestershire, Loughborough, www.commagility.com

#### **Vollrath Dirksen**

vollrath@nateurope.com

N.A.T. GmbH Konrad-Zuse-Platz 9 53227 Bonn, Germany **www.nateurope.com** 


