### Common SoC design platform and enablers

TWEPP 2021 Microelectronics user group meeting 05.10.2021

Risto Pejašinović Marco Andorno Alessandro Caratelli Kostas Kloukinas



### Motivation

Survey of Open-Source solutions

SoC platform selection status & plans



- Meet the challenges of future Front-End ASICs
  - Rapidly increasing logic complexity (among others)
  - Use of "digital centric" advanced technology nodes (28nm and beyond)
- On-detector data processing capability
  - Reducing the bandwidth of the optical links
- Reduce ASIC development time and expedite the verification phase
  - Higher abstraction design level reusing verified common IP blocks
  - RTL-based designs lack portability and scalability across applications
- Reprogrammable functions
  - Functionality can be modified to match the field application requirements
  - Possibility to fix some bugs after tape-out

## SoC design platform and enablers

- System on Chip in the industry
  - Big momentum in the industry towards SoC
  - Focusing more on developing SW than HW

### SW vs HW

Pros:

- Shorter development time
- Possibility to fix bugs after tape-out
- Great amount of SW libraries for many applications

Cons:

- Custom HW approach is more flexible
- Better performance and energy efficiency possible in a custom hardware design



- Provide a flexible SoC platform to the designers
  - Scripted peripheral generation, interconnect generation...
- Provide software infrastructure for the SoC
  - HAL libraries and drivers for provided IP blocks and peripherals
  - Automatic generation of C libraries for the auto-generated peripheral register files
- Standardize a Radiation Hard interconnect
  - Facilitate easier reuse of the IP blocks
  - Verification environments for the IP blocks respecting the protocol
  - Protocol checker VIP
- Radiation hardened CPU implementation







- Open source ISA, many open source implementations
- ISA designed for simplicity, "easy" to implement
- Very likely to take over various areas of the CPU market
- Funded and developed by many big companies
- No license limitation, no vendor royalties
- Profit from community contributions


Not all RISC-V cores have an open license




| Core | License |
|------|---------|
| EH1, EL2, EH2 from Western Digital (SV) | Apache 2.0 |
| VexRiscv (SpinalHDL) | MIT |
| Rocket from UC Berkeley (Chisel) | BSD |
| PicoRV32 (Verilog) | ISC |
| Zero-riscy (Ibex), RI5CY from ETH Zurich (SV) | Apache 2.0, Solderpad |
| XuanTie CPUs from T-Head (Alibaba) | Commercial license |
| SiFive CPUs from the creators of RISC-V | Commercial license |
| Many others... | |


| SoC platform | License |
|--------------|---------|
| SweRVolf for WD cores | Apache 2.0 |
| Briey SoC from SpinalHDL | MIT |
| Rocket Chip from UC Berkeley | BSD |
| PicoSoC with PicoRV32 | ISC |
| Pulpissimo, Pulpino, Pulp (ETH) | Solderpad HW license |
| lowRISC by lowRISC | BSD |
| LiteX (supports multiple CPUs) | BSD |
| Many others... | |



### Standardized interconnect bus

- Radiation hard interconnect
- Scripted generation and connection of peripherals
- Protocols: Wishbone, AMBA (AXI, APB)
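
Scripted generation and connection of peripherals largely comes down to giving each peripheral an address window on the interconnect and decoding incoming addresses against that map. The sketch below illustrates the idea only; the peripheral names, the 4 KiB window size and the base address are assumptions made for the example and are not taken from the platform.

```python
WINDOW = 0x1000  # assumed 4 KiB address window per peripheral


def build_address_map(names, base=0x1A10_0000, window=WINDOW):
    """Assign consecutive, window-aligned address ranges to peripherals."""
    return {
        name: (base + i * window, base + (i + 1) * window - 1)
        for i, name in enumerate(names)
    }


def decode(address_map, addr):
    """Return the peripheral whose window contains addr, or None."""
    for name, (low, high) in address_map.items():
        if low <= addr <= high:
            return name
    return None


# Hypothetical peripheral list, in the order they are attached to the bus.
amap = build_address_map(["uart", "spi", "i2c", "gpio"])
```

A generator script would emit the same map twice: once as address-decoder RTL parameters and once as base-address constants for the software side.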

#### IP block development

- Library of IP blocks
- Standard IP blocks (I2C, SPI, UART, GPIO...)
- Hardware accelerators (sorting algorithms, matrix multipliers...)
- Automatic register file generation for the IP block
- Easy integration on the standardized interconnect
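
Automatic register file generation usually starts from a single machine-readable register description, from which both the RTL and the C definitions are rendered. The sketch below shows only the C side; the block name, base address, register names and macro naming scheme are hypothetical and chosen for illustration.

```python
def generate_c_header(block, base, registers):
    """Render a C header with one address macro per register.

    registers is a list of (name, byte_offset) tuples.
    """
    prefix = block.upper()
    guard = f"{prefix}_REGS_H"
    lines = [
        f"#ifndef {guard}",
        f"#define {guard}",
        "",
        f"#define {prefix}_BASE 0x{base:08X}u",
    ]
    for name, offset in registers:
        lines.append(
            f"#define {prefix}_{name.upper()} ({prefix}_BASE + 0x{offset:02X}u)"
        )
    lines += ["", f"#endif /* {guard} */"]
    return "\n".join(lines) + "\n"


# Hypothetical UART register description.
header = generate_c_header(
    "uart", 0x1A10_0000, [("ctrl", 0x00), ("status", 0x04), ("txdata", 0x08)]
)
print(header)
```

Keeping the description as the single source means the hardware register file and the software header cannot drift apart.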







### AXI4

- High performance protocol
- 5 separate channels
- Separate read and write
- Burst transfers
- Not all the signals are needed
- Possible to simplify

| Global signals | Write address | Write data | Write response | Read address | Read data |
|----------------|---------------|------------|----------------|--------------|-----------|
| ACLK           | AWVALID       | WVALID     | BVALID         | ARVALID      | RVALID    |
| ARESETN        | AWREADY       | WREADY     | BREADY         | ARREADY      | RREADY    |
|                | AWADDR        | WDATA      | BRESP          | ARADDR       | RRESP     |
|                | AWPROT        | WSTRB      |                | ARPROT       | RDATA     |
|                | AWID          |            | BID            | ARID         | RID       |
|                | AWLEN         | WLAST      |                | ARLEN        | RLAST     |
|                | AWSIZE        |            |                | ARSIZE       |           |
|                | AWBURST       |            |                | ARBURST      |           |
|                | AWLOCK        |            |                | ARLOCK       |           |
|                | AWCACHE       |            |                | ARCACHE      |           |
|                | AWQOS         |            |                | ARQOS        |           |
|                | AWUSER        | WUSER      | BUSER          | ARUSER       | RUSER     |
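
In an AXI4 burst, the `AxLEN` field encodes the number of beats minus one and `AxSIZE` encodes log2 of the bytes per beat, so a single address phase describes the whole transfer. The sketch below computes the beat addresses of an incrementing (INCR) burst. It is a simplified model that assumes an aligned start address and ignores WRAP and FIXED bursts; the function names are illustrative.

```python
def incr_burst_addresses(start, axlen, axsize):
    """Addresses of each beat of an aligned INCR burst.

    axlen  -- AxLEN field: number of beats minus one
    axsize -- AxSIZE field: log2 of the bytes per beat
    """
    beats = axlen + 1
    bytes_per_beat = 1 << axsize
    if start % bytes_per_beat:
        raise ValueError("this sketch only handles aligned start addresses")
    return [start + n * bytes_per_beat for n in range(beats)]


def crosses_4k_boundary(start, axlen, axsize):
    """True if the burst would cross a 4 KiB boundary, which AXI forbids."""
    last_byte = start + (axlen + 1) * (1 << axsize) - 1
    return (start >> 12) != (last_byte >> 12)
```

For example, `incr_burst_addresses(0x1000, 3, 2)` describes four 4-byte beats starting at `0x1000`.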



### AXI4-Lite

- Simplified AXI4
- Bursts are only 1 beat
- Suitable for peripherals
- Many AXI4 signals removed
- Still memory mapped

| Write address channel | Write data channel | Write response channel | Read address channel | Read data channel |
|-----------------------|--------------------|------------------------|----------------------|-------------------|
| AWVALID                  | WVALID                | BVALID                       | ARVALID                 | RVALID            |
| AWREADY                  | WREADY                | BREADY                       | ARREADY                 | RREADY            |
| AWADDR                   | WDATA                 | BRESP                        | ARADDR                  | RDATA             |
| AWPROT                   | WSTRB                 |                              | ARPROT                  | RRESP             |
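
`WSTRB` carries one bit per byte lane of `WDATA`; a peripheral register must update only the lanes whose strobe bit is set. The sketch below models that merge for a 32-bit register. The function name and the default width are assumptions made for the example.

```python
def apply_write_strobe(old, wdata, wstrb, data_bytes=4):
    """Merge wdata into old, updating only byte lanes whose WSTRB bit is set."""
    result = old
    for lane in range(data_bytes):
        if (wstrb >> lane) & 1:
            mask = 0xFF << (8 * lane)
            result = (result & ~mask) | (wdata & mask)
    # Keep the result within the register width.
    return result & ((1 << (8 * data_bytes)) - 1)
```

With `wstrb = 0b0101`, only bytes 0 and 2 of the register take the new data, which is how a CPU byte or half-word store reaches a word-wide register.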



### APB

- Simple interface for low speed peripherals
- Non-pipelined
- Every transfer takes at least 2 cycles
- Needs a bridge to AXI
- Easier to write a peripheral than with the other protocols

| Signal  | Description                           |
|---------|---------------------------------------|
| PCLK    | Clock                                 |
| PRESETn | Reset (active low)                    |
| PADDR   | Address                               |
| PWDATA  | Write data                            |
| PRDATA  | Read data                             |
| PSELx   | Slave select                          |
| PENABLE | Enable                                |
| PWRITE  | Write (1), Read (0)                   |
| PREADY  | Slave ready (low inserts wait states) |
| PSLVERR | Success (0), Fail (1)                 |
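
An APB transfer consists of a SETUP cycle (`PSELx` high, `PENABLE` low) followed by an ACCESS phase (`PENABLE` high) that lasts until the slave raises `PREADY`. The sketch below lists the phases cycle by cycle; it is a behavioural illustration with a hypothetical function name, not a bus model.

```python
def apb_transfer_trace(wait_states=0):
    """List the bus phases of one APB transfer, cycle by cycle.

    Each entry is (phase, PSEL, PENABLE, PREADY). PREADY is only sampled
    in the ACCESS phase, so it is None during SETUP.
    """
    trace = [("SETUP", 1, 0, None)]
    for _ in range(wait_states):
        trace.append(("ACCESS", 1, 1, 0))  # slave holds PREADY low
    trace.append(("ACCESS", 1, 1, 1))      # transfer completes
    return trace
```

The length of the returned list is the transfer duration in clock cycles: two with no wait states, plus one per wait state inserted by the slave.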



### AXI4-Stream

- One-way flow from master to slave
- Data-intensive applications
- Not memory mapped
- Needs a DMA to connect to the AXI interconnect
- Suitable for HW accelerators that process a data stream



| Signal  | Mandatory |
|---------|-----------|
| ACLK    | Yes       |
| ARESETn | Yes       |
| TVALID  | Yes       |
| TREADY  | Yes       |
| TDATA   | Yes       |
| TKEEP   | No        |
| TSTRB   | No        |
| TLAST   | No        |
| TID     | No        |
| TDEST   | No        |
| TUSER   | No        |
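
On an AXI4-Stream link, a beat is transferred only in a cycle where `TVALID` and `TREADY` are both high, and `TLAST` marks the final beat of a packet. The sketch below groups sampled bus cycles into packets under those two rules; the tuple layout and the function name are assumptions made for the example.

```python
def receive_packets(cycles):
    """Group stream beats into packets.

    cycles is an iterable of (tvalid, tready, tdata, tlast) tuples, one per
    clock. A beat is accepted only when TVALID and TREADY are both high.
    """
    packets, current = [], []
    for tvalid, tready, tdata, tlast in cycles:
        if tvalid and tready:
            current.append(tdata)
            if tlast:
                packets.append(current)
                current = []
    return packets


# The second cycle is back-pressured (TREADY low), so 0xB is held by the
# master and only transferred in the third cycle.
example = [
    (1, 1, 0xA, 0),
    (1, 0, 0xB, 0),
    (1, 1, 0xB, 1),
    (0, 1, 0x0, 0),
    (1, 1, 0xC, 1),
]
```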







- Developed by ETH Zurich
- Broad range of RISC-V cores
- Several SoC platforms
- Supports cluster architecture
- Custom RISC-V extensions
- Written in SV
- We have good communication with the PULP team at ETH



### ETH-Zurich R&D activities



#### **PULP Features**

- efficient implementations of RISC-V cores. These include:
  - 32-bit 4-stage core CV32E40P (formerly RI5CY)
  - 64-bit 6-stage CVA6 (formerly Ariane)
  - 32-bit 2-stage Ibex (formerly Zero-riscy)
- complete systems based on:
  - single-core micro-controllers (PULPissimo, PULPino)
  - multi-core IoT Processors (OpenPULP)
  - multi-cluster heterogeneous accelerators (Hero)
- open-source SolderPad license
  - a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable license
- rich set of peripherals
  - I2C, SPI, HyperRAM, GPIO

#### The Parallel Ultra Low Power (PULP) Platform

- Started in 2013 as a joint effort between the Integrated Systems Laboratory (IIS) of ETH Zürich and the Energy-efficient Embedded Systems (EEES) group of the University of Bologna to explore new and efficient architectures for ultra-low-power processing.
- Aim: to develop an open, scalable hardware and software research and development platform for low power applications
  - Single core microcontroller units
  - Parallel ultra low power programmable architectures
- Open-source approach
  - Based on the open-source RISC-V instruction set architecture
- PULP Team
  - Prof. Luca Benini
  - Frank K. Gürkaynak
  - and many more: https://pulp-platform.org/team.html



- Minimal RISC-V core, suitable for control algorithms
- Maintained by lowRISC and named Ibex
- 2 stage pipeline
- RV32IMC instruction set
- No PULP extensions





- DSP application oriented
- Maintained by OpenHW group under the name CV32E40P
- 4 stage pipeline
- With Xpulp extension for performance and efficiency
- RV32IMFCXpulp instruction set
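
The ISA strings on these slides pack the register width and the supported extensions into one name: `I` is the base integer set, `M` integer multiply/divide, `F` single-precision floating point, `C` compressed instructions, and a name starting with `X` denotes a non-standard extension such as Xpulp. The sketch below splits such a string into its parts. It handles only this simple form (single-letter standard extensions followed by at most one custom extension) and is not a full parser for the RISC-V naming convention.

```python
import re


def parse_isa(isa):
    """Split a simple ISA string into (xlen, standard letters, custom extension)."""
    match = re.fullmatch(r"RV(32|64|128)([A-WY]+)(X[A-Za-z0-9]+)?", isa)
    if match is None:
        raise ValueError(f"unsupported ISA string: {isa!r}")
    return int(match.group(1)), list(match.group(2)), match.group(3)
```

Applied to the two cores above, `parse_isa("RV32IMC")` yields no custom extension, while `parse_isa("RV32IMFCXpulp")` reports `Xpulp`, which is why the latter needs the PULP-specific toolchain to use its extra instructions.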





- GNU compiler toolchain available
- Supports newlib (embedded) and glibc (Linux) libraries
- Linux kernel
- U-Boot
- Spike RISC-V ISA simulator
- QEMU
- OpenOCD
- Several Linux distributions
- FreeRTOS, Zephyr...
- Many more...
- PULP CPUs have custom extensions called Xpulp



- Zero-riscy or RI5CY as the CPU
- Interleaved memory and cluster interconnect
- Custom Logarithmic interconnect for memory, CPU, cluster...
- Peripherals connected to a monolithic uDMA
- Extendible AXI interconnect for peripherals
- APB interconnect for peripherals
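
Interleaving spreads consecutive words across several memory banks so that accesses to neighbouring addresses can be served by different banks in parallel. The sketch below shows a generic word-level interleaving scheme; the bank count and word size are assumptions for illustration and do not describe the actual Pulpissimo memory configuration.

```python
def interleaved_location(addr, n_banks=4, word_bytes=4):
    """Map a byte address to (bank, row, byte_offset) under word interleaving."""
    word = addr // word_bytes
    return word % n_banks, word // n_banks, addr % word_bytes
```

Consecutive words land in consecutive banks and wrap around after `n_banks` words. A simple, non-interleaved memory would instead keep a contiguous address range in each bank.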





- Remove the uDMA and add AXI compatible peripherals
- Remove interleaved memories, keep a simple memory architecture
- External clock instead of FLL
- Remove cluster infrastructure
- Possibly replace the logarithmic interconnect with AXI
- Use Zero-riscy as the CPU
- JTAG master, SPI, I2C, UART, GPIO
- Remove the old debug unit
- Triple modular redundancy (TMR) radiation hardening
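
Triple modular redundancy keeps three copies of each state element and takes a bitwise 2-out-of-3 majority, so a single upset in one copy is outvoted by the other two. The sketch below models the voter in software for illustration only; in the SoC this would be implemented in the RTL or netlist, and the function names are hypothetical.

```python
def majority_vote(a, b, c):
    """Bitwise 2-out-of-3 majority of three redundant copies."""
    return (a & b) | (a & c) | (b & c)


def tmr_read(copies):
    """Return (voted value, mismatch flag) for a triplicated register."""
    voted = majority_vote(*copies)
    mismatch = any(copy != voted for copy in copies)
    return voted, mismatch
```

The mismatch flag indicates that one copy has diverged, which a design can use to rewrite the corrected value before a second upset accumulates.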





- Pulpissimo tested on the Genesys 2 FPGA board
- Tested the PULP toolchains and verified that the compiled code runs on real hardware
- SoC modified according to our needs
- Some modifications are still in progress
- In the process of implementing Pulpissimo in 28 nm with the CERN design flow





- Submission on 16 December
- Non-radiation-hard SoC
- Modified Pulpissimo
- Radiation tolerant SoC planned next year
- Intent: create a simple, usable microcontroller



- SoC RAdiation Tolerant EcoSystem
- Radiation tolerant SoC builder
- Radiation tolerant CPU
- Standardized interconnect based on AMBA standards
- Library of verified radiation tolerant IPs (I2C, SPI, UART, JTAG...)
- Automatically generated AXI peripherals
- Custom Hardware accelerators

## KU Leuven Research Activities

- High-performance fault-tolerant RISC-V microprocessors for harsh environments.
  - Error detection and correction at the architecture level
  - Digital implementation with error detection flip-flops
  - Instruction replay for correction
- Fault-tolerant systolic array Deep Neural Network (DNN) accelerator.
  - Effect of radiation on the classification accuracy and functionality of the accelerator
  - Implement area-efficient fault tolerance methods
- Impact of aging degradation on the radiation susceptibility of integrated circuits in advanced FD-SOI technology nodes.
  - Design test vehicles while taking into consideration the increasingly stochastic nature of degradation mechanisms in advanced nodes: Programmable Arrays.
  - Model the coupled effect of radiation and aging mechanisms on FD-SOI technology

### KU LEUVEN

Team leader: Prof. Jeffrey Prinzie (jeffrey.prinzie@kuleuven.be)

Principal researchers: Karel Appels, Mohamed Mounir, and Naïn Jonckers



Heavy-ion irradiation test at RADEF for ProArray chips: a custom-designed test vehicle in 28 nm FD-SOI

# Thank you!!