# **Advanced FPGA design**

#### ISOTDAQ 2016 @ Rehovot (Israel) 29/01/2016



**Manoel Barros Marin** 

Advanced FPGA design ISOTDAQ 2016 @ Rehovot (Israel) 29/01/2016

Outline:

- ... from the previous lesson
- Key concepts about FPGA design
- FPGA gateware (firmware) design work flow
- Summary





Outline:

## • ... from the previous lesson

- Key concepts about HDL
- FPGA gateware (firmware) design work flow

• Summary



Manoel Barros Marin

What is an FPGA : Field Programmable Gate Array?

#### What is an FPGA : Field Programmable Gate Array?

FPGA - Wikipedia https://en.wikipedia.org/wiki/Field-programmable\_gate\_array
A field-programmable gate array (FPGA) is an integrated circuit designed to be configured by a customer or a designer after manufacturing – hence "field-programmable".

#### What is an FPGA : Field Programmable Gate Array?

FPGA - Wikipedia https://en.wikipedia.org/wiki/Field-programmable\_gate\_array A field-programmable gate array (FPGA) is an integrated circuit designed to be configured by a customer or a designer after manufacturing – hence "field-programmable".



- FPGA fabric (matrix like structure) made of:
  - I/O-cells to communicate with outside world
  - Logic cells
    - $\circ~$  Look-Up-Table (LUT) to implement combinatorial logic
    - $_{\odot}~$  Flip-Flops (D) to implement sequential logic
  - Interconnect network between logic resources
  - Clock tree to distribute the clock signals





• But it also features Hard Blocks:



Advanced FPGA design ISOTDAQ 2016 @ Rehovot (Israel) 29/01/2016

Outline:

• …from the previous lesson

## • Key concepts about HDL

• FPGA gateware (firmware) design work flow

#### • Summary



## Key concepts about FPGA design

FPGA gateware (firmware) design is NOT programming

## Key concepts about FPGA design

#### FPGA gateware (firmware) design is NOT programming



#### • Programming

- Code is written and translated into instructions
- Instructions are executed sequentially by the CPU(s)
- Parallelism is achieved by running instructions on multiple threads/cores
- Processing structures and instructions sets are fixed by the architecture of the system

#### VS.

- FPGA gateware (firmware) design
  - No fixed architecture, the system is built according to the task
  - Building is done by describing/defining system elements and their relations
  - Intrinsically parallel, sequential behaviour is achieved by registers and Finite-State-Machines (FSMs)
  - Description done by schematics or a hardware description language (HDL)

HDL are used for describing <u>HARDWARE</u>

#### HDL are used for describing <u>HARDWARE</u>

• Example of a WAIT statement (Programming Language VS. HDL)

#### HDL are used for describing <u>HARDWARE</u>

- Example of a WAIT statement (Programming Language VS. HDL)
  - In programming language (e.g. C) (Unix, #include <unistd.h>)
     sleep(5); // sleep 5 seconds
  - In HDL (e.g. VHDL):
    - Not synthesizable (only for simulation test benches)

```
wait for 5 sec; -- handy for TB clocks
```

• Synthesizable (for simulation and/or FPGA implementation)

```
simple delay counter : process (delay rst, delay clk, delay ena)
begin -- process
 if delay rst = '1' then
    s count <= delay ld value;</pre>
    s delay done <= '0';</pre>
 elsif rising edge(delay clk) then
    if delay ena = '1' then
      if delay_ld = '1' then
        s count <= delay ld value;</pre>
      else
        s count <= s count - 1;</pre>
      end if:
    end if;
    if s count = 0 then
      s delay done <= '1';</pre>
    else
      s_delay_done <= '0';</pre>
    end if;
 end if;
 end process;
```



#### HDL are used for describing HARDWARE

- Example of a WAIT statement (Programming Language VS. HDL)
  - In programming language (e.g. C) (Unix, #include <unistd.h>) sleep(5); // sleep 5 seconds
  - In HDL (e.q. VHDL): ۲
    - Not synthesizable (only for simulation test benches)

```
wait for 5 sec; -- handy for TB clocks
```

Synthesizable (for simulation and/or FPGA implementation)

```
simple delay counter : process (delay rst, delay clk, delay ena)
begin -- process
 if delay_rst = '1' then
    s count <= delay ld value;</pre>
    s delay done <= '0';</pre>
 elsif rising edge(delay clk) then
    if delay ena = '1' then
      if delay ld = '1' then
        s count <= delay ld value;</pre>
      else
        s count <= s count - 1;</pre>
      end if:
    end if;
    if s count = 0 then
      s delay done <= '1';</pre>
    else
      s delay done <= '0';</pre>
    end if:
 end if:
 end process;
                     HDL to RTL
```

# SystemVeril

#### **Register Transfer Level (RTL)**

http://en.wikipedia.org/wiki/Register-transfer\_level A design abstraction which models a synchronous digital circuit in terms of the flow of digital signals (data) between registers and logical operations performed on those signals

#### HDL are used for describing <u>HARDWARE</u>

SystemVeril

- Example of a WAIT statement (Programming Language VS. HDL)
  - In programming language (e.g. C) (Unix, #include <unistd.h>)
     sleep(5); // sleep 5 seconds
  - In HDL (e.g. VHDL):
    - o Not synthesizable (only for simulation test benches)

```
wait for 5 sec; -- handy for TB clocks
```

o Synthesizable (for simulation and/or FPGA implementation)



#### HDL are used for describing <u>HARDWARE</u>

- Example of a WAIT statement (Programming Language VS. HDL)
  - In programming language (e.g. C) (Unix, #include <unistd.h>)
     sleep(5); // sleep 5 seconds
  - In HDL (e.g. VHDL):
    - o Not synthesizable (only for simulation test benches)

```
wait for 5 sec; -- handy for TB clocks
```



o Synthesizable (for simulation and/or FPGA implementation)



#### HDL are used for describing <u>HARDWARE</u>

- Example of a WAIT statement (Programming Language VS. HDL)
  - In programming language (e.g. C) (Unix, #include <unistd.h>)
     sleep(5); // sleep 5 seconds
  - In HDL (e.g. VHDL):
    - o Not synthesizable (only for simulation test benches)

```
wait for 5 sec; -- handy for TB clocks
```



o Synthesizable (for simulation and/or FPGA implementation)



**<u>Timing</u>** in FPGA gateware (firmware) design is critical

#### **<u>Timing</u>** in FPGA gateware (firmware) design is critical

• Data propagates in the form of electrical signals through the FPGA



#### **<u>Timing</u>** in FPGA gateware (firmware) design is critical

• Data propagates in the form of electrical signals through the FPGA



If these signals do not arrive to their destination on time...

The consequences may be catastrophic!!!

# When designing FPGA gateware (firmware) you have to think <u>HARD</u>...

# When designing FPGA gateware (firmware) you have to think <u>HARDWARE</u>

Advanced FPGA design ISOTDAQ 2016 @ Rehovot (Israel) 29/01/2016

Outline:

- …from the previous lesson
- Key concepts about HDL
- FPGA gateware (firmware) design work flow
- Summary



Manoel Barros Marin



**Project Specification** 

This is the most critical step...

#### **Project Specification**

#### This is the most critical step...

• Gather requirements from the users The rest of the design process is based on it!!!

#### **Project Specification**

## This is the most critical step...

- Gather requirements from the users
- Specify:
  - Target application (Specific or General purpose)



Example of General Purpose Gateware

#### **Project Specification**

## This is the most critical step...

• Gather requirements from the users

The rest of the design process is based on it!!!

- Specify:
  - Target application (Specific or General purpose)



#### Example of Application Specific Gateware

#### **Project Specification**

#### This is the most critical step...

• Gather requirements from the users

The rest of the design process is based on it!!!

- Specify:
  - Target application (Specific or General purpose)
  - Main features (e.g. System bus, Multi-gigabit transceivers, etc.)



#### Example of FPGA Architecture

#### **Project Specification**

## This is the most critical step...

- Gather requirements from the users
- The rest of the design process is based on it!!!

Xilinx

- Specify:
  - Target application (Specific or General purpose)
  - Main features (e.g. System bus, Multi-gigabit transceivers, etc.)
  - FPGA vendor (e.g. Xilinx, Altera, Microsemi, Lattice, etc.)



#### **Project Specification**

## This is the most critical step...

- Gather requirements from the users
- The rest of the design process is based on it!!!

- Specify:
  - Target application (Specific or General purpose)
  - Main features (e.g. System bus, Multi-gigabit transceivers, etc.)
  - FPGA vendor (e.g. Xilinx, Altera, Microsemi, Lattice, etc.)
  - Electronic board (Custom or COST (\*))



Example of COST board (Xilinx Devkit)



#### **Project Specification**

## This is the most critical step...

- Gather requirements from the users
- Specify:
  - Target application (Specific or General purpose)
  - Main features (e.g. System bus, Multi-gigabit transceivers, etc.)
  - FPGA vendor (e.g. Xilinx, Altera, Microsemi, Lattice, etc.)
  - Electronic board (Custom or COST (\*))
  - Development tools (FPGA vendor or Commercial)

#### Example of FPGA Vendor Tools



#### Example of Commercial Tools



#### **Project Specification**

## This is the most critical step...

• Gather requirements from the users

- Specify:
  - Target application (Specific or General purpose)
  - Main features (e.g. System bus, Multi-gigabit transceivers, etc.)
  - FPGA vendor (e.g. Xilinx, Altera, Microsemi, Lattice, etc.)
  - Electronic board (Custom or COST (\*))
  - Development tools (FPGA vendor or Commercial)
  - Optimization (Speed, Area, Power or default)

#### **Project Specification**

## This is the most critical step...

• Gather requirements from the users

- Specify:
  - Target application (Specific or General purpose)
  - Main features (e.g. System bus, Multi-gigabit transceivers, etc.)
  - FPGA vendor (e.g. Xilinx, Altera, Microsemi, Lattice, etc.)
  - Electronic board (Custom or COST (\*))
  - Development tools (FPGA vendor or Commercial)
  - Optimization (Speed, Area, Power or default)



#### **Project Specification**

## This is the most critical step...

• Gather requirements from the users

- Specify:
  - Target application (Specific or General purpose)
  - Main features (e.g. System bus, Multi-gigabit transceivers, etc.)
  - FPGA vendor (e.g. Xilinx, Altera, Microsemi, Lattice, etc.)
  - Electronic board (Custom or COST (\*))
  - Development tools (FPGA vendor or Commercial)
  - Optimization (Speed, Area, Power or default)





### **Project Specification**

# This is the most critical step...

- Gather requirements from the users
- Specify:
  - Target application (Specific or General purpose)
  - Main features (e.g. System bus, Multi-gigabit transceivers, etc.)
  - FPGA vendor (e.g. Xilinx, Altera, Microsemi, Lattice, etc.)
  - Electronic board (Custom or COST (\*))
  - Development tools (FPGA vendor or Commercial)
  - Optimization (Speed, Area, Power or default)







### **Project Specification**

# This is the most critical step...

- Gather requirements from the users
- Specify:
  - Target application (Specific or General purpose)
  - Main features (e.g. System bus, Multi-gigabit transceivers, etc.)
  - FPGA vendor (e.g. Xilinx, Altera, Microsemi, Lattice, etc.)
  - Electronic board (Custom or COST (\*))
  - Development tools (FPGA vendor or Commercial)
  - Optimization (Speed, Area, Power or default)
  - Design language (Schematics or HDL (e.g. VHDL, etc.))

# HDL are the most popular for RTL design but...

### Schematics may be better in some cases (e.g. Embedded Processor configuration, etc..)

# The rest of the design process is based on it!!!

#### Examples of Design Languages

| 31 |                            |
|----|----------------------------|
| 32 | entity XuLA_2 is           |
| 33 | Port ( PB1 : in STD_LOGIC; |
| 34 | PB2 : in STD_LOGIC;        |
| 35 | PB3 : in STD_LOGIC;        |
| 36 | PB4 : in STD_LOGIC;        |
| 37 | LED1 : out STD_LOGIC;      |
| 38 | LED2 : out STD_LOGIC;      |
| 39 | LED3 : out STD_LOGIC;      |
| 40 | LED4 : out STD_LOGIC);     |
| 41 | end XuLA_2;                |
|    |                            |



### **Project Specification**

## This is the most critical step...

• Gather requirements from the users

The rest of the design process is based on it!!!

- Specify:
  - Target application (Specific or General purpose)
  - Main features (e.g. System bus, Multi-gigabit transceivers, etc.)
  - FPGA vendor (e.g. Xilinx, Altera, Microsemi, Lattice, etc.)
  - Electronic board (Custom or COST (\*))
  - Development tools (FPGA vendor or Commercial)
  - Optimization (Speed, Area, Power or default)
  - Design language (Schematics or HDL (e.g. VHDL, etc.))
  - Coding convention

#### Example of Coding Convention

### Your code should be readable

| description     | extension | example  |  |  |  |
|-----------------|-----------|----------|--|--|--|
| variable        | prefix v  | v_Buffer |  |  |  |
| alias           | prefix a  | a_Bit5   |  |  |  |
| constant        | prefix c  | c_Lenght |  |  |  |
| type definition | prefix t  | t_MyType |  |  |  |
| generics        | prefix g  | g_Width  |  |  |  |

### **Project Specification**

# This is the most critical step...

- Gather requirements from the users
- Specify:
  - Target application (Specific or General purpose)
  - Main features (e.g. System bus, Multi-gigabit transceivers, etc.)
  - FPGA vendor (e.g. Xilinx, Altera, Microsemi, Lattice, etc.)
  - Electronic board (Custom or COST (\*))
  - Development tools (FPGA vendor or Commercial)
  - Optimization (Speed, Area, Power or default)
  - Design language (Schematics or HDL (e.g. VHDL, etc.))
  - Coding convention
  - Software interface (GUI, Scripts or both)

#### Example of TCL script

| 45 #################################### |                                                                   |                                             |                |                    |         | Xilinx ISE TCL console |                       |    |  |
|-----------------------------------------|-------------------------------------------------------------------|---------------------------------------------|----------------|--------------------|---------|------------------------|-----------------------|----|--|
|                                         | **************************************                            | GDI-FPGA CO                                 | )TG ########## | *********          |         |                        |                       |    |  |
| 47<br>48                                | **************************************                            | Tcl Console                                 |                |                    |         |                        |                       |    |  |
| 10<br>49<br>50                          | ## Comment: Adding Common files:                                  |                                             |                |                    |         |                        |                       |    |  |
| 51                                      | puts "->"                                                         |                                             |                |                    |         |                        |                       |    |  |
| 52                                      | puts "-> Adding common files of the GBT-FPGA Core to the ISE pro  |                                             |                |                    |         |                        |                       |    |  |
| 53<br>54                                | puts "->"                                                         |                                             |                |                    |         |                        |                       |    |  |
| 55                                      | xfile add \$SOURCE PATH/gbt bank/core sources/gbt rx/gbt rx.vhd   |                                             |                |                    |         |                        |                       |    |  |
| 56                                      | xfile add \$SOURCE PATH/gbt bank/core sources/gbt rx/gbt rx decor | Command>                                    | vtdah aht faan | tel viliov vietov7 |         |                        |                       |    |  |
| 57                                      | xfile add \$SOURCE PATH/gbt bank/core sources/gbt rx/gbt rx decou | $100000000 \ge 100000000000000000000000000$ |                |                    |         |                        |                       | 1  |  |
|                                         |                                                                   | Console                                     | Errors         | 🔬 Warnings         | Td Cons | ole 🕅                  | Find in Files Results | 40 |  |



### **Project Specification**

# This is the most critical step...

• Gather requirements from the users

- Specify:
  - Target application (Specific or General purpose)
  - Main features (e.g. System bus, Multi-gigabit transceivers, etc.)
  - FPGA vendor (e.g. Xilinx, Altera, Microsemi, Lattice, etc.)
  - Electronic board (Custom or COST (\*))
  - Development tools (FPGA vendor or Commercial)
  - Optimization (Speed, Area, Power or default)
  - Design language (Schematics or HDL (e.g. VHDL, etc.))
  - Coding convention
  - Software interface (GUI, Scripts or both)
  - Use of files repository (SVN, GIT, etc.. or none)



### **Project Specification**

### This is the most critical step...

• Block diagram of the system

- Include the FPGA logic...
- ... but also the on-board devices and related devices
- May combine different abstraction levels



### **Project Specification**

## This is the most critical step...

#### Pin planning •

Pin assignments are one type of Location Constraints

### Critical for Custom Boards!!!



# The rest of the design process is based on it!!!

Example of Pin Planner GUI

\_ | **D** | X

# FPGA gateware (firmware) design work flow Design Entry



### Design Entry: <u>Modularity & Reusability</u>

- Your system should be Modular
  - Design at RTL level (think hard...ware)
  - Well defined clocks and resets schemes
  - Separated Data & Control paths
  - Multiple instantiations

#### Good example of Modular System



#### • Your code should be Reusable

- Add primitives (and modules) to the system by inference when possible
- Use parameters in your code (e.g. generics in VHDL, parameters in Verilog, etc.)
- Centralise parameters in external files (e.g. packages in VHDL, headers in Verilog, etc.)
- Use configurable modules interfaces when possible (e.g. parametrised vectors, records in VHDL, etc.)
- Use standard features (e.g. I2C, Wishbone, etc.)
- Use standard IP Cores (e.g. from <a href="https://www.OpenCores.org">www.OpenCores.org</a>, etc.)
- Avoid vendor specific IP Cores when possible
- Talk with your colleagues and see what other FPGA designers are doing

Design Entry: <u>Coding for Synthesis</u>

# Synthesizable code is intended for FPGA implementation

• Use non-synthesizable HLD statements only in simulation test benches

A fundamental guiding principle when coding for synthesis is to minimize, if not eliminate, all structures and directives that could potentially create a mismatch between simulation and synthesis. From book "Advanced FPGA Design" by Steve Kilts (Copyright © 2007 John Wiley & Sons, Inc.)

Design Entry: <u>Coding for Synthesis</u>

# Synthesizable code is intended for FPGA implementation

• Use non-synthesizable HLD statements only in simulation test benches

A fundamental guiding principle when coding for synthesis is to minimize, if not eliminate, all structures and directives that could potentially create a mismatch between simulation and synthesis. From book "Advanced FPGA Design" by Steve Kilts (Copyright © 2007 John Wiley & Sons, Inc.)

• The RTL synthesis tool is expecting a synchronous design...

Design Entry: <u>Coding for Synthesis</u>

# Synthesizable code is intended for FPGA implementation

• Use non-synthesizable HLD statements only in simulation test benches

A fundamental guiding principle when coding for synthesis is to minimize, if not eliminate, all structures and directives that could potentially create a mismatch between simulation and synthesis. From book "Advanced FPGA Design" by Steve Kilts (Copyright © 2007 John Wiley & Sons, Inc.)

• The RTL synthesis tool is expecting a synchronous design...

But what is a synchronous design???



Design Entry: <u>Coding for Synthesis</u>

# Synthesizable code is intended for FPGA implementation

• Use non-synthesizable HLD statements only in simulation test benches

A fundamental guiding principle when coding for synthesis is to minimize, if not eliminate, all structures and directives that could potentially create a mismatch between simulation and synthesis. From book "Advanced FPGA Design" by Steve Kilts (Copyright © 2007 John Wiley & Sons, Inc.)

• The RTL synthesis tool is expecting a synchronous design...

Synchronous design is the one compose by combinatorial logic (e.g. logic gates, multiplexors, etc..) and sequential logic (registers that are triggered on the edge of a single clock),



### Design Entry: <u>Coding for Synthesis</u>

- Combinatorial logic coding rules
  - Sensitivity list must include ALL input signals
     Not respecting this may lead to non responsive outputs under changes of input signals
  - ALL output signals must be assigned under ALL possible input conditions Not respecting this may lead to undesired latches (asynchronous storage element)
  - No feedback from output to input signals
     Not respecting this may lead to unknown output states (metastability) & undesired latches

#### Good combinatorial coding for synthesis



#### Output\_Q <= Output\_nand and Input\_C and Output\_nor; end process;

#### Bad combinatorial coding for synthesis



### Design Entry: <u>Coding for Synthesis</u>

- Sequential logic coding rules
  - Only clock signal (and asynchronous set/reset signals when used) in sensitivity list Not respecting this may produce undesired combinatorial logic
  - All registers of the sequence must be triggered by the same clock edge (either Rising or Falling) Not respecting this may lead to metastability at the output of the registers
  - Include all registers of the sequence in the same reset branch
     Not respecting this may lead to undesired register values after reset

#### Good sequential coding for synthesis

```
process(Clk,Rst)
                                       Rst
begin
    if (Rst = '1') then
                                                    R
                                                                  R
                                                      0
                                                                   0
        Output Out <= '0';
                                                               D
                                                                         •Out
                                       In
        Output Q
                  <= '0';
    elsif rising edge(Clk) then
        Output Out <= Output Q;
                                                   Clk
                                       Clk-
                                                                  Clk
        Output Q <= Input In;
    end if:
end process;
```

#### Bad sequential coding for synthesis



### Design Entry: <u>Coding for Synthesis</u>

- Synchronous design coding rules:
  - FULLY synchronous design
    - $\circ$  No combinatorial feedback
    - o No asynchronous latches

Not respecting this may lead to incorrect analysis from the FPGA design tool

- Register ALL output signal Not respecting this may lead to uncontrolled length of combinatorial paths
- Properly design of reset scheme (mentioned later)
   Not respecting this may lead to undesired register values after reset
- Properly design of clocking scheme (mentioned later)
   Not respecting this may lead to metastability at the output of the registers & Misuse of resources
- Properly handle Clock Domain Crossings (CDC) (mentioned later) Not respecting this may lead to metastability at the output of the registers



## Design Entry: <u>Coding for Synthesis</u>

- Finite State Machines (FSMs):
  - Digital logic circuit with a finite number of internal states
  - Widely used for system control
  - Two variants of FSM
    - $\circ$   $\,$  Moore: Outputs depends only on the current state of the FSM  $\,$
    - Mealy: Outputs depends only on the current state of the FSM as well as the current values of the inputs
  - Modelled by State Transition Diagrams



- Many different FSM coding styles (But not all of the are good!!)
- FSM coding considerations:
  - o Synchronize inputs & outputs
  - $\circ$   $\,$  Outputs may be assigned during states or state transitions  $\,$
  - Be careful with unreachable/illegal states
  - o You can add counters to FSMs

### Design Entry: <u>Reset Scheme</u>

A bad reset scheme may get you crazy!!!

- Used to initialize the output of the registers to a know state
- It has a direct impact on:
  - Performance
  - Logic utilization
  - Reliability
- Different opinions due to...
- Different approaches:
  - o Asynchronous

**Pros:** No free running clock required, easier timing closure

**Cons: s**kew, glitches, simulation mismatch, difficult to debug, extra constraints, etc.

o Synchronous

Pros: No Skew, No Glitches, No simulation mismatch, Easier to debug, No extra constraints, etc..

**Cons:** Free-running clock required, More difficult timing closure

 $\circ$   $\,$  No Reset Scheme

**Pros;** Easier Routing, Less resources, Easiest timing closure

Cons: Only reset at power up (in some devices not even that...) <- In fact, reset is not always needed

o Hybrid: Usually in big designs (Avoid when possible!!!)

### Design Entry: <u>Reset Scheme</u>

A bad reset scheme may get you crazy!!!

- Used to initialize the output of the registers to a know state
- It has a direct impact on:
  - Performance
  - Logic utilization
  - Reliability
- Different opinions due to...
- Different approaches:
  - o Asynchronous
    - **Pros:** No free running clock required, easier timing closure

**Cons: s**kew, glitches, simulation mismatch, difficult to debug, extra constraints, etc.

o Synchronous

Pros: No Skew, No Glitches, No simulation mismatch, Easier to debug, No extra constraints, etc..

**Cons:** Free-running clock required, More difficult timing closure

 $_{\odot}$   $\,$  No Reset Scheme

**Pros;** Easier Routing, Less resources, Easiest timing closure

Cons: Only reset at power up (in some devices not even that...) <- In fact, reset is not always needed

o Hybrid: Usually in big designs (Avoid when possible!!!)

# My advise is... You should use SYNCHRONOUS RESET by default

### Design Entry: <u>Clocks Scheme</u>

# Clocking resources are very precious!!!

- Clock regions
- Clock trees (Global & Local)
- Other FPGA clocking resources
  - Clock capable pins
  - Clock buffers
  - Clock Multiplexors
  - PLLs & DCM



• Bad practices when designing your clocking scheme



FF



• Clock Domain Crossing (CDC)



• Clock Domain Crossing (CDC)



- Clock Domain Crossing (CDC): <u>The problem...</u>
  - Clock Domain Crossing (CDC) : passing a signal from one clock domain to another (A to B)
  - If clocks are unrelated to each other (asynchronous) timing analysis is not possible
  - Setup and Hold times of FlipFlop B are likely to be violated -> Metastability!!!



### Avoid creating unnecessary clock domains

• Clock Domain Crossing: <u>The workaround...</u>

- Clock Domain Crossing: <u>The workaround...</u>
  - Control Paths:







- Clock Domain Crossing: <u>The workaround...</u>
  - **Control Paths:** ulletDPRAM Handshaking Synchronizers Din Dout Rst-VALID RO DRO -Out Wr En In · Rr En READY "1" **BLOCK A BLOCK B** Wr Addr [0...0] • Rd Addr [0...0] DATA Clk-Clk Clk Wr Clk Rd Clk Data Paths: • Handshaking Din Dout VALID Asynchronous Wr En Rr En READY **BLOCK B** FIFO **BLOCK A** Full Empty DATA, Wr Clk Rd Clk

Be aware of FIFO overflow/underflow!!!

• Clock Domain Crossing: <u>The workaround...</u>



### Design Entry: <u>Primitives & IP Cores</u>

- **Primitives:** Basic components of the FPGA
  - Vendor (and device) specific
  - Examples: Buffers (I/O & Clock), Registers, BRAMs, DSP blocks, Logic Gates (programed LUTs)
- Hard IP Cores: Complex hardware blocks embedded into the FPGA
  - Vendor (and device) specific
  - Fixed I/O location
  - In many cases they may be set through GUI (Wizards)
  - Examples: : PLLs, Multi-gigabit Transceivers, Ethernet MAC, Microprocessors, etc..
- Soft IP Cores: Complex (or simple) modules ready to be implemented
  - They may be vendor specific or agnostic:
    - Vendor Specific: Encrypted Code or Requires Hard IP Core
    - Vendor Agnostic: Commercial or Open Source (www.OpenCores.org)
  - In many cases they may be set through GUI (Wizards)
  - Examples: : All kind of modules
- Two ways of adding Primitives & IP Cores to your system:
  - Instantiation: The module is EXPLICITELLY added to the system
  - Inference: The module is IMPLICITELLY added to the system

#### Instantiated FlipFlop (for Microsemi ProAsic3)

```
DFN1C1 FlipFlop (
	.D (Input_D),
	.CLK (Clk),
	.CLR (Rst),
	.Q (Output_Q));
```

#### Inferred FlipFlop (Verilog)

```
always @(posedge Clk or posedge Rst)
begin
    if (Rst)
        Output_Q <= 0;
    else
        Output_Q <= Input_D;
    end</pre>
```

### Design Entry: <u>Primitives & IP Cores</u>

- **Primitives:** Basic components of the FPGA
  - Vendor (and device) specific
  - Examples: Buffers (I/O & Clock), Registers, BRAMs, DSP blocks, Logic Gates (programed LUTs)
- Hard IP Cores: Complex hardware blocks embedded into the FPGA
  - Vendor (and device) specific
  - Fixed I/O location
  - In many cases they may be set through GUI (Wizards)
  - Examples: : PLLs, Multi-gigabit Transceivers, Ethernet MAC, Microprocessors, etc..
- Soft IP Cores: Complex (or simple) modules ready to be implemented
  - They may be vendor specific or agnostic:
    - Vendor Specific: Encrypted Code or Requires Hard IP Core
    - Vendor Agnostic: Commercial or Open Source (www.OpenCores.org)
  - In many cases they may be set through GUI (Wizards)
  - Examples: : All kind of modules
- Two ways of adding Primitives & IP Cores to your system:
  - Instantiation: The module is EXPLICITELLY added to the system
  - Inference: The module is IMPLICITELLY added to the system

#### Instantiated FlipFlop (for Microsemi ProAsic3)

```
DFN1C1 FlipFlop (
	.D (Input_D),
	.CLK (Clk),
	.CLR (Rst),
	.Q (Output_Q));
```

66

#### Inferred FlipFlop (Verilog)

```
always @ (posedge Clk or posedge Rst)
begin
    if (Rst)
        Output_Q <= 0;
    else
        Output_Q <= Input_D;
    end</pre>
```



### Design Entry: <u>Primitives & IP Cores</u>

- **Primitives:** Basic components of the FPGA
  - Vendor (and device) specific
  - Examples: Buffers (I/O & Clock), Registers, BRAMs, DSP blocks, Logic Gates (programed LUTs)
- Hard IP Cores: Complex hardware blocks embedded into the FPGA
  - Vendor (and device) specific
  - Fixed I/O location
  - In many cases they may be set through GUI (Wizards)
  - Examples: : PLLs, Multi-gigabit Transceivers, Ethernet MAC, Microprocessors, etc..
- Soft IP Cores: Complex (or simple) modules ready to be implemented
  - They may be vendor specific or agnostic:
    - o Vendor Specific: Encrypted Code or Requires Hard IP Core
    - Vendor Agnostic: Commercial or Open Source (www.OpenCores.org)
  - In many cases they may be set through GUI (Wizards)
  - Examples: : All kind of modules
- Two ways of adding Primitives & IP Cores to your system:
  - Instantiation: The module is EXPLICITELLY added to the system
  - Inference: The module is IMPLICITELLY added to the system





```
Instantiated FlipFlop
(for Microsemi ProAsic3)
```

```
DFN1C1 FlipFlop (
	.D (Input_D),
	.CLK (Clk),
	.CLR (Rst),
	.Q (Output_Q));
```

#### Inferred FlipFlop (Verilog)

```
always @ (posedge Clk or posedge Rst)
begin
    if (Rst)
        Output_Q <= 0;
    else
        Output_Q <= Input_D;
    end</pre>
```

- What does it do?
  - Translates the schematic or HDL code into elementary logic functions
  - Defines the connection of these elementary functions
  - Uses Boolean Algebra and Karnaugh maps to optimize logic functions
- The FPGA design tool optimizes the design during synthesis

It may do undesired changes to the system (e.g. remove modules, change signal names, etc.)!!!

- Always check the synthesis report
  - Warnings & Errors
  - Estimated resource utilization
  - Optimizations
  - And more...
  - And also check the RTL/Technology viewers



Example of Synthesis Report

# FPGA gateware (firmware) design work flow <u>Constraints: Timing</u>

- For a reliable system, the timing requirements for all paths must be provided to the FPGA design tool.
- Provided through constraint files (e.g. Xilinx .XDC, etc..) or GUI (that creates/writes constraint files).
- The most common types of path categories include:
  - Input paths
  - Output paths
  - Register-to-register paths (combinatorial paths)
  - Path specific exceptions (e.g. false path, multi-cycle paths, etc.)
- To efficiently specify these constraints:
  - 1) Begin with global constraints (in many cases with this is enough)
  - 2) Add path specific exceptions as needed
- Over constraining will difficult the routing

#### Example of timing constraint (Xilinx .ucf)



TIMEGRP DATA\_IN OFFSET = IN 1 VALID 3 BEFORE CLK RISING;

# FPGA gateware (firmware) design work flow <u>Constraints: Physical</u>

• Pin planning



#### As previously mentioned... You should do Pin Planning during Specification Stage

#### • Floorplanning

- Try to place logic close to their related I/O pins
- Try to avoid routing across the chip
- Place the Hard IP cores, the related logic will follow
- You can separate the logic by areas (e.g. Xilinx Pblocks)

#### Floorplanning may improve routing times and allow faster system speeds... but too much will difficult the routing!!!



# FPGA gateware (firmware) design work flow Implementation

- The FPGA design tool:
  - 1) Translates the Timing and Physical constraints in order to guide the implementation
  - 2) Maps the synthesized netlist:
    - $\circ$  Logic elements to FPGA logic cells
    - $\circ~$  Hard IP Cores to FPGA hard blocks
    - Verifies that the design can fit the target device
  - 3) Places and Routes (P&R) the mapped netlist:
    - Physical placement of the FPGA logic cells
    - $\circ~$  Physical placement of the FPGA hard blocks
    - $\circ$   $\,$  Routing of the signals through the interconnect network & clock tree  $\,$
- The FPGA design tool may be set for different optimizations (Speed, Area, Power or default)
- Physical Placement & Timing change after re-implementing (use constraints to minimize these changes)
- You should always check the different reports generated during implementation



### **Static Timing Analysis**

- The FPGA design tool analyses the signals propagation delays and clock relationships after P&R
- A timing report is generated, including the paths that did not meet the timing requirements
- Rule of thumb for timing violations:
  - Setup violations: Too long combinatorial paths
  - Hold violations: Issue with CDC and/or Path specific exceptions
- The timing closure flow:

#### **Static Timing Analysis**

- The FPGA design tool analyses the signals propagation delays and clock relationships after P&R
- A timing report is generated, including the paths that did not meet the timing requirements



## **FPGA gateware (firmware) design work flow** Bitstream Generation & FPGA Programming

- Bitstream:
  - Binary file containing the FPGA configuration data
  - Each FPGA vendor has its own bitstream file extension (e.g. .bit (Xilinx), .sof (Altera) )

#### • FPGA programming:

- Bitstream is loaded into the FPGA through JTAG
- Configuration data may be stored in on-board FLASH and loaded by the FPGA at power up
- Multiboot/Safe FPGA configuration

## **FPGA gateware (firmware) design work flow** Bitstream Generation & FPGA Programming

- Bitstream:
  - Binary file containing the FPGA configuration data
  - Each FPGA vendor has its own bitstream file extension (e.g. .bit (Xilinx), .sof (Altera) )
- FPGA programming:
  - Bitstream is loaded into the FPGA through JTAG
  - Configuration data may be stored in on-board FLASH and loaded by the FPGA at power up
  - Multiboot/Safe FPGA configuration



## Simulation

- Event-based simulation to recreate the parallel nature of digital designs
- Verification of HDL modules and/or full systems
- HDL simulators:
  - Most popular: Modelsim
  - Other simulators: Vivado Simulator (Xilinx), Icarus Verilog (Open-source), etc..
- Different levels of simulation
  - Behavioural: simulates only the behaviour of the design
  - Functional: uses realistic functional models for the target technology
     Slow
  - Timing : most accurate. Uses Implemented design after timing analysis Very Slow



#### Example of simulator wave window



Fast

#### In-System Analysers & Virtual I/Os

- Your design is up... and also running?
- Most FPGA vendors provide in-system analyzers & virtual I/Os
- Can be embedded into the design and controlled by JTAG
- Allow monitoring but also control of the FPGA signals
- Minimize interfering with the your system by:

#### Placing extra registers between the monitored signals and the In-System Analyser

- It is useful to spy inside the FPGA... but the issue may come form the rest of the board!!!
- Remember... it is HARDWARE

#### Example of In-System Analyser (Altera SignalTap II)

|                                         | -        | -      |                 |                                                                      | Bus/signal                                                                            | value    |
|-----------------------------------------|----------|--------|-----------------|----------------------------------------------------------------------|---------------------------------------------------------------------------------------|----------|
|                                         |          |        |                 |                                                                      | LATENCY-OPTIMIZED GBT LINK (LOW WHEN STANDARD GBT)                                    | ٩        |
|                                         |          |        |                 |                                                                      | - TX PLL LOCKED                                                                       | •        |
| log: 2006/05/0 click to insert time bar |          |        |                 | - MGT READY                                                          | •                                                                                     |          |
|                                         |          |        |                 | - RX_WORDCLK ALIGNED (ALIGNED TO TX_WORDCLK) (LOW WHEN STANDARD GBT) | •                                                                                     |          |
| Type                                    | Alias I2 | -1     | 0 1 2 3         | 4 5 6 7                                                              | -RX_FRAMECLK ALIGNED (ALIGNED TO TX_FRAMECLK) (LOW WHEN STANDARD GBT)                 | •        |
| 6                                       |          | 0425   |                 | V 1 1                                                                | RX GBT READY                                                                          | •        |
| 9                                       | 1-12     | 013h   | <u>X</u> 012h   | <u> </u>                                                             | ← RX BITSLIP NUMBER                                                                   | 00       |
| 0                                       |          |        |                 |                                                                      | - FPGA_CLKOUT ('0' -> TX_FRAMECLK   '1' -> TX_WORDCLK)                                | 0        |
| 0                                       |          |        |                 |                                                                      | LOOPBACK ('0' -> NORMAL   '2' -> PMA LOOPBACK) (XILINX UG366 PAGE 124)                | 0        |
| -                                       |          |        |                 |                                                                      | - GENERAL RESET                                                                       |          |
| 1                                       |          |        |                 |                                                                      | ENCODING SELECTOR ('0' -> GBT FRAME   '1' -> WIDE-BUS)                                | 0        |
| 6                                       |          | 8CBCh  | X 008Ch X 0009h | X CCBDh X                                                            | ► PATTERN SELECT ('1' -> COUNTER   '2' -> STATIC   others -> NO DATA ERROR DETECTION) | 2        |
| -                                       |          | 000011 |                 |                                                                      | RESET RX GBT READY LOST FLAG                                                          | л        |
| 0                                       | •••      |        |                 |                                                                      | - RX GBT READY LOST FLAG                                                              | <b></b>  |
|                                         |          |        |                 |                                                                      | - RESET DATA ERROR SEEN FLAG                                                          | л        |
| -                                       |          |        |                 |                                                                      | COMMON DATA ERROR SEEN FLAG                                                           | ۲        |
|                                         |          |        |                 |                                                                      | -WIDE-BUS EXTRA DATA ERROR SEEN FLAG                                                  | ٢        |
|                                         |          |        |                 |                                                                      | - ENC8B10B EXTRA DATA ERROR SEEN FLAG                                                 | <b></b>  |
| 0                                       |          |        |                 |                                                                      | TX HEADER SELECTOR ('0' $\rightarrow$ IDLE   '1' $\rightarrow$ DATA)                  | <u> </u> |
|                                         |          |        | 1.<br>          |                                                                      | └─RX HEADER IS DATA FLAG ('0' -> IDLE   '1' -> DATA)                                  | 0        |
|                                         |          |        |                 |                                                                      |                                                                                       |          |
| 1                                       | ŗ.       |        |                 | FFFFh                                                                |                                                                                       | -        |
| 2                                       | ··E      |        | 1<br>4          | TELEN                                                                |                                                                                       | /        |

#### Example of Virtual I/Os (Xilinx VIO)

VIO Console - DEV:0 MyDevice0 (XC7K325T) UNIT:0 MyVIO0 (VIO









#### **Debugging Techniques**

Follow the chain





#### **Debugging Techniques**

Follow the chain





#### **Debugging Techniques**

Follow the chain





#### **Debugging Techniques**

Follow the chain





#### **Debugging Techniques**

Follow the chain





#### **Debugging Techniques**

Follow the chain





#### **Debugging Techniques**

Follow the chain





#### **Debugging Techniques**

Follow the chain

Divide & Conquer





Open the box



#### **Debugging Techniques**

Follow the chain

Divide & Conquer





Open the box



#### We are debugging HARDWARE!!!

## FPGA gateware (firmware) design work flow After debugging...

## FPGA gateware (firmware) design work flow After debugging...



## FPGA gateware (firmware) design work flow After debugging...



## After debugging...

• Documentation



## After debugging...

• Documentation



• Maintenance



## After debugging...

• Documentation

• Maintenance



• ... and maybe User Support



#### Re: GLIB: question on GBT

Sent: Mon 22/07/2013 17:56

To: Manoel Barros Marin

Manoel,

yes, I would love to be included in the GBT-FPGA-users mailing list.

And thanks for the tip about using the GBT-FPGA reference design.

best regards,

Advanced FPGA design ISOTDAQ 2016 @ Rehovot (Israel) 29/01/2016

Outline:

- …from the previous lesson
- Key concepts about HDL
- FPAA gateware (firmware) design work flow

## • Summary



Manoel Barros Marin

#### • FPGA - Wikipedia

A field-programmable gate array (FPGA) is an integrated circuit designed to be configured by a customer or a designer after manufacturing – hence "field-programmable".

#### • Key concepts about FPGA design

- FPGA gateware (firmware) design is NOT programming
- HDL are used for describing HARDWARE
- Timing in FPGA gateware (firmware) design is critical

#### • FPGA gateware (firmware) design flow

- Plan, plan and plan again
- Modular and reusable system
- Coding for synthesis
- Take care of your resets and clocks schemes
- Clock Domain Crossing is tricky
- You must properly constraint your design
- Optimize in your code but also with constraints and FPGA design tool options
- Read the reports (Synthesis, Implementation & Static Timing Analysis)
- Try to be methodic when debugging & use all tools available
- A running system is not the end of the road... (Documentation, Maintenance. User Support)

#### • FPGA - Wikipedia

A field-programmable gate array (FPGA) is an integrated circuit designed to be configured by a customer or a designer after manufacturing – hence "field-programmable".

- Key concepts about FPGA design
  - FPGA gateware (firmware) design is NOT programming
  - HDL are used for describing HARDWARE
  - Timing in FPGA gateware (firmware) design is critical
- FPGA gateware (firmware) design flow
  - Plan, plan and plan again
  - Modular and reusable system
  - Coding for synthesis
  - Take care of your resets and clocks schemes
  - Clock Domain Crossing is tricky
  - You must properly constraint your design
  - Optimize in your code but also with constraints and FPGA design tool options
  - Read the reports (Synthesis, Implementation & Static Timing Analysis)
  - Try to be methodic when debugging & use all tools available
  - A running system is not the end of the road... (Documentation, Maintenance. User Support)



#### • FPGA - Wikipedia

A field-programmable gate array (FPGA) is an integrated circuit designed to be configured by a customer or a designer after manufacturing – hence "field-programmable".

- Key concepts about FPGA design
  - FPGA gateware (firmware) design is NOT programming
  - HDL are used for describing HARDWARE
  - Timing in FPGA gateware (firmware) design is critical
- FPGA gateware (firmware) design flow
  - Plan, plan and plan again
  - Modular and reusable system
  - Coding for synthesis
  - Take care of your resets and clocks schemes
  - Clock Domain Crossing is tricky
  - You must properly constraint your design
  - Optimize in your code but also with constraints and FPGA design tool options
  - Read the reports (Synthesis, Implementation & Static Timing Analysis)
  - Try to be methodic when debugging & use all tools available
  - A running system is not the end of the road... (Documentation, Maintenance. User Support)



...for Geeks



#### • FPGA - Wikipedia

A field-programmable gate array (FPGA) is an integrated circuit designed to be configured by a customer or a designer after manufacturing – hence "field-programmable".

- Key concepts about FPGA design
  - FPGA gateware (firmware) design is NOT programming
  - HDL are used for describing HARDWARE
  - Timing in FPGA gateware (firmware) design is critical
- FPGA gateware (firmware) design flow
  - Plan, plan and plan again
  - Modular and reusable system
  - Coding for synthesis
  - Take care of your resets and clocks schemes
  - Clock Domain Crossing is tricky
  - You must properly constraint your design
  - Optimize in your code but also with constraints and FPGA design tool options
  - Read the reports (Synthesis, Implementation & Static Timing Analysis)
  - Try to be methodic when debugging & use all tools available
  - A running system is not the end of the road... (Documentation, Maintenance. User Support)



Where do I find more info about this?? There are nice papers & books but... FPGA vendors provide very good documentation about all topics mentioned in this lecture



## Acknowledges

- Markus Joos (CERN) & organisers of ISOTDAQ-16
- Andrea Borga (NIKHEF), Torsten Alt (FIAS) for their contribution to this lecture
- Rhodri Jones, Thibaut Lefevre, Andrea Boccardi & other colleagues from CERN BE-BI-QP

# Any Questions?

