ISOTDAQ 2017 NikheF (Amsterdam) 31/01/2017



**Manoel Barros Marin** 


#### Outline:

- ... from the previous lesson
- Key concepts about FPGA design
- FPGA gateware design work flow
- Summary






#### What is a Field Programmable Gate Array (FPGA)?

#### FPGA - Wikipedia

https://en.wikipedia.org/wiki/Field-programmable\_gate\_array

A field-programmable gate array (FPGA) is an integrated circuit designed to be configured by a customer or a designer after manufacturing – hence "field-programmable".




#### FPGA fabric (matrix-like structure) made of:

- I/O-cells to communicate with outside world
- Logic cells
  - Look-Up-Table (LUT) to implement combinatorial logic
  - Flip-Flops (D) to implement sequential logic
- Interconnect network between logic resources
- Clock tree to distribute the clock signals





#### But it also features Hard Blocks:






## Key concepts about FPGA design


#### FPGA gateware design is NOT programming




#### Programming

- Code is written and translated into instructions.
- Instructions are executed sequentially by the CPU(s)
- Parallelism is achieved by running instructions on multiple threads/cores
- Processing structures and instruction sets are fixed by the architecture of the system

VS.

#### FPGA gateware design

- No fixed architecture, the system is built according to the task
- Building is done by describing/defining system elements and their relations
- Intrinsically parallel, sequential behaviour is achieved by registers and Finite-State-Machines (FSMs)
- Description done by schematics or a hardware description language (HDL)
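To make the "intrinsically parallel" point concrete, here is a minimal VHDL sketch. The entity and port names (`parallel_gates`, `a`, `b`, `c`, `y_and`, `y_xor`) are hypothetical and not taken from the lecture.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity: names are illustrative only.
entity parallel_gates is
  port (
    a, b, c : in  std_logic;
    y_and   : out std_logic;
    y_xor   : out std_logic
  );
end entity parallel_gates;

architecture rtl of parallel_gates is
begin
  -- Each concurrent statement describes its own piece of hardware.
  -- Both gates operate at the same time; swapping the two lines
  -- describes exactly the same circuit.
  y_and <= a and b;
  y_xor <= b xor c;
end architecture rtl;
```

Unlike two consecutive lines of C, neither statement "runs first": the textual order carries no meaning here.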

## HDLs are used for describing <u>HARDWARE</u>

- Example of a WAIT statement (programming language vs. HDL)
  - In a programming language (e.g. C on Unix, with `#include <unistd.h>`):

```
sleep(5); // sleep 5 seconds
```


  - In an HDL (e.g. VHDL):
    - Not synthesizable (only for simulation test benches):

```
wait for 5 sec; -- handy for TB clocks
```
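As a hedged illustration of why `wait for` is "handy for TB clocks", the sketch below generates a free-running clock in a test bench. The entity name `tb_clock`, the constant `c_ClkPeriod` and the 10 ns period are assumptions made for this example only.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Test bench entity: no ports, simulation only (not synthesizable).
entity tb_clock is
end entity tb_clock;

architecture sim of tb_clock is
  constant c_ClkPeriod : time := 10 ns;  -- assumed period (100 MHz)
  signal   clk         : std_logic := '0';
begin
  -- A process without sensitivity list restarts after its last
  -- statement, so this toggles clk until the simulator is stopped.
  clk_gen : process
  begin
    clk <= '0';
    wait for c_ClkPeriod / 2;
    clk <= '1';
    wait for c_ClkPeriod / 2;
  end process clk_gen;
end architecture sim;
```

The simulator only advances its own notion of time here; no hardware is described, which is why a synthesis tool cannot map this process to FPGA resources.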



    - Synthesizable (for simulation and/or FPGA implementation):

```
simple_delay_counter : process (delay_rst, delay_clk, delay_ena)
begin -- process
  if delay_rst = '1' then
    -- asynchronous reset: preload the counter and clear the flag
    s_count      <= delay_ld_value;
    s_delay_done <= '0';
  elsif rising_edge(delay_clk) then
    if delay_ena = '1' then
      if delay_ld = '1' then
        s_count <= delay_ld_value;  -- reload the start value
      else
        s_count <= s_count - 1;     -- count down
      end if;
    end if;
    if s_count = 0 then
      s_delay_done <= '1';          -- delay elapsed
    else
      s_delay_done <= '0';
    end if;
  end if;
end process;
```










#### Register Transfer Level (RTL)

http://en.wikipedia.org/wiki/Register-transfer\_level

A design abstraction which models a synchronous digital circuit in terms of the flow of digital signals (data) between registers and logical operations performed on those signals

Figure (HDL to RTL): elaborated (RTL) schematic of Delay.vhd (15 cells, 13 I/O ports, 64 nets). The inputs delay_ld, delay_rst, delay_clk, delay_ena and delay_ld_value[7:0] feed the counter control logic (multiplexers, a decrementer and an inverter), the counter flip-flops s_count_reg[7:0] and the registered output delay_done.






### Timing in FPGA gateware design is critical

- Data propagates through the FPGA in the form of electrical signals
- If these signals do not arrive at their destination on time...

When designing FPGA gateware you have to think hard...ware








**Project Specification** 

This is the most critical step...

The rest of the design process is based on it!!!



- Gather requirements from the users
- Specify:
  - Target application (General purpose or Specific)
    - Figure: Example of General Purpose Gateware
    - Figure: Example of Application Specific Gateware
  - Main features (e.g. System bus, SoC, Multi-gigabit transceivers, etc.)



  - FPGA vendor (e.g. Xilinx, Intel (Altera), Microsemi (Actel), Lattice, etc.)
    - Small FPGA vendors may target specific markets (e.g. Microsemi offers highly reliable FPGAs, etc.)
    - Figure: FPGA Market Share by 2010, in millions of USD: Xilinx 2369.45 (49%), Altera (now part of Intel) 1954.43 (40%), Lattice Semi 297.77 (6%), Microsemi 207.49 (4%), QuickLogic 26.2 (1%)

  - Electronic board (Custom or COTS (\*))
    - Figure: Example of Custom Board





  - Development tools (FPGA vendor or Commercial)
    - Figure: Example of FPGA Vendor Tools
    - Figure: Example of Commercial Tools



  - Optimization (Speed, Area, Power or default)





  - Design language (Schematics or HDL (e.g. VHDL, etc.))
    - HDLs are the most popular for RTL design, but schematics may be better in some cases (e.g. SoC bus interconnect, etc.)
    - Figure: Examples of Design Languages



  - Coding convention
    - Your code should be readable

Example of Coding Convention:

| description     | convention | example  |
|-----------------|------------|----------|
| variable        | prefix v   | v_Buffer |
| alias           | prefix a   | a_Bit5   |
| constant        | prefix c   | c_Length |
| type definition | prefix t   | t_MyType |
| generics        | prefix g   | g_Width  |
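A short sketch of how such a convention might look in practice. The entity `naming_demo`, its ports and the `s_` prefix for signals are assumptions made for this illustration; only the `v_`, `a_`, `c_`, `t_` and `g_` prefixes come from the table.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity used only to show the prefixes in context.
entity naming_demo is
  generic (
    g_Width : positive := 8                                     -- generic: prefix g
  );
  port (
    Clk        : in  std_logic;
    Input_D    : in  std_logic_vector(g_Width-1 downto 0);
    Output_Q   : out std_logic_vector(g_Width-1 downto 0);
    Output_Lsb : out std_logic
  );
end entity naming_demo;

architecture rtl of naming_demo is
  constant c_Length : positive := g_Width;                      -- constant: prefix c
  subtype  t_MyType is std_logic_vector(c_Length-1 downto 0);   -- type definition: prefix t
  signal   s_Data   : t_MyType;                                 -- signal: prefix s (assumption)
  alias    a_Bit0   : std_logic is s_Data(0);                   -- alias: prefix a
begin
  process (Clk)
    variable v_Buffer : t_MyType;                               -- variable: prefix v
  begin
    if rising_edge(Clk) then
      v_Buffer := Input_D;   -- variables update immediately
      s_Data   <= v_Buffer;  -- signals update at the end of the process
    end if;
  end process;

  Output_Q   <= s_Data;
  Output_Lsb <= a_Bit0;
end architecture rtl;
```

With the prefixes in place, a reader can tell at a glance whether an assignment targets a variable or a signal, which matters because the two update at different moments.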

  - Software interface (GUI, Scripts or both)
    - Figure: Example of GUIs
    - Figure: Example of TCL script



  - Use of a file repository (SVN, Git, etc. or none)





- Block diagram of the system
  - Include the FPGA logic...
  - ... but also the on-board devices and related devices
  - May combine different abstraction levels
  - Figure: Example of system block diagram



- Pin planning
  - Pin assignments are one type of Location Constraints
  - Critical for Custom Boards!!!





## **Design Entry**



### Design Entry: Modularity & Reusability

### Your system should be Modular

- Design at RTL level (think hard...ware)
- Well-defined clock and reset schemes
- Separate Data & Control paths
- Multiple instantiations

#### Good example of Modular System

Figure: block diagram with an 8-bit counter, a 256x16-bit RAM and a pattern generator, connected through clock, reset, increment, 8-bit address, 16-bit data, write enable and data valid flag signals.

#### Your code should be Reusable

- Add primitives (and modules) to the system by inference when possible
- Use parameters in your code (e.g. generics in VHDL, parameters in Verilog, etc.)
- Centralise parameters in external files (e.g. packages in VHDL, headers in Verilog, etc.)
- Use configurable module interfaces when possible (e.g. parametrised vectors, records in VHDL, etc.)
- Use standard features (e.g. I2C, Wishbone, etc.)
- Use standard IP Cores (e.g. from www.OpenCores.org, etc.)
- Avoid vendor specific IP Cores when possible
- Talk with your colleagues and see what other FPGA designers are doing
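The first three reusability points can be sketched together: a counter inferred from plain VHDL, parametrised with a generic whose default value is centralised in a package. All names (`project_pkg`, `c_CounterWidth`, `generic_counter`, `g_Width`) are hypothetical and chosen for this example.

```vhdl
-- Package centralising project-wide parameters.
package project_pkg is
  constant c_CounterWidth : positive := 8;
end package project_pkg;

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use work.project_pkg.all;

entity generic_counter is
  generic (
    g_Width : positive := c_CounterWidth  -- default taken from the package
  );
  port (
    Clk   : in  std_logic;
    Rst   : in  std_logic;                -- synchronous, active high
    Ena   : in  std_logic;
    Count : out std_logic_vector(g_Width-1 downto 0)
  );
end entity generic_counter;

architecture rtl of generic_counter is
  signal s_Count : unsigned(g_Width-1 downto 0);
begin
  process (Clk)
  begin
    if rising_edge(Clk) then
      if Rst = '1' then
        s_Count <= (others => '0');
      elsif Ena = '1' then
        s_Count <= s_Count + 1;           -- adder and registers are inferred
      end if;
    end if;
  end process;

  Count <= std_logic_vector(s_Count);
end architecture rtl;
```

Because no vendor primitive is instantiated, the same file should be portable across tools, and a different width only requires a generic map at instantiation or a change in the package.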

### **Design Entry: Coding for Synthesis**

Synthesizable code is intended for FPGA implementation

- Use non-synthesizable HDL statements only in simulation test benches

"A fundamental guiding principle when coding for synthesis is to minimize, if not eliminate, all structures and directives that could potentially create a mismatch between simulation and synthesis."

From the book "Advanced FPGA Design" by Steve Kilts (Copyright © 2007 John Wiley & Sons, Inc.)

- The RTL synthesis tool is expecting a <u>synchronous design</u>...

But what is a synchronous design?

A synchronous design is one composed of combinatorial logic (e.g. logic gates, multiplexers, etc.) and sequential logic (registers that are triggered on the edge of a single clock).






### **Design Entry: Coding for Synthesis**

- Combinatorial logic coding rules
  - Sensitivity list must include ALL input signals
    - Not respecting this may lead to non-responsive outputs under changes of input signals
  - ALL output signals must be assigned under ALL possible input conditions
    - Not respecting this may lead to undesired latches (asynchronous storage elements)
  - No feedback from output to input signals
    - Not respecting this may lead to unknown output states (metastability) & undesired latches

### Good combinatorial coding for synthesis





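```
-- Output_nand and Output_nor are read inside the process,
-- so they are listed in the sensitivity list as well
process(Input_A,Input_B,Input_C,Output_nand,Output_nor)
begin
    Output_nand <= Input_A nand Input_B;
    Output_nor <= Input_A nor Input_B;
    Output_Q <= Output_nand and Input_C and Output_nor;
end process;
```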

#### Bad combinatorial coding for synthesis
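As a hedged illustration of the second rule (this is not the slide's own example), the sketch below contrasts an incomplete assignment, which infers a latch, with a complete one. The entity `comb_demo` and its port names are assumptions.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity contrasting a bad and a good combinatorial process.
entity comb_demo is
  port (
    Sel, Input_A : in  std_logic;
    Output_Bad   : out std_logic;
    Output_Good  : out std_logic
  );
end entity comb_demo;

architecture rtl of comb_demo is
begin
  -- BAD: Output_Bad is not assigned when Sel = '0', so it has to
  -- keep its previous value and the tool infers a latch.
  bad_comb : process (Sel, Input_A)
  begin
    if Sel = '1' then
      Output_Bad <= Input_A;
    end if;
  end process bad_comb;

  -- GOOD: the default assignment covers every input condition,
  -- so the result is pure combinatorial logic.
  good_comb : process (Sel, Input_A)
  begin
    Output_Good <= '0';
    if Sel = '1' then
      Output_Good <= Input_A;
    end if;
  end process good_comb;
end architecture rtl;
```

A default assignment at the top of the process is one common way to satisfy the rule; a complete `if`/`else` or `case` with `when others` achieves the same.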



### **Design Entry: Coding for Synthesis**

- Sequential logic coding rules
  - Only clock signal (and asynchronous set/reset signals when used) in sensitivity list
    - Not respecting this may produce undesired combinatorial logic
  - All registers of the sequence must be triggered by the same clock edge (either Rising or Falling)
    - Not respecting this may lead to metastability at the output of the registers
  - Include all registers of the sequence in the same reset branch
    - Not respecting this may lead to undesired register values after reset

#### Good sequential coding for synthesis

```
process(Clk,Rst)
begin
    if (Rst = '1') then
        Output_Out <= '0';
        Output_Q <= '0';
    elsif rising_edge(Clk) then
        Output_Out <= Output_Q;
        Output_Q <= Input_In;
    end if;
end process;
```



### Bad sequential coding for synthesis

```
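process(Clk,Rst,Input_In)      -- BAD: Input_In does not belong in the sensitivity list
begin
    if (Rst = '1') then
        Output_Out <= '0';     -- BAD: Output_Q is missing from the reset branch
    elsif rising_edge(Clk) then
        Output_Out <= Output_Q;
        Output_Q <= Input_In;
    end if;
end process;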
```

### **Design Entry: Coding for Synthesis**

- Synchronous design coding rules:
  - FULLY synchronous design
    - No combinatorial feedback
    - No asynchronous latches
    - Not respecting this may lead to incorrect analysis from the FPGA design tool
  - Register ALL output signals (input signals also recommended)
    - Not respecting this may lead to uncontrolled length of combinatorial paths
  - Proper design of the reset scheme (mentioned later)
    - Not respecting this may lead to undesired register values after reset
  - Proper design of the clocking scheme (mentioned later)
    - Not respecting this may lead to metastability at the output of the registers & misuse of resources
  - Proper handling of Clock Domain Crossings (CDC) (mentioned later)
    - Not respecting this may lead to metastability at the output of the registers



### **Design Entry: Coding for Synthesis**

- Finite State Machines (FSMs):
  - Digital logic circuit with a finite number of internal states
  - Widely used for system control
  - Two variants of FSM
    - Moore: Outputs depend only on the current state of the FSM
    - Mealy: Outputs depend on the current state of the FSM as well as the current values of the inputs
  - Modelled by State Transition Diagrams
    - Figure: state transition diagram with transitions labelled "Button Pressed", "Button Not Pressed" and "Always"
  - Many different FSM coding styles (but not all of them are good!!)
  - FSM coding considerations:
    - Synchronize inputs & outputs
    - Outputs may be assigned during states or state transitions
    - Be careful with unreachable/illegal states
    - You can add counters to FSMs
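One possible coding style, sketched under assumptions: a single clocked process with an enumerated state type, a synchronous reset and a registered output. The entity `button_fsm`, the state names and the `Led` output are hypothetical, loosely inspired by the button labels of the state diagram; this is not the lecture's own code.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity button_fsm is
  port (
    Clk    : in  std_logic;
    Rst    : in  std_logic;  -- synchronous, active high
    Button : in  std_logic;  -- assumed already synchronized to Clk
    Led    : out std_logic   -- registered output
  );
end entity button_fsm;

architecture rtl of button_fsm is
  type t_State is (e_Idle, e_Pressed, e_Released);
  signal s_State : t_State;
begin
  fsm : process (Clk)
  begin
    if rising_edge(Clk) then
      if Rst = '1' then
        s_State <= e_Idle;
        Led     <= '0';
      else
        case s_State is
          when e_Idle =>
            Led <= '0';
            if Button = '1' then        -- button pressed
              s_State <= e_Pressed;
            end if;
          when e_Pressed =>
            Led <= '1';
            if Button = '0' then        -- button not pressed
              s_State <= e_Released;
            end if;
          when e_Released =>
            Led     <= '0';
            s_State <= e_Idle;          -- unconditional ("always") transition
          when others =>
            -- recovery branch for any unexpected state encoding
            Led     <= '0';
            s_State <= e_Idle;
        end case;
      end if;
    end if;
  end process fsm;
end architecture rtl;
```

Here `Led` is assigned from the current state only, so it follows the Moore pattern, delayed by one clock cycle because it is registered. Note that synthesis tools may optimize away the `when others` branch; whether illegal states are really recovered depends on tool settings.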

### Design Entry: Reset Scheme

A bad reset scheme may drive you crazy!!!

- Used to initialize the output of the registers to a known state
- It has a direct impact on:
  - Performance
  - Logic utilization
  - Reliability
- Different approaches:
  - Asynchronous
    - **Pros:** No free-running clock required, easier timing closure
    - **Cons:** Skew, glitches, simulation mismatch, difficult to debug, extra constraints, etc.
  - Synchronous
    - **Pros:** No skew, no glitches, no simulation mismatch, easier to debug, no extra constraints, etc.
    - **Cons:** Free-running clock required, more difficult timing closure
  - No reset scheme
    - **Pros:** Easier routing, fewer resources, easiest timing closure
    - **Cons:** Only reset at power-up (in some devices not even that...). In fact, reset is not always needed
  - Hybrid: usually in big designs (avoid when possible!!!)


My advice is: you should use SYNCHRONOUS RESET by default
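For reference, a minimal sketch of a register with synchronous reset. The entity name `sync_reset_reg` is hypothetical; the port names follow the sequential coding example earlier in the lecture.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity sync_reset_reg is
  port (
    Clk      : in  std_logic;
    Rst      : in  std_logic;  -- synchronous, active high
    Input_In : in  std_logic;
    Output_Q : out std_logic
  );
end entity sync_reset_reg;

architecture rtl of sync_reset_reg is
begin
  -- Only the clock is in the sensitivity list: the reset is
  -- sampled on the rising edge like any other data input.
  process (Clk)
  begin
    if rising_edge(Clk) then
      if Rst = '1' then
        Output_Q <= '0';
      else
        Output_Q <= Input_In;
      end if;
    end if;
  end process;
end architecture rtl;
```

Compared with the asynchronous version, the reset test moves inside the `rising_edge` branch, so the reset path is covered by ordinary timing analysis.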

### Design Entry: Clocks Scheme

### Clocking resources are very precious!!!

- Clock regions
- Clock trees (Global & Local)
- Other FPGA clocking resources
  - Clock capable pins
  - Clock buffers
  - Clock multiplexers
  - PLLs & DCMs



Bad practices when designing your clocking scheme



### Design Entry: <u>Timing</u>

Figure: No Stable Data (Metastable Area)

- Clock Domain Crossing (CDC): The problem...
  - CDC means passing a signal from one clock domain to another (A to B)
  - If the clocks are unrelated to each other (asynchronous), timing analysis is not possible
  - Setup and hold times of flip-flop B are likely to be violated -> Metastability!!!



- Clock Domain Crossing (CDC): <u>The workaround...</u>
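A widely used workaround for single-bit signals is the two-flip-flop synchronizer. The sketch below assumes that technique; the entity `sync_2ff` and its signal names are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity sync_2ff is
  port (
    Clk_B    : in  std_logic;  -- destination clock (domain B)
    Async_In : in  std_logic;  -- single-bit signal coming from domain A
    Sync_Out : out std_logic   -- safe to use in domain B
  );
end entity sync_2ff;

architecture rtl of sync_2ff is
  signal s_Meta : std_logic := '0';  -- first stage, may go metastable
  signal s_Sync : std_logic := '0';  -- second stage, gives s_Meta a full
                                     -- clock period to settle
begin
  process (Clk_B)
  begin
    if rising_edge(Clk_B) then
      s_Meta <= Async_In;
      s_Sync <= s_Meta;
    end if;
  end process;

  Sync_Out <= s_Sync;
end architecture rtl;
```

This reduces the probability of metastability reaching the downstream logic but does not eliminate it, and it is only suitable for single-bit signals. Multi-bit buses usually need a handshake or an asynchronous FIFO instead.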



## Design Entry: Primitives & IP Cores

- Primitives: Basic components of the FPGA
  - Vendor (and device) specific
  - Examples: Buffers (I/O & Clock), Registers, BRAMs, DSP blocks, Logic Gates (programmed LUTs)
- Hard IP Cores: Complex hardware blocks embedded into the FPGA
  - Vendor (and device) specific
  - Fixed I/O location
  - In many cases they may be set through GUI (Wizards)
  - Examples: PLLs, Multi-gigabit Transceivers, Ethernet MAC, Microprocessors, etc.
- Soft IP Cores: Complex (or simple) modules ready to be implemented
  - They may be vendor specific or agnostic:
    - Vendor Specific: Encrypted Code or Requires Hard IP Core
    - Vendor Agnostic: Commercial or Open Source (www.OpenCores.org)
  - In many cases they may be set through GUI (Wizards)
  - Examples: All kinds of modules
- Two ways of adding Primitives & IP Cores to your system:
  - <u>Instantiation:</u> The module is EXPLICITLY added to the system
  - <u>Inference:</u> The module is IMPLICITLY added to the system

### Instantiated FlipFlop (for Microsemi ProASIC3)

```
DFN1C1 FlipFlop (
   .D (Input_D),
   .CLK (Clk),
   .CLR (Rst),
   .Q (Output_Q));
```

### Inferred FlipFlop (Verilog)

```
always @ (posedge Clk or posedge Rst)
begin
    if (Rst)
        Output_Q <= 0;
    else
        Output_Q <= Input_D;
end
```

- Add Primitives by Inference
- Add IP Cores by Instantiation (and use the Wizard if possible)

### **Synthesis**

- What does it do?
  - Translates the schematic or HDL code into elementary logic functions
  - Defines the connection of these elementary functions
  - Uses Boolean Algebra and Karnaugh maps to optimize logic functions
- The FPGA design tool optimizes the design during synthesis

It may make undesired changes to the system (e.g. remove modules, change signal names, etc.)!!!

- Always check the synthesis report
  - Warnings & Errors
  - Estimated resource utilization
  - Optimizations
  - And more...

And also check the RTL/Technology viewers

Example of Synthesis Report
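To illustrate the kind of optimization performed during synthesis, here is a small sketch with hypothetical module and signal names. It contains a logic function that Boolean algebra reduces to a single input, and a register that drives nothing. A synthesis tool would typically simplify the first and remove the second, usually mentioning it in the synthesis report; the exact behaviour and messages are tool dependent.

```verilog
// Hypothetical module: 'Unused_Q' drives no output, so it has no effect
// on the observable behaviour of the design.
module trim_example (
    input  wire Clk,
    input  wire Input_A,
    input  wire Input_B,
    output reg  Output_Q
);

    reg Unused_Q;

    always @ (posedge Clk)
    begin
        // Redundant logic: (A & B) | (A & ~B) reduces to A
        Output_Q <= (Input_A & Input_B) | (Input_A & ~Input_B);

        // Register without any load: a candidate for removal
        Unused_Q <= Input_A ^ Input_B;
    end

endmodule
```

If a signal like `Unused_Q` was meant to be observed later (e.g. for debugging), its disappearance is exactly the sort of undesired change that the synthesis report and the RTL/Technology viewers help to spot.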



### **Constraints:** Timing

- For a reliable system, the timing requirements for all paths must be provided to the FPGA design tool.
- Provided through constraint files (e.g. Xilinx .XDC) or a GUI (that creates/writes the constraint files).
- The most common types of path categories include:
  - Input paths
  - Output paths
  - Register-to-register paths (combinatorial paths)
  - Path specific exceptions (e.g. false path, multi-cycle paths, etc.)
- To efficiently specify these constraints:
  - 1) Begin with global constraints (in many cases this is enough)
  - 2) Add path specific exceptions as needed
- Over-constraining makes the routing more difficult

Example of timing constraint (Xilinx .ucf)



TIMEGRP DATA_IN OFFSET = IN 1 VALID 3 BEFORE CLK RISING;

### **Constraints:** Physical

### Pin planning



As previously mentioned... You should do Pin Planning during Specification Stage

### Floorplanning

- Try to place logic close to its related I/O pins
- Try to avoid routing across the chip
- Place the Hard IP cores, the related logic will follow
- You can separate the logic by areas (e.g. Xilinx Pblocks)

Floorplanning may improve routing times and allow faster system speeds... but too much will make the routing more difficult!!!



### **Implementation**

- The FPGA design tool:
  - 1) Translates the Timing and Physical constraints in order to guide the implementation
  - 2) Maps the synthesized netlist:
    - Logic elements to FPGA logic cells
    - Hard IP Cores to FPGA hard blocks
    - Verifies that the design can fit the target device
  - 3) Places and Routes (P&R) the mapped netlist:
    - Physical placement of the FPGA logic cells
    - Physical placement of the FPGA hard blocks
    - Routing of the signals through the interconnect network & clock tree
- The FPGA design tool may be set for different optimizations (Speed, Area, Power or default)
- Physical Placement & Timing change after re-implementing (use constraints to minimize these changes)
- You should always check the different reports generated during implementation



## **Static Timing Analysis**

- The FPGA design tool analyses the signal propagation delays and clock relationships after P&R
- A timing report is generated, including the paths that did not meet the timing requirements
- Rule of thumb for timing violations:
  - Setup violations: Too long combinatorial paths
  - Hold violations: Issue with CDC and/or Path specific exceptions
- The timing closure flow: iterate through Re-implementation until the timing requirements are met
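A common remedy for setup violations caused by long combinatorial paths is to split the path with an extra register (pipelining). The sketch below is illustrative only; the module name, signal names and widths are assumptions. A multiply followed by an add is spread over two clock cycles, so each register-to-register path contains only one arithmetic operation, at the cost of one additional cycle of latency.

```verilog
// Hypothetical example: names and widths are illustrative only.
module pipelined_mac (
    input  wire        Clk,
    input  wire [15:0] A,
    input  wire [15:0] B,
    input  wire [31:0] C,
    output reg  [32:0] Result
);

    reg [31:0] Product;   // pipeline register after the multiplier
    reg [31:0] C_d;       // C delayed by one cycle to stay aligned with Product

    always @ (posedge Clk)
    begin
        // Stage 1: multiply only
        Product <= A * B;
        C_d     <= C;

        // Stage 2: add only (uses the values registered in stage 1)
        Result  <= Product + C_d;
    end

endmodule
```

Whether this is acceptable depends on the system: the surrounding logic must tolerate the extra latency, which is one more reason to plan the design before coding.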


## Bitstream Generation & FPGA Programming

#### Bitstream:

- Binary file containing the FPGA configuration data
- Each FPGA vendor has its own bitstream file extension (e.g. .bit (Xilinx), .sof (Altera))

#### FPGA programming:

- Bitstream is loaded into the FPGA through JTAG
- Configuration data may be stored in on-board FLASH and loaded by the FPGA at power up
- Remote programming (e.g. through Ethernet)
- Multiboot/Safe FPGA configuration



Multiboot/Safe FPGA configuration diagrams



## Simulation

- Event-based simulation to recreate the parallel nature of digital designs
- Verification of HDL modules and/or full systems
- HDL simulators:
  - Most popular: ModelSim
  - Other simulators: Vivado Simulator (Xilinx), Icarus Verilog (Open-source), etc.
- Different levels of simulation
  - Behavioural: simulates only the behaviour of the design
  - Functional: uses realistic functional models for the target technology
  - Timing: most accurate, but very slow. Uses the implemented design after timing analysis
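As an illustration of behavioural simulation, the following is a minimal self-checking testbench sketch for the inferred flip-flop shown in the Design Entry section. The testbench name, the 10 ns clock period and the stimulus timing are assumptions made for this example; it should run on any Verilog-2001 simulator (e.g. Icarus Verilog).

```verilog
`timescale 1ns / 1ps

// Hypothetical self-checking testbench for the inferred flip-flop.
module flipflop_tb;

    reg     Clk = 1'b0;
    reg     Rst;
    reg     Input_D;
    reg     Output_Q;
    integer Errors;

    // Device under test: the inferred flip-flop (asynchronous reset)
    always @ (posedge Clk or posedge Rst)
    begin
        if (Rst)
            Output_Q <= 0;
        else
            Output_Q <= Input_D;
    end

    // Free-running clock, 10 ns period (assumed for illustration)
    always #5 Clk = ~Clk;

    initial
    begin
        Errors  = 0;
        Rst     = 1'b0;
        Input_D = 1'b0;

        // Assert the reset between clock edges: the output must clear
        // without waiting for a rising edge of Clk
        #2 Rst = 1'b1;
        #1;
        if (Output_Q !== 1'b0) begin
            $display("FAIL: reset did not clear Output_Q");
            Errors = Errors + 1;
        end

        // Release the reset and drive the input away from the clock edge
        #10 Rst = 1'b0;
        Input_D = 1'b1;

        // Wait for the next rising edge plus a small delay, then check
        @ (posedge Clk);
        #1;
        if (Output_Q !== 1'b1) begin
            $display("FAIL: Output_Q did not follow Input_D");
            Errors = Errors + 1;
        end

        if (Errors == 0)
            $display("PASS");
        $finish;
    end

endmodule
```

The checks use `!==` so that an unknown value (`x`) on `Output_Q` is reported as a failure instead of being silently ignored.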



#### Example of simulator wave window



## In-System Analysers & Virtual I/Os

- Your design is up... and also running?
- Most FPGA vendors provide in-system analyzers & virtual I/Os
- Can be embedded into the design and controlled by JTAG
- Allow monitoring but also control of the FPGA signals
- Minimize interference with your system by placing extra registers between the monitored signals and the In-System Analyser
- It is useful to spy inside the FPGA... but the issue may come from the rest of the board!!!
- Remember... it is HARDWARE

Example of Virtual I/Os (Xilinx VIO)
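The extra registers between the monitored signals and the In-System Analyser can be sketched as below. The module name, port names and probe width are hypothetical, and the analyser core itself (vendor specific) is not shown; the `Probe` output would be connected to its inputs.

```verilog
// Hypothetical wrapper: signal names and probe width are illustrative only.
module debug_probe_regs #(
    parameter WIDTH = 8
) (
    input  wire             Clk,
    input  wire [WIDTH-1:0] Monitored,   // signals of the user design
    output wire [WIDTH-1:0] Probe        // to the in-system analyser inputs
);

    reg [WIDTH-1:0] Probe_d1;
    reg [WIDTH-1:0] Probe_d2;

    // Two register stages clocked by the same clock as the monitored logic
    always @ (posedge Clk)
    begin
        Probe_d1 <= Monitored;
        Probe_d2 <= Probe_d1;
    end

    assign Probe = Probe_d2;

endmodule
```

The registers give the FPGA design tool freedom to place the analyser away from the user logic without lengthening the paths of the design itself. The captured waveforms are delayed by two clock cycles, which is harmless as long as all probes go through the same number of stages.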



## **Debugging Techniques**


#### Divide & Conquer

#### Follow the chain

#### Open the box

We are debugging HARDWARE!!!

## After debugging...

Documentation

Maintenance



... and maybe User Support





# Advanced FPGA design

## Summary



#### FPGA - Wikipedia

A field-programmable gate array (FPGA) is an integrated circuit designed to be configured by a customer or a designer after manufacturing – hence "field-programmable".

...for Geeks

#### Key concepts about FPGA design

- FPGA gateware design is NOT programming
- HDLs are used for describing HARDWARE
- Timing in FPGA gateware design is critical

#### FPGA gateware design flow

- Plan, plan and plan again
- Modular and reusable system
- Coding for synthesis
- Take care of your reset and clock schemes
- Clock Domain Crossing is tricky
- You must properly constrain your design
- Optimize in your code but also with constraints and FPGA design tool options
- Read the reports (Synthesis, Implementation & Static Timing Analysis)
- Try to be methodical when debugging & use all tools available
- A running system is not the end of the road... (Documentation, Maintenance, User Support)

But it works :)



Where do I find more info about this?

There are nice papers & books but... FPGA vendors provide very good documentation about all topics mentioned in this lecture.

## Acknowledgements

- Markus Joos (CERN) & other organisers of ISOTDAQ-17
- Andrea Borga (NikheF), Torsten Alt (FIAS) for their contribution to this lecture
- Rhodri Jones, Thibaut Lefevre, Andrea Boccardi & other colleagues from CERN BE-BI-QP

# Any Questions?

