### Common building blocks: FPGA testing

Antonio Scialdone, Rudy Ferraro, Salvatore Danzeca (BE/CEM-EPR)





### Impact of FPGA radiation testing



#### Guidelines

Definition of guidelines for testing future FPGAs so that standardized testing can be performed



#### **Future adoption**

Already qualified FPGAs can be later selected to be used for CERN applications



#### **User's burden reduction**

Reduce the testing burden on end users by providing a selection of already tested FPGAs




#### **Purchase**

Large quantities of FPGAs can be purchased at once if the test results are good



#### Cost and time reduction

Beam time is expensive; adopting a single test framework reduces the time needed to qualify a device, and therefore the final cost





02/03/2022

### **FPGA** internal structure

FPGAs are built from predefined resources, or Functional Elements (FEs), joined by programmable interconnections that allow reconfigurable digital circuits to be implemented.

Figure: Simplified top-level view of an FPGA (labels: CLBs, LUT, FF, DSPs, REG)

#### PLLs

Provide clock management and clock synthesis capabilities

#### Memory blocks

Data storage

#### IOs

Interface the FPGA with external components

#### Interconnects

Connect all the resources to implement the desired user design. They are configurable through the <u>Configuration memory</u>

### **Challenges of FPGA testing**


#### **Device complexity**

The device contains many different components, each requiring its own test to evaluate its radiation response individually.



#### **Failure rate estimation**

Estimate the failure rate of the device when running a custom application, avoid repeating the test for each application, and compare the results with other FPGAs and organizations.



#### **Test procedure**

Perform the test in realistic conditions, with high-frequency circuits and dynamic inputs, while keeping a good level of observability inside the design.

Controls, Electronics & Mechatronics





### Radiation effects on FPGAs - SEE

Single Event Upset (SEU)



Single Event Transient (SET)



Single Event Functional Interrupt (SEFI)








R2E




### Functional elements (FEs) test circuit

#### **FFs**



#### **DSPs**



#### Window Shift Register SET (WSR-SET)

#### **Memory blocks**

Write a specific pattern to the memory, read it back, and count the number of upsets
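The write/read-back check can be sketched in Python as a bitwise comparison. This is only an illustration: the function name, the checkerboard pattern and the injected upsets are assumptions, not the test firmware used in the campaign.

```python
def count_upsets(written: bytes, read_back: bytes) -> int:
    """Count the bits that differ between the written pattern and the readback."""
    if len(written) != len(read_back):
        raise ValueError("pattern and readback must have the same length")
    # XOR leaves a 1 in every flipped bit position; count those ones.
    return sum(bin(w ^ r).count("1") for w, r in zip(written, read_back))


# Hypothetical 8-byte checkerboard pattern (0xAA / 0x55).
pattern = bytes([0xAA, 0x55] * 4)

# Simulated readback with three injected bit flips.
readback = bytearray(pattern)
readback[1] ^= 0b0000_0100   # one upset
readback[6] ^= 0b1000_0001   # two upsets in the same byte

upsets = count_upsets(pattern, bytes(readback))
print(f"upsets found: {upsets}")
```

Counting flipped bits rather than corrupted words keeps multiple upsets within one word visible.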

#### PLLs



#### CRAM

Check for SEUs in the CRAM of the FPGA



Field Programmable Gate Array (FPGA) Single Event Effect (SEE) Radiation Testing, Melanie Berg, 2012







### But FE testing is not enough...

#### Limitations of FE test

- Not representative of user design
- Failure rate estimation is not possible

#### **Benchmark circuits: Advantages**

- Perform standardized testing
- Estimate the device failure rate independently from the application
- Easier to compare FPGAs with each other, even if they belong to different families and are based on different technologies


- Benchmark adopted: B13
  - Available [Here]
  - Based on FSM




### Radiation effects on FPGAs - TID



The maximum frequency of a digital circuit depends on the propagation delay $t_{pd}$

- TID effects can lead to $t_{pd}$ degradation
  - e.g. the ProASIC3 showed a $t_{pd}$ degradation of 70% at 700 Gy [Here]

To measure the propagation delay change, a ring oscillator can be used
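A minimal sketch of how a ring-oscillator frequency maps to propagation delay, assuming an ideal ring of `n_stages` identical inverting stages so that $f = 1 / (2 \, n \, t_{pd})$. The function names and the frequencies below are illustrative assumptions, not measured values.

```python
def stage_delay(frequency_hz: float, n_stages: int) -> float:
    """Per-stage propagation delay of an ideal ring oscillator: t_pd = 1 / (2 * n * f)."""
    return 1.0 / (2 * n_stages * frequency_hz)


def delay_change_percent(f_before_hz: float, f_after_hz: float) -> float:
    """Relative t_pd increase in percent; since t_pd ~ 1/f, it equals f_before/f_after - 1."""
    return (f_before_hz / f_after_hz - 1.0) * 100.0


# Hypothetical readings before and after irradiation.
f_before = 100.0e6   # Hz
f_after = 80.0e6     # Hz

print(f"t_pd before: {stage_delay(f_before, 5):.3e} s")
print(f"t_pd change: {delay_change_percent(f_before, f_after):.1f} %")
```

Because only the frequency ratio enters the relative change, the number of stages does not need to be known to track degradation over dose.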



### How we test an FPGA

- Remove additional logic from the DUT
- Use another FPGA as TESTER
- Processor for interfacing with the TESTER
- PYNQ framework for interfacing the FPGA using Python
- External laptop with a Jupyter notebook connected through Ethernet








### **Qualification timeline**



### **FPGA** under evaluation



### NG-Medium - NX1H35AS

- SRAM-based FPGA
- Radiation Hardened by Design (RHBD)
- Configuration Memory Integrity Check (CMIC) for the CRAM
  - CMIC corrects single errors, <u>but stops at a double error</u>




### PolarFire - MPF300TS

- Flash-based FPGA
- 28-nm SONOS technology
- Flash CRAM cells resilient to SEU








### **PolarFire FEs results**

- High-energy proton tests at PSI
- With and without mitigation technique (TMR)
- Thermal neutron tests at ILL
- All circuits in the same FPGA design
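TMR (triple modular redundancy) masks a single upset by voting between three copies of a signal. A minimal Python sketch of the bitwise majority vote, for illustration only: the slides do not describe how TMR was implemented in the tested designs, and the values below are hypothetical.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-out-of-3 majority of three redundant copies."""
    return (a & b) | (a & c) | (b & c)


golden = 0b1011_0010           # hypothetical register content
flipped = golden ^ 0b0000_1000  # same content with one bit upset

# A single corrupted copy is outvoted by the two good ones.
single_fault = majority_vote(golden, flipped, golden)

# The same bit corrupted in two copies defeats the voter.
double_fault = majority_vote(flipped, flipped, golden)

print(single_fault == golden, double_fault == golden)
```

The second case shows why TMR only helps while upsets stay confined to one copy at a time.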

### **Results**

1. Comparable cross-section (except for TMR)
2. TMR brings a larger improvement under thermal neutrons
3. SEFI is the dominant failure type







### Test results – Benchmark

### **Flash-based FPGAs**

- Comparable cross-section
- SmartFusion2 and ProASIC3 perform better with TMR
- SEFIs account for most of the failures for the PolarFire
- PolarFire is sensitive to thermal neutrons (ThN)
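For reference, a cross-section is the number of observed events divided by the particle fluence. The sketch below also gives the Poisson upper limit that applies when no event is observed. Function names and all numbers are illustrative assumptions, not results from these tests.

```python
import math


def cross_section(events: int, fluence_cm2: float) -> float:
    """Cross-section in cm^2: observed events divided by fluence (particles/cm^2)."""
    return events / fluence_cm2


def upper_limit_zero_events(fluence_cm2: float, confidence: float = 0.95) -> float:
    """Upper limit on the cross-section when zero events are observed.

    For a Poisson process, P(0 events) = exp(-mu), so the largest mean
    compatible with seeing nothing at the given confidence is -ln(1 - confidence).
    """
    return -math.log(1.0 - confidence) / fluence_cm2


# Hypothetical run: 50 events after 1e11 particles/cm^2.
sigma = cross_section(50, 1.0e11)
# Hypothetical run with no event after the same fluence.
sigma_max = upper_limit_zero_events(1.0e11)

print(f"sigma     = {sigma:.2e} cm^2")
print(f"sigma_max = {sigma_max:.2e} cm^2 (95% confidence, zero events)")
```

A run with no observed events therefore yields a bound that tightens with fluence, not a zero cross-section.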

### **NG-Medium**

- No SEU observed in the circuits
- CRAM corruption due to CMIC stop
- Poor TMR performance caused by a poor routing implementation strategy in an old version of the tool



### Propagation delay analysis

### **DUT circuit**

- 1900 ring oscillators
- Output MUX to select one ring
- TESTER FPGA used as a frequency counter to measure the ring frequencies

### **Test results**

- 5.5 kGy of cumulative dose
- Only 0.35% degradation





### Lifetime and programmability comparison


### Lifetime (Exponential performance degradation)

- ProASIC3 and SmartFusion2 showed an exponential increase in propagation delay
- No failure observed for the NG-Medium up to 3 kGy
- PolarFire: only 0.35% propagation delay change up to 5.5 kGy

### **Programmability**

- Very low threshold for SmartFusion2 and ProASIC3
- PolarFire and NG-Medium perform better: 5.5 kGy and 2.2 kGy, respectively






### Failure rate estimation in the HL-LHC


Average number of failures (for one device) in 12 years of operation

### Flash-based FPGAs

- Similar performance in areas with a low R factor
- PolarFire failures induced by ThN are not negligible
- → Not considering ThN would lead to an underestimation of the PolarFire failure rate
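A rough sketch of how the two contributions add up, assuming the R factor denotes the ratio of thermal-neutron fluence to HEH fluence and that the expected number of failures is cross-section times fluence for each particle type. The function name and every number below are hypothetical and do not come from the measurements.

```python
def expected_failures(sigma_heh_cm2: float,
                      sigma_thn_cm2: float,
                      heh_fluence_per_year_cm2: float,
                      r_factor: float,
                      years: float = 12.0) -> float:
    """Expected failures of one device: sigma_HEH * Phi_HEH + sigma_ThN * Phi_ThN.

    Assumption: Phi_ThN = R * Phi_HEH, with R the thermal-to-HEH fluence ratio.
    """
    phi_heh = heh_fluence_per_year_cm2 * years
    phi_thn = r_factor * phi_heh
    return sigma_heh_cm2 * phi_heh + sigma_thn_cm2 * phi_thn


# Hypothetical device and environment values.
SIGMA_HEH = 1.0e-10    # cm^2
SIGMA_THN = 5.0e-11    # cm^2
HEH_PER_YEAR = 1.0e9   # HEH/cm^2/year

for r in (1.0, 10.0):
    with_thn = expected_failures(SIGMA_HEH, SIGMA_THN, HEH_PER_YEAR, r)
    without_thn = expected_failures(SIGMA_HEH, 0.0, HEH_PER_YEAR, r)
    print(f"R = {r:4.1f}: {with_thn:.1f} failures with ThN, {without_thn:.1f} without")
```

With these made-up inputs the thermal-neutron term grows linearly with R, which illustrates how ignoring it underestimates the failure rate in areas with a high R factor.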

#### **NG-Medium**

- Slightly lower failure rate
- High failure rate because of configuration loss due to radiation-induced resets
- → Losing the configuration implies using an external flash memory, but no good candidate has been found




### **Conclusions and outlook**

A methodology for FPGA radiation testing


- Improved test setup with a TESTER FPGA
- An evaluation process that includes assessment of TID effects and HEH + ThN sensitivity using both functional elements and benchmark circuits
- Two FPGAs qualified for the LHC radiation environment
  - The final choice depends on programmability, lifetime, HEH and ThN sensitivity, and application criticality
- Results submitted to IEEE Transactions on Nuclear Science:
  - A. Scialdone et al., "FPGA Qualification and Failure rate estimation methodology for the LHC radiation environment using benchmark test circuits"
- Ongoing collaboration with Politecnico di Torino for other benchmark circuits
- Starting the evaluation of a new FPGA: GateMate from CologneChip

## Thanks for your attention!





