## WP6, Results and Outcomes from Latest WP6 Test Campaigns in Radiation Facilities

André M. P. Mattos (IES/University of Montpellier)

Almudena Lindoso (UC3M)

RADNEXT 2<sup>nd</sup> Annual Meeting – 9-10 May 2023

https://indico.cern.ch/e/radnext-2023





This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No **101008126** 


### Outline

- IES/UM: Test Campaigns
  - CHARM: RISC-V/SoC
    - Power and logging challenges
  - ChipIr: RISC-V/SoC/FPGA
  - PSI: SRAM
    - Setup, capabilities, and results
  - Conclusions
- UC3M: Test Campaigns
  - CHARM: SoC
  - ChipIr:
    - Microprocessors/SoC
    - FPGAs
    - GPUs
    - UltraScale+
  - Publications
  - Conclusions





### **IES/UM: Test Campaigns**






| Year                 | Source      | Facility | Date   | Experiment     |
|----------------------|-------------|----------|--------|----------------|
| 1<sup>st</sup> year  | Protons     | PSI      | 12/21* | SDRAM/HyperRAM |
|                      | Heavy Ions  | RADEF    | 02/22  | SRAM           |
|                      | Neutrons    | ChipIr   | 05/22* | RISC-V/SoC/NoC |
|                      | Protons     | PARTREC  | 06/22* | RISC-V/SoC     |
| 2<sup>nd</sup> year  | Mixed-Field | CHARM    | 10/22  | RISC-V/SoC     |
|                      | Neutrons    | ChipIr   | 11/22* | RISC-V/SoC     |
|                      | Protons     | PSI      | 12/22  | SRAM           |
|                      | Laser       | ESTEC    | 05/23  | SRAM           |

\*Test campaigns through TA calls



CHARM Irradiation room



CHARM Control room



### > CHARM: RISC-V/SoC

- Characterization of a fault-tolerant RISC-V system-on-chip in flash-based FPGAs
  - Two different board designs with same FPGA family, but different power regulators
  - Both boards with external SEL protection and current monitoring
  - External robust communication fixtures were used to extend the logging interfaces
  - Similar setups and conditions, but two very distinct experimental characteristics
  - Despite that, there is good correlation between results considering the useful operation time



[1] [Under review] Douglas A. Santos, André M. P. Mattos, et al., "Enhancing Fault Awareness and Reliability of a Fault-Tolerant RISC-V System-on-Chip", Electronics, 2023.

[2] [Accepted] André M. P. Mattos, Douglas A. Santos, et al., "Using HARV-SoC for Reliable Sensing Applications in Radiation Harsh Environments", IWASI, 2023.



### > CHARM: RISC-V/SoC >> Power regulators





|                    | TLV62565 | EP53A7HQI | LM317 |
|--------------------|----------|-----------|-------|
| Device type        | Switching regulator | Switching regulator | Linear regulator |
| Board critical TID | ~ 19 krad | > 36 krad | > 36 krad |
| # SELs             | 1 | 1 | 1 |
| # Transients       | 1 | 7 | 6 |
| Observations       | After the critical TID, it was not operational, but current limited. After 29 krad, it had unlimited consumption | Steady current increase (voltage decrease) ~ 6.5% | Steady current decrease (voltage increase) ~ 8.5% |



### > CHARM: RISC-V/SoC >> Logging interface

- Robust interface: UART ↔ RS-485
  - UART with flow control
  - RS-485 transceiver
  - IO buffer
- Flexible and easy to use (transparent)
- RJ45 connector/cable for physical connection
  - Robust and often available in patch panels

Table of high-level failures and observations per facility (partially recovered; facilities listed: PARTREC, ChipIr [Neutrons]):

- High-level failures: None<sup>1</sup>
- Cable: 125 m (with patch panel connections)
- Baud rate: 3 Mbaud



<sup>1</sup>Functional testing only (loopback) with direct irradiation, <sup>2</sup>Only usage (no specific testing)



### > ChipIr: RISC-V/SoC

- Characterization of a fault-tolerant RISC-V system-on-chip in an SRAM-based FPGA
  - Adaptation from the flash-based port
  - External bitstream memory required
  - Sensitive configuration memory
    - Custom scrubbing and correction (alternative to the SEM IP) with integration to the SoC



|                                             | SoC in Flash-based FPGA [Microchip] [M2S010] | SoC in SRAM-based FPGA [Xilinx] [Zynq-7010] |
|---------------------------------------------|----------------------------------------------|---------------------------------------------|
| Total Fluence [n/cm<sup>2</sup>]            | 1.80x10<sup>12</sup>                         | 5.83x10<sup>10</sup>                        |
| Mean Fluence to Failure [n/cm<sup>2</sup>]  | 2.06x10<sup>10</sup>                         | 3.60x10<sup>8</sup>                         |
| Cross Section [cm<sup>2</sup>]              | 4.98x10<sup>-11</sup>                        | 2.78x10<sup>-9</sup>                        |

~ 200x difference in robustness
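As a rough cross-check of the table above, the sketch below estimates the failure cross-section from the total fluence and the mean fluence to failure. It assumes the usual relation σ = N / Φ and approximates the number of failures as N ≈ Φ / MFTF; both the function names and this approximation are illustrative assumptions, not the authors' analysis code.

```python
# Values copied from the comparison table; the formulas are an assumed reading.
boards = {
    "Flash-based (M2S010)": {"total_fluence": 1.80e12, "mftf": 2.06e10},
    "SRAM-based (Zynq-7010)": {"total_fluence": 5.83e10, "mftf": 3.60e8},
}


def estimated_failures(total_fluence: float, mftf: float) -> float:
    """Approximate failure count as total fluence / mean fluence to failure."""
    return total_fluence / mftf


def cross_section(n_failures: float, total_fluence: float) -> float:
    """Cross-section in cm^2: number of failures per unit fluence (n/cm^2)."""
    return n_failures / total_fluence


estimates = {}
for name, b in boards.items():
    n = estimated_failures(b["total_fluence"], b["mftf"])
    estimates[name] = cross_section(n, b["total_fluence"])
    print(f"{name}: ~{n:.0f} failures, sigma ~ {estimates[name]:.2e} cm^2")
```

Under this approximation the SRAM-based estimate reproduces the 2.78 mantissa of the tabulated cross-section, while the flash-based estimate lands close to, but not exactly on, the tabulated 4.98x10<sup>-11</sup> cm<sup>2</sup>, presumably because the table uses the actual event count.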

#### Zybo:

- Zynq-7000 family
- Flash
- SDRAM

[3] [Under review] Douglas A. Santos, Pablo M. Aviles, André M. P. Mattos, et al., "Hybrid Hardening Approach for a Fault-Tolerant RISC-V System-on-Chip", RADECS, 2023.



### > PSI: SRAM

- Investigation of radiation-induced single-event latch-ups in SRAM memories on board the PROBA-V mission
  - Understanding the flight behavior and error rates
- Use of an experimental approach that reflects the target environment
  - RADEF: Heavy Ions
  - PSI: Protons
  - ESTEC: Laser (further investigation of the observed phenomena)
- Development of an experimental setup for enhanced observability





[4] [Under review] André M. P. Mattos, Douglas A. Santos, et al., "Investigation on Radiation-Induced Single-Event Latch-up in SRAM Memories on-Board PROBA-V", RADECS, 2023.

[5] [Under review] André M. P. Mattos, Douglas A. Santos, et al., "Instrumentation and Methodology for In-Depth Analysis of Single-Event Latch-up on Memory Devices", Journal of Instrumentation, 2023.



### > PSI: SRAM >> Test Setup

- Enhanced experimental setup
  - Precise **timing** and current measurements
  - Coherent monitoring between memory errors and current measurements
  - Many test modes with realistic stimuli
  - Robust and flexible test setup





### > PSI: SRAM >> Results



- Error accumulation during SEL events
  - Example within a dynamic test
  - 50 ms "hold time" and 200 ms "cut time"

- Event cross sections
  - Good correlation between lots
  - Weibull fitting
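For readers unfamiliar with the Weibull fitting mentioned above, the following is a minimal sketch of the four-parameter Weibull function commonly used to describe event cross-sections versus LET or energy. The parameter values in the example call are hypothetical placeholders, not fitted results from these campaigns.

```python
import math


def weibull_cross_section(x: float, sigma_sat: float, x0: float,
                          w: float, s: float) -> float:
    """Four-parameter Weibull cross-section curve.

    x         : LET or particle energy (same units as x0 and w)
    sigma_sat : saturation cross-section [cm^2]
    x0        : onset threshold (no events at or below it)
    w         : width parameter
    s         : shape parameter
    """
    if x <= x0:
        return 0.0
    return sigma_sat * (1.0 - math.exp(-(((x - x0) / w) ** s)))


# Hypothetical parameters, chosen only to exercise the function.
params = dict(sigma_sat=1.0e-3, x0=2.0, w=20.0, s=1.5)
for x in (1.0, 5.0, 22.0, 60.0):
    print(f"x = {x:5.1f} -> sigma = {weibull_cross_section(x, **params):.3e} cm^2")
```

In practice the four parameters would be obtained by a least-squares fit to the measured cross-section points for each lot; the fitting procedure itself is not shown here.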





### Conclusions IES/UM

- Tests at CHARM presented many setup challenges that had to be addressed before the campaign
- Tests at ChipIr provided more insights on the capabilities of Flash- and SRAM-based FPGAs for radiation testing
- Tests at PSI allowed the elaboration of an enhanced setup and provided practical insights
- We are currently working on the **first WP6 milestone**, which will summarize the setup preparation experience obtained in these **first 2 years of the project**
- We intend to prepare **more publications** using the acquired data and experience



### **UC3M: Test Campaigns**





### **UC3M: Test Campaigns TA RADNEXT**

| Year                 | Source            | Facility     | Date         | Experiment           | Status      |
|----------------------|-------------------|--------------|--------------|----------------------|-------------|
| 1<sup>st</sup> year  | Protons           | PSI          | 12/21        | µp/FPGA/SoC          |             |
|                      | Neutrons          | ChipIr (1/3) | 05/22        | µp/FPGA/SoC/GPUs     |             |
|                      | Neutrons          | ChipIr (2/3) | 09/22        | µp/FPGA/SoC/GPUs     |             |
| 2<sup>nd</sup> year  | Mixed field       | CHARM        | 10/22        | µp/SoC               |             |
|                      | Neutrons          | ChipIr (3/3) | 11/22        | µp/FPGA/SoC/GPUs     | ✓ Completed |
| 3<sup>rd</sup> year  | Protons           | PARTREC      | 09/23        | µp/FPGA/SoC          |             |
|                      | X-ray (microbeam) | ESRF         | Under review | µp/FPGA/SoC/Memories |             |




### CHARM

- 2 commercial boards (Zybo)
- Microprocessor hardening techniques
- Complex setup
- After 6.31 krad (~20 hours), board non-responsive

### > ChipIr: Experimental setup

- UC3M experiments on complex COTS systems:
  - 8x SoC Xilinx Zynq-7000 (28 nm): Microprocessors, FPGA, SoC
  - 6x Jetson Nano (20 nm): Quad-core A57 & NVIDIA Maxwell GPU
  - 1x SoC Xilinx UltraScale+ (16 nm FinFET)







✓ Several experiments per board type


Experimental Setup (November)








### > ChipIr: Experimental setup





- Zynq & UltraScale+: serial communication
- GPU: Ethernet connection
- External host (Raspberry Pi) to control the experiments and power cycles
- UltraScale+: SEL detector
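The host-side supervision listed above can be pictured with a small heartbeat watchdog. This is a sketch under stated assumptions: the class name, the timeout value, and the idea that the host decides on power cycles from missing board output are all hypothetical, and the actual relay or serial/Ethernet handling on the Raspberry Pi is left out.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatWatchdog:
    """Tracks the last time a board produced output and flags power cycles."""
    timeout_s: float
    last_seen_s: float = 0.0
    power_cycles: int = 0

    def heartbeat(self, now_s: float) -> None:
        """Record that the board sent data (serial or Ethernet) at time now_s."""
        self.last_seen_s = now_s

    def poll(self, now_s: float) -> bool:
        """Return True, and count a power cycle, if the board has been silent too long."""
        if now_s - self.last_seen_s > self.timeout_s:
            self.power_cycles += 1
            self.last_seen_s = now_s  # restart the timeout after the power cycle
            return True
        return False


# Example with explicit timestamps instead of a real clock.
wd = HeartbeatWatchdog(timeout_s=10.0)
wd.heartbeat(0.0)
print(wd.poll(5.0))   # board still within the timeout
print(wd.poll(11.0))  # silent for more than 10 s, so a power cycle is requested
```

Passing timestamps explicitly keeps the logic testable offline; on a real host the caller would supply `time.monotonic()` and toggle the board supply whenever `poll` returns `True`.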





### > ChipIr: Evaluation of events vs board location

- Zynq 7000
- 3 different hardening techniques
  - PL
  - SoC (PL+PS)

| Experiment          | ∆ Errors (%)           |
|---------------------|------------------------|
| dCR_RRR (PL)        | - 13.37 (+3 positions) |
| NIR_RRR_noSEM (SoC) | - 47.50 (+5 positions) |
| NIR_RRR_SEM (SoC)   | +4.21                  |



Position in the stack

- NIR\_RRR\_SEM behaves differently
  - Includes Xilinx SEM IP: Detects & corrects PL errors

| Board position | SEM corrections | PL+PS events |
|----------------|-----------------|--------------|
| 5              | 978             | 205          |
| 6              | 1119            | 214          |
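The +4.21% entry reported for NIR_RRR_SEM appears to be reproducible from the PL+PS event counts at positions 5 and 6. The sketch below assumes the difference is normalized by the position-6 count; that normalization is an inference from the numbers, not something stated on the slide.

```python
def relative_difference_percent(reference: float, other: float) -> float:
    """Percentage difference of `other` with respect to `reference`."""
    return (reference - other) / reference * 100.0


# Counts copied from the board-position table.
pl_ps_events = {5: 205, 6: 214}
sem_corrections = {5: 978, 6: 1119}

delta_events = relative_difference_percent(pl_ps_events[6], pl_ps_events[5])
delta_sem = relative_difference_percent(sem_corrections[6], sem_corrections[5])

print(f"PL+PS events:    {delta_events:.2f} %")
print(f"SEM corrections: {delta_sem:.2f} %")
```

The same helper applied to the SEM correction counts gives a noticeably larger spread between the two positions than the PL+PS events do.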



### > ChipIr: Cross-section

### **Comparison of different particles / different complex systems**





### > ChipIr: Cross-section

#### Comparison of CHARM and ChipIr results for a complex system with a microprocessor-based hardening technique

- Based on: P. M. Aviles, A. Lindoso, J. A. Belloch, M. Garcia-Valderas, Y. Morilla, and L. Entrena, "Radiation Testing of a Multiprocessor Macrosynchronized Lockstep Architecture With FreeRTOS," IEEE Transactions on Nuclear Science, vol. 69, no. 3, pp. 462-469, March 2022, doi: 10.1109/TNS.2021.3129164.

| Microprocessor technique                               | CHARM                  | ChipIr                 |
|--------------------------------------------------------|------------------------|------------------------|
| Fluence [particles/cm<sup>2</sup>]                     | 5.67x10<sup>11</sup>   | 2.57x10<sup>11</sup>   |
| Cross-section [cm<sup>2</sup>] (Total events)          | 2.35x10<sup>-9</sup>   | 9.64x10<sup>-9</sup>   |
| Cross-section [cm<sup>2</sup>] (Undetected errors)     | 1.23x10<sup>-11</sup>  | 5.84x10<sup>-11</sup>  |
| Error rate [errors/hour]                               | 114.12                 | 105.13                 |
| Time [hours]                                           | 11.66                  | 23.56                  |

✓ Up to 2 orders of magnitude improvement (undetected-error cross-section vs. total-event cross-section)
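The figures in this comparison can be cross-checked against each other. The sketch below assumes the total number of events is the error rate multiplied by the test time, and that the cross-section is events divided by fluence; the tabulated values turn out to be mutually consistent when the fluence is read in particles/cm<sup>2</sup>. The dictionary layout and function name are illustrative only.

```python
# Values copied from the CHARM/ChipIr comparison table.
campaigns = {
    "CHARM": {"fluence": 5.67e11, "rate_per_h": 114.12, "hours": 11.66,
              "sigma_undetected": 1.23e-11},
    "ChipIr": {"fluence": 2.57e11, "rate_per_h": 105.13, "hours": 23.56,
               "sigma_undetected": 5.84e-11},
}


def total_event_cross_section(c: dict) -> float:
    """Events (rate x time) divided by fluence, assuming fluence per cm^2."""
    events = c["rate_per_h"] * c["hours"]
    return events / c["fluence"]


results = {}
for name, c in campaigns.items():
    sigma_total = total_event_cross_section(c)
    improvement = sigma_total / c["sigma_undetected"]
    results[name] = (sigma_total, improvement)
    print(f"{name}: sigma_total = {sigma_total:.2e} cm^2, "
          f"total/undetected = {improvement:.0f}x")
```

Both recomputed total-event cross-sections agree with the table to three significant figures, and the total-to-undetected ratios fall between 100x and 200x, in line with the "up to 2 orders" annotation.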



### > ChipIr: SoC UltraScale+

- Small number of events
- Micro-latchup (below the established SEL limit of 500 mA)
- The same test was performed at CNA (15 MeV protons)
  - Small number of events
    - Flux was increased: TID effects observed (persistent current degradation)
  - No SEL observed
- Additional tests will be performed in the next TA campaigns (PARTREC)
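To illustrate the distinction between an SEL and a micro-latchup that stays under the detector limit, here is a minimal classification sketch. Only the 500 mA limit comes from the slide; the baseline current, the margin, and the labels are hypothetical values chosen for illustration.

```python
def classify_current(current_ma: float, baseline_ma: float,
                     sel_limit_ma: float = 500.0,
                     margin_ma: float = 20.0) -> str:
    """Classify one supply-current sample.

    sel_limit_ma : SEL detector limit (500 mA, from the slide)
    baseline_ma  : nominal board current (hypothetical)
    margin_ma    : minimum rise over baseline treated as anomalous (hypothetical)
    """
    if current_ma >= sel_limit_ma:
        return "SEL"
    if current_ma - baseline_ma >= margin_ma:
        return "micro-latchup candidate"
    return "nominal"


# Example samples against a hypothetical 300 mA baseline.
for sample in (305.0, 380.0, 520.0):
    print(sample, "->", classify_current(sample, baseline_ma=300.0))
```

A real detector would also need persistence checks over several samples to separate latch-up steps from short transients; that filtering is omitted here.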







CNA





### Publications UC3M

- **RADECS 2022 Data Workshop:** P. M. Aviles, L. A. García-Astudillo, J. A. Belloch, L. Entrena, A. Lindoso, "Comparative of proton radiation data for 28 nm Zynq-7000 SoC"
- **IEEE TNS (Special issue RADECS 2022):** L. A. García-Astudillo, A. Lindoso, et al., "Evaluating Reduced Resolution Redundancy for Radiation Hardening in FPGA designs", doi: 10.1109/TNS.2023.3268825

#### **Under Review:**

- RADECS 2023
  1. D. A. Santos, P. M. Aviles, A. M. P. Mattos, M. Garcia-Valderas, L. Entrena, A. Lindoso, L. DiLillo, "Hybrid Hardening Approach for a Fault-Tolerant RISC-V System-on-Chip"
  2. G. Leon, J. M. Badia, J. A. Belloch, M. Garcia-Valderas, A. Lindoso, L. Entrena, "Analysing the influence of memory and workload on the reliability of GPUs under radiation"

Journal publications & conference communications in progress



### Conclusions UC3M

- Successful testing of complex systems in different facilities
  - SoC, hard-core & soft-core microprocessors, FPGA and GPUs
- Combining different experiments, boards, and sensitivities in a single radiation campaign
- Challenges in testing complex devices:
  - SoC
  - GPUs
- Comparison of irradiation campaign results across different facilities/particles
- Developing guidelines for non-expert end users





Image Source: IES/UM

### **Thanks for your attention!**



André M. P. Mattos (IES/UM): andre.martins-pio-de-mattos@etu.umontpellier.fr



Almudena Lindoso (UC3M): alindoso@ing.uc3m.es



