# WP6, Investigation of Proton-Induced Single Event Effects on the Zynq-7000 System on Chip for On-Board Computing Applications in Space Missions

Jan Budroweit (DLR)
RADNEXT 2<sup>nd</sup> Annual Meeting – 9-10 May 2023
<a href="https://indico.cern.ch/e/radnext-2023">https://indico.cern.ch/e/radnext-2023</a>



# **Outline**

- Introduction
- SEE Test Results
  - ARM Caches
  - XADC
  - SEM IP-Core
- Conclusion



# Introduction

- DUT: Zynq-7000 (ZC7020)
- Platform: Customized Software-Defined Radio (SDR)
- Particles: Protons
- Facilities: PSI, PARTREC
- Energies: 230 MeV (PSI) and 184 MeV (PARTREC)
- Flux: 0.5 to 5 E+7 #/cm²/s (depending on the experiment)



### Objectives:

- Influence of the ARM cache configuration while running an operating system
- Investigation of the integrated Xilinx XADC for system-integrated latch-up detection
- Investigation of the SEM IP-Core as a mitigation mechanism for SEUs in the FPGA bitstream



### **ARM Caches**

- Methodology:
  - Impact of SEEs while using and running an embedded Linux operating system
  - Use of different cache configurations of the dual ARM core (L1i, L1d and L2)
  - Tested while Linux is running and executing a dedicated program (matrix multiplication)
  - Linux kernel GNU debugger (KGDB) used to trace failure after system crash
  - Proton beam at PSI used at a primary energy of 230 MeV with tailored flux (depending on the cache configuration)
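
The test-program idea above (repeat a matrix multiplication and compare it against a known-good result) can be sketched as follows. This is a minimal illustration, not the program used in the experiment; the matrix size, the operand values and the function names are assumptions.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def count_mismatches(result, golden):
    """Count elements that differ from the golden (pre-computed) result."""
    return sum(1 for row_r, row_g in zip(result, golden)
               for r, g in zip(row_r, row_g) if r != g)


# Hypothetical 3x3 integer operands; a real test would use larger matrices.
A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
B = [[9, 8, 7], [6, 5, 4], [3, 2, 1]]
GOLDEN = matmul(A, B)  # reference computed once, before irradiation

# One iteration of the test loop: recompute and compare.
errors = count_mismatches(matmul(A, B), GOLDEN)
print("mismatching elements:", errors)
```

A non-zero mismatch count would point to corrupted data, while a crash before the comparison completes would be counted as a functional interrupt.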



### **ARM Caches**

- Results:
  - SEFI results:
    - L1, L2 on: 1.92 x 10E-8 cm²/device
    - L1 off, L2 on: 1.08 x 10E-9 cm²/device
    - L1, L2 off: 1.52 x 10E-10 cm²/device
  - The origin of the SEFIs differs:
    - Mostly caused by prefetch aborts (counter errors) coming from L1
    - When L1 is disabled, errors are more diverse: unhandled paging requests, null pointer references
    - With all caches disabled, errors were mainly caused by wrong addresses being accessed by the kernel
  - No errors observed in the data caches themselves (checked via the matrix multiplication result)
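
For orientation, the relative effect of the cache configuration follows from the ratios of the SEFI cross-sections listed above. The sketch below reads the slide notation "x 10E-8" as a factor of 10^-8, which is an assumption; the ratios do not depend on that reading as long as it is applied consistently to all three values.

```python
# SEFI cross-sections from the results above, in cm^2/device.
# Assumption: "x 10E-8" on the slide is read as x 10^-8.
sefi_xs = {
    "L1 on, L2 on": 1.92e-8,
    "L1 off, L2 on": 1.08e-9,
    "L1 off, L2 off": 1.52e-10,
}

reference = sefi_xs["L1 on, L2 on"]
for config, xs in sefi_xs.items():
    # A factor above 1 means fewer SEFIs per unit fluence than with all caches on.
    print(f"{config}: {xs:.2e} cm^2/device, factor {reference / xs:.1f} vs. all caches on")
```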

### Kernel - Unhandled Faults





### **XADC**

- Methodology:
  - XADC inputs connected to shunt resistors and current-sense amplifiers in the SDR power domains
  - Data transmitted to the host PC via the RS422 interface
  - Raw 12-bit ADC data used (no compression or additional data processing)
  - Bare-metal code used (all caches disabled)
  - Voltages/currents monitored in parallel with a DAQ system
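
To illustrate how a raw 12-bit code maps to a supply current in a shunt/amplifier chain like the one described above, here is a minimal sketch. It assumes the XADC's unipolar input mode with a 1 V full-scale range; the shunt resistance and amplifier gain are hypothetical placeholders, not the values used on the SDR.

```python
XADC_FULL_SCALE_V = 1.0   # unipolar input range of the XADC (0 V to 1 V)
XADC_CODES = 4096         # 12-bit converter: 2**12 codes

# Hypothetical front-end values, for illustration only.
SHUNT_OHMS = 0.010        # assumed 10 mOhm shunt resistor
AMP_GAIN = 50.0           # assumed current-sense amplifier gain (V/V)


def code_to_volts(code):
    """Convert a 12-bit XADC code to the voltage at the XADC input."""
    if not 0 <= code < XADC_CODES:
        raise ValueError("code must be a 12-bit value")
    return code * XADC_FULL_SCALE_V / XADC_CODES


def code_to_amps(code):
    """Convert a 12-bit XADC code to the current through the shunt."""
    return code_to_volts(code) / (AMP_GAIN * SHUNT_OHMS)


print(code_to_volts(2048))  # half-scale code
print(code_to_amps(2048))
```

With these assumed values, one code step corresponds to roughly 0.5 mA, which gives a feel for how small a current change would be visible in the raw data.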







- Results:
  - No indication of SEUs in the ADC data
  - The 12-bit values are MSB-aligned in 16-bit samples; the remaining LSBs of the samples were not considered
  - Calibration of the XADC enabled (smoothing of data)
  - Deviation of the data due to dynamic load of the system ("noise")
  - Pulling the XADC inputs to ground would give better results
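
The MSB alignment mentioned above means that the 12-bit conversion result sits in the upper bits of each 16-bit sample. A minimal sketch of separating the two parts is shown below; the function name and the example value are illustrative assumptions.

```python
def split_sample(sample16):
    """Split a 16-bit MSB-aligned sample into its 12-bit code and low nibble."""
    if not 0 <= sample16 <= 0xFFFF:
        raise ValueError("sample must fit in 16 bits")
    code12 = sample16 >> 4       # upper 12 bits: the conversion result
    low_bits = sample16 & 0xF    # lower 4 bits: not part of the 12-bit result
    return code12, low_bits


code, low = split_sample(0x8005)
print(hex(code), low)  # prints "0x800 5"
```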



### **SEM IP-Core**

- Methodology:
  - Testing the SEM core at different flux levels
  - Data from the IP core transmitted to the host PC via the RS422 interface (115200 baud)
  - FPGA configuration read out via JTAG
  - Further irradiation once the SEM core crashes, to evaluate the SEUs in the CRAM (target fluence)
  - Two bitstreams/configurations tested (max. and min.)

| Flux [#/cm²/s] | 2nd run duration [s] | 2nd run Fluence [#/cm²] |
|----------------|----------------------|-------------------------|
| 0.5 E+7        | 60                   | 30 E+7                  |
| 1.0 E+7        | 30                   | 30 E+7                  |
| 5.0 E+7        | 15                   | 75 E+7                  |
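
The fluence column in the table above is the product of flux and run duration. The short sketch below recomputes it from the other two columns; only the variable names are assumptions.

```python
# (flux [#/cm^2/s], second-run duration [s]) pairs from the table above.
runs = [(0.5e7, 60), (1.0e7, 30), (5.0e7, 15)]

for flux, duration_s in runs:
    fluence = flux * duration_s  # fluence [#/cm^2] = flux * exposure time
    print(f"flux {flux:.1e} #/cm^2/s for {duration_s} s -> fluence {fluence:.1e} #/cm^2")
```

The results (3.0e+08, 3.0e+08 and 7.5e+08 #/cm²) match the 30 E+7, 30 E+7 and 75 E+7 entries in the table.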





- Results:
  - Flux dependency of SEM IP-Core
  - SEM vs. JTAG-Readback
  - Error classification
    - "SED OK" 1-bit ECC errors
    - "DED" 2-bit ECC errors
    - "CRC" non-correctable errors
  - The duration of the first run shows a very high variance and decreases with increasing flux
  - No significant difference in SEE cross-section has been observed
  - The CRAM is checked completely, independent of the PL allocation
  - Essential-bit mask could be used to improve performance

| Flux [#/cm²/s]              | SED OK  | DED    | CRC    |
|-----------------------------|---------|--------|--------|
| 0.5e+7                      | 9.9e-8  | 7.0e-9 | 1.4e-9 |
| 1.0e+7                      | 10.0e-8 | 7.0e-9 | 1.6e-9 |
| 5.0e+7                      | 6.8e-8  | 4.5e-9 | 0.8e-9 |

| Type           | Duration [ms] |
|----------------|---------------|
| SED OK (1-bit) | 0.009±0.008   |
| DED (2-bit)    | 0.014±0.008   |

| Flux [#/cm²/s] | Mean duration [s] |
|----------------|-------------------|
| 0.5e+7         | 77.7±65.0         |
| 1.0e+7         | 50.5±55.6         |
| 5.0e+7         | 16.9±34.3         |
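
The spread in the run duration can be put in relative terms, and the mean duration can be converted to a mean fluence. The sketch below uses only the values from the table above. It assumes that the ± values are standard deviations and that the mean durations refer to the first run, i.e. the time until the SEM core stopped; the slides state neither explicitly.

```python
# (flux [#/cm^2/s], mean duration [s], spread [s]) from the table above.
# Assumption: the "+/-" values are standard deviations.
first_runs = [(0.5e7, 77.7, 65.0), (1.0e7, 50.5, 55.6), (5.0e7, 16.9, 34.3)]

for flux, mean_s, spread_s in first_runs:
    rel_spread = spread_s / mean_s   # coefficient of variation
    mean_fluence = flux * mean_s     # mean fluence accumulated during the run
    print(f"flux {flux:.1e} #/cm^2/s: relative spread {rel_spread:.2f}, "
          f"mean fluence {mean_fluence:.3e} #/cm^2")
```

A relative spread near or above 1 means the spread is as large as the mean itself, which is one way to quantify the "very high variance" noted in the results.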



# Conclusion

- The right ARM cache configuration can improve robustness against system crashes by up to 800%
- System crashes were always caused by the kernel, not by the executed test program
  - Wrong addressing, wrong register values, prefetch aborts (L1i), unhandled paging requests, null pointer references and processor crashes (all caches disabled)
- The XADC is a good solution for system-internal voltage/current monitoring instead of using additional ADCs
- No SEE indication was found in the XADC data, but uncertainties need to be considered (calibration enabled, dynamic input to the XADC)
- The SEM core is a useful tool to detect and correct SEUs in the CRAM
- Once the SEM core cannot correct an error, it stops operating and reprogramming is required
- An essential-bit mask could be helpful but requires additional, dedicated memory
- Further investigation is needed on mapping the error reports to CRAM locations (decoding of addresses and logic)



# **Publications**

- M. Jaksch, J. Budroweit and F. Stehle, "Debugging Xilinx Zynq-7000 SoC Processor Caches during Linux System Execution under Proton Irradiation," 2022 IEEE Radiation Effects Data Workshop (REDW) (in conjunction with 2022 NSREC), Provo, UT, USA, 2022, pp. 1-4, doi: 10.1109/REDW56037.2022.9921631.
- F. Stehle, J. Budroweit and F. Eichstaedt, "Investigation of the Xilinx SEM Core on a Zynq-based Software-Defined Radio under Proton Irradiation," 2023 IEEE Radiation Effects Data Workshop (REDW). Accepted Paper
- F. Eichstaedt, J. Budroweit and F. Stehle, "Investigation of the Zynq-7000 Integrated XADC under Proton Irradiation," 2023 IEEE Radiation Effects Data Workshop (REDW). Accepted Paper



# Thanks for your attention!



Image Source: DLR

