Summary
Single Event Upsets (SEUs) are a major concern for electronic systems operating in radiation environments. One such system is the TPC Readout Control Unit (RCU) of the ALICE experiment. The current version of the RCU is built around an SRAM-based Field Programmable Gate Array (FPGA) that collects data from up to 3200 separate readout channels upon reception of a trigger. The RCU formats these data and ships them to the Data Acquisition (DAQ) system for storage and analysis. An SEU, defined as a radiation-related bit flip in a memory cell, may lead to corrupted data or, even worse, a system malfunction in the RCU. The latter affects the operation of the ALICE detector because it causes a premature end of run. A dedicated reconfiguration solution has therefore previously been implemented to continuously detect and correct SEUs in the configuration memory of the RCU FPGA. The solution is based on partial reconfiguration, an option offered by Xilinx for a number of FPGAs that allows the configuration memory to be rewritten without interrupting the operation of the FPGA.
Because of the effect SEUs may have on the operation of the TPC detector, it is vital to estimate how often functional failures will occur at different luminosities. Building on the dedicated reconfiguration solution, this paper presents a way to do this by means of fault injection, which essentially means inserting bit flips into the configuration memory of the FPGA in a laboratory environment. Fault injection also provides an opportunity to investigate the cause of the errors, which will help in improving the design to make it less susceptible to SEUs. The paper will present the results of the fault injection tests and show how these results can be combined with SEU measurement results to estimate functional failure rates.
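
As a rough illustration of how the two kinds of results might be combined, the sketch below multiplies a measured SEU rate by the fraction of injected bit flips that led to a functional failure. This is only one plausible reading of the approach, not necessarily the method used in the paper. The function names, the assumption that injected faults sample the configuration memory in the same way as radiation does, the assumption that the SEU rate scales linearly with luminosity, and all numbers are hypothetical placeholders, not results.

```python
def functional_failure_rate(seu_rate_per_s, n_injected, n_failures):
    """Estimate functional failures per second.

    seu_rate_per_s: SEU rate in the configuration memory (from SEU measurements).
    n_injected:     number of bit flips injected in the laboratory.
    n_failures:     number of those injections that caused a functional failure.

    Assumption: injected faults sample the configuration bits in the same
    way as radiation-induced upsets do.
    """
    if n_injected <= 0:
        raise ValueError("n_injected must be positive")
    if not 0 <= n_failures <= n_injected:
        raise ValueError("n_failures must lie between 0 and n_injected")
    failure_fraction = n_failures / n_injected
    return seu_rate_per_s * failure_fraction


def scale_with_luminosity(rate_ref, luminosity_ref, luminosity):
    """Scale a rate to another luminosity.

    Assumption: the SEU rate is directly proportional to luminosity.
    """
    if luminosity_ref <= 0:
        raise ValueError("luminosity_ref must be positive")
    return rate_ref * luminosity / luminosity_ref


if __name__ == "__main__":
    # Placeholder inputs for illustration only; these are not measured values.
    rate = functional_failure_rate(seu_rate_per_s=1e-3,
                                   n_injected=10_000,
                                   n_failures=250)
    print(f"failures per second: {rate:.3g}")
    print(f"mean time between failures [s]: {1 / rate:.3g}")
    print(f"rate at doubled luminosity: {scale_with_luminosity(rate, 1.0, 2.0):.3g}")
```

Keeping the failure fraction separate from the SEU rate means the laboratory campaign only has to be run once per design revision, while the rate can be re-scaled for each running condition.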