17–21 Sept 2012
Oxford University, UK
Europe/Zurich timezone

First Results of Fault Injection Tests done to Study the Radiation Tolerance of the Readout Control FPGA Design of the ALICE TPC Detector

21 Sept 2012, 10:10
25m
Martin Wood Lecture Theatre (Oxford University, UK)

Clarendon Laboratory, Parks Road, OX1 3PU, Oxford, United Kingdom
Oral Topical

Speaker

Johan Alme (Bergen University College (NO))

Description

The ALICE Time Projection Chamber (TPC) is the main tracking detector of ALICE. In the Readout Control Unit (RCU), an SRAM-based FPGA from Xilinx controls the readout of data from the detector. Single event upsets in the SRAM configuration memory of the FPGA can cause functional failures. This paper presents the results of fault injection tests that have been performed to systematically study how bit flips in the configuration memory affect the failure frequency.

Summary

Single Event Upsets (SEUs) are a major concern for electronic systems located in radiation environments. One such system is the TPC Readout Control Unit (RCU) of the ALICE experiment. The current version of the RCU is built around an SRAM-based Field Programmable Gate Array (FPGA) that collects data from up to 3200 separate readout channels upon reception of a trigger. The RCU formats and ships these data to the Data Acquisition (DAQ) system for storage and analysis. An SEU, defined as a radiation-related bit flip in a memory cell, may lead to corrupted data or, even worse, to a system malfunction in the RCU. The latter affects the operation of the ALICE detector, since it causes a premature end of run. A dedicated reconfiguration solution has therefore previously been implemented to continuously detect and correct any SEUs in the configuration memory of the RCU FPGA. The solution is based on partial reconfiguration, an option offered by Xilinx for a number of FPGAs that allows the configuration memory to be rewritten without interrupting the operation of the FPGA.
Because of the effect SEUs may have on the operation of the TPC detector, it is vital to estimate how often functional failures will occur at varying luminosity. Building on the dedicated reconfiguration solution, this paper presents a way to do this by means of fault injection, which essentially means inserting bit flips into the configuration memory of the FPGA in a laboratory environment. Fault injection also provides an opportunity to investigate the cause of the errors, which helps in improving the design to make it less susceptible to SEUs. The paper will present the results of the fault injection tests and show how these results can be combined with SEU measurement results to estimate functional failure rates.
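The idea of combining fault injection results with SEU measurements can be illustrated with a minimal Python sketch. It is not the authors' test setup: the configuration memory is modelled as a plain byte array, the set `sensitive_bits` is a hypothetical stand-in for a real functional test of the design, and the assumption that the functional failure rate equals the measured SEU rate multiplied by the fraction of injected flips that cause a failure is a simplification introduced here for illustration.

```python
import math
import random


def flip_bit(memory: bytearray, bit_index: int) -> None:
    """Invert a single bit of the modelled configuration memory in place."""
    memory[bit_index // 8] ^= 1 << (bit_index % 8)


def run_campaign(n_bits, sensitive_bits, n_injections, seed=0):
    """Inject single bit flips at random positions and count failures.

    `sensitive_bits` is a hypothetical set of bit positions whose upset
    breaks the design; a real campaign would run a functional test instead.
    """
    rng = random.Random(seed)
    memory = bytearray((n_bits + 7) // 8)
    failures = 0
    for _ in range(n_injections):
        bit = rng.randrange(n_bits)
        flip_bit(memory, bit)       # inject the fault
        if bit in sensitive_bits:   # placeholder for the functional test
            failures += 1
        flip_bit(memory, bit)       # restore the bit before the next injection
    return failures


def sensitivity(failures, injections):
    """Return the failing fraction and its binomial standard error."""
    p = failures / injections
    return p, math.sqrt(p * (1.0 - p) / injections)


def failure_rate(seu_rate, sensitive_fraction):
    """Scale a measured SEU rate by the fraction of flips that cause failures."""
    return seu_rate * sensitive_fraction


if __name__ == "__main__":
    # All numbers below are made up for illustration only.
    n_bits = 100_000
    sensitive = set(range(5_000))
    failures = run_campaign(n_bits, sensitive, n_injections=20_000)
    fraction, error = sensitivity(failures, 20_000)
    print(f"sensitive fraction: {fraction:.4f} +/- {error:.4f}")
    print(f"failures per hour at 2 SEUs/hour: {failure_rate(2.0, fraction):.4f}")
```

In this model the SEU rate would come from measurements at a given luminosity, so repeating the last step for different measured rates gives a failure rate estimate per running condition.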

Primary author

Johan Alme (Bergen University College (NO))

Co-authors

Attiq Ur Rehman (University of Bergen (NO))
Christian Lippmann (GSI - Helmholtzzentrum fur Schwerionenforschung GmbH (DE))
Dominik Fehlker (University of Bergen (NO))
Dr Ketil Roeed (University of Bergen (NO))
Kjetil Ullaland (University of Bergen (NO))
Magnus Mager (GSI - Helmholtzzentrum fur Schwerionenforschung GmbH (DE))
