Design of Finite State Machines for SRAM-based FPGAs operated in radiation field

3 Sept 2019, 17:20
20m
Poster Radiation Tolerant Components and Systems Posters

Speaker

Matteo Lupi (CERN / Johann-Wolfgang-Goethe Univ. (DE))

Description

For the CERN LHC Run 3, the ALICE experiment completely redesigned the Inner Tracking System (ITS), which now consists of seven cylindrical layers instrumented with 24120 Monolithic Active Pixel Sensors, covering an area of $10\,\mathrm{m}^2$.
The ITS is controlled and read out by 192 custom Readout Units, which employ commercial SRAM-based FPGAs and will operate in an ionising radiation field, requiring specific FPGA design to ensure system reliability.
This contribution focuses on the techniques developed for designing radiation tolerant finite state machines, discussing the theoretical background, the actual implementation, and their validation with fault injections and proton irradiation tests.

Summary

The new ALICE Inner Tracking System (ITS) employs 24120 ALPIDE sensors throughout all its layers, with more than 12 billion pixels in total. The readout system, composed of 192 identical Readout Units (RUs), has complete control over all sensor operations, including power management and data readout.
Its reliability is, therefore, critical for the correct operation of the entire ITS.
The sensors directly drive the differential high-speed links that connect them to the readout system, so the readout system must be placed as close as possible to the detector to achieve reliable transmission at the required bitrate.
The RUs will be placed at about five meters from the interaction point along the beam axis, and at a radial distance of about one meter.
The expected Total Ionising Dose (TID) for the entire detector life cycle is about 10 krad (safety factor of 10), which does not raise concerns since all the system components have been validated.
Conversely, the expected flux of particles with sufficient energy (>20 MeV) to induce Single-Event Effects (SEEs) in modern microelectronic devices is of the order of $10^{3}\,\mathrm{cm^{-2}\,s^{-1}}$, posing a challenge to the use of commercial SRAM-based FPGAs.
Designing a dedicated ASIC and employing rad-hard-by-design FPGAs were both evaluated, but the lack of flexibility of the former and the large cost/performance penalty of the latter ruled them out.
Irradiation tests showed that, with the specific device employed, a Xilinx Kintex UltraScale XCKU060 FPGA, the whole system of 192 RUs will experience, on average, an SEE affecting the FPGA every 8 s (worst-case scenario).
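As a rough back-of-envelope reading of this figure, and assuming (this is not stated above) that all 192 units see the same flux and upset independently, the system-wide interval can be converted into a mean interval per Readout Unit. The function name is illustrative only.

```python
def mean_interval_per_unit(system_interval_s: float, n_units: int) -> float:
    """Mean time between SEEs on a single unit, given the system-wide mean interval.

    Assumption: identical, independent units, so the upset rates simply add up.
    """
    # System rate = n_units * unit rate, hence unit interval = n_units * system interval.
    return system_interval_s * n_units


per_ru_s = mean_interval_per_unit(8.0, 192)
print(f"{per_ru_s:.0f} s, about {per_ru_s / 60:.1f} min per Readout Unit")
# prints "1536 s, about 25.6 min per Readout Unit"
```

Under these assumptions a single FPGA is hit roughly once every 26 minutes in the worst-case scenario, while the system as a whole is hit every few seconds.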

An external scrubbing sub-system, driven by a flash-based FPGA, ensures the long-term stability of the FPGA design.
However, the SRAM-based FPGA design needs to deal with errors until the scrubbing cycle corrects them.
All Finite State Machines (FSMs) must keep their correct state even in case of errors, since the scrubbing can only restore the static configuration and not the logic state.
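One common way to keep a state register correct between scrubbing passes is triple modular redundancy (TMR) with a majority voter in the feedback path. The abstract does not say which protection schemes were selected, so the behavioural model below is only a sketch of that generic technique; the class, method names and the 4-state counter are hypothetical, and the voter and next-state logic are treated as error-free.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise 2-out-of-3 majority vote."""
    return (a & b) | (a & c) | (b & c)


class TmrStateRegister:
    """Three copies of the FSM state; the voted value is written back on every clock."""

    def __init__(self, state: int) -> None:
        self.copies = [state, state, state]

    def voted(self) -> int:
        return majority(*self.copies)

    def clock(self, next_state_fn) -> None:
        # The next state is computed from the voted state, so a single
        # corrupted copy is overwritten at the next clock edge.
        nxt = next_state_fn(self.voted())
        self.copies = [nxt, nxt, nxt]

    def upset(self, copy: int, bit: int) -> None:
        """Model a single-event upset: flip one bit in one copy."""
        self.copies[copy] ^= 1 << bit


def step(state: int) -> int:
    """Hypothetical 4-state counter FSM used only for illustration."""
    return (state + 1) % 4


reg = TmrStateRegister(0)
reg.clock(step)        # state 0 -> 1
reg.upset(0, 1)        # copy 0 becomes 3, copies 1 and 2 still hold 1
print(reg.voted())     # prints 1: the voter masks the upset
reg.clock(step)
print(reg.copies)      # prints [2, 2, 2]: the corrupted copy has been refreshed
```

The point of the feedback through the voter is that the logic state heals itself within one clock cycle, without waiting for the scrubber, which only repairs the configuration memory.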

In digital design, FSMs are implemented with sequential logic, which stores the present state, and combinatorial networks, which implement the state transition and output functions.
This contribution investigates the interplay between the combinatorial and sequential blocks, how errors happening in one, or both, propagate through successive state changes, and how they affect the final FSM behaviour.
The problem is first approached from a theoretical point of view, providing exact solutions for the expected failure rate of an unprotected FSM and of FSMs with different protection schemes.
An analytical model describes the system behaviour in such cases.
Subsequently, the best protection topologies were implemented into the real design and extensively tested by means of controlled error injections and a proton irradiation test.
The comparison of the analytical model with the test results on dedicated designs will be discussed.
Finally, this contribution will also illustrate how the selected protection schemes have been implemented in the FPGA design.
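To illustrate what comparing an analytical model with fault injection can look like, here is a deliberately simplified toy model; it is not the model presented in this contribution. The assumptions are: each state flip-flop is upset independently with probability `p` per clock cycle, any corrupted state of an unprotected FSM counts as a failure, a TMR register is voted and refreshed every cycle, and errors in the combinatorial logic are ignored. All names and parameter values are hypothetical.

```python
import random


def p_fail_unprotected(p: float, n_bits: int, n_cycles: int) -> float:
    """Probability that at least one state bit is upset within n_cycles."""
    return 1.0 - (1.0 - p) ** (n_bits * n_cycles)


def p_fail_tmr(p: float, n_bits: int, n_cycles: int) -> float:
    """Same, for a TMR register voted and refreshed on every cycle."""
    # A bit is lost only if at least 2 of its 3 copies flip in the same cycle.
    q = 3.0 * p**2 * (1.0 - p) + p**3
    return 1.0 - (1.0 - q) ** (n_bits * n_cycles)


def inject(p: float, n_bits: int, n_cycles: int, copies: int,
           trials: int, rng: random.Random) -> float:
    """Software fault injection: fraction of trials in which the FSM state is lost."""
    need = copies // 2 + 1  # flips needed to defeat the vote (1 if unprotected)
    failures = 0
    for _ in range(trials):
        for _ in range(n_bits * n_cycles):
            flips = sum(rng.random() < p for _ in range(copies))
            if flips >= need:
                failures += 1
                break
    return failures / trials


rng = random.Random(1)
p, n_bits, n_cycles, trials = 0.01, 3, 10, 10_000  # illustrative values only
print("unprotected:", p_fail_unprotected(p, n_bits, n_cycles),
      inject(p, n_bits, n_cycles, 1, trials, rng))
print("TMR:        ", p_fail_tmr(p, n_bits, n_cycles),
      inject(p, n_bits, n_cycles, 3, trials, rng))
```

The injected failure fractions should scatter around the closed-form values. A realistic model would also have to cover upsets in the configuration memory that implements the combinatorial logic, which is the interplay this contribution addresses.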

Primary authors

Matteo Lupi (CERN / Johann-Wolfgang-Goethe Univ. (DE))
Prof. Piero Giubilato (Universita e INFN, Padova (IT))

Co-authors

Matthias Bonora (CERN / University of Salzburg (AT))
Krzysztof Marek Sielewicz (CERN)
