# **CIBDS FMECA**

### **BISv2** Reliability Study Progress Meetings



### **Project statistics**

#### Total of 794 components

- Capacitors and resistors in total make up almost 80%
  - Average failure rate of 1.8 FITS
- 23 transistors
  - Failure rate doubled if dual
- Critical component: FPGA
  - Failure rate of 11 FITS assumed universally to each critical failure mode





### Failure rates across various design pages Total: 2,788 FITS

Figure: FITS of design pages (predicted number of failures in 10<sup>9</sup> hours).
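To put the board total on a per-year scale, the FITS figure can be converted directly. This is a minimal sketch, assuming a constant failure rate, continuous operation (8,760 hours per year) and a single board; the function names are illustrative and not part of the study.

```python
HOURS_PER_YEAR = 8760  # 365 days x 24 h, leap years ignored (assumption)


def fits_to_failures_per_year(fits: float) -> float:
    """Convert FITS (failures per 1e9 device-hours) to expected failures per year."""
    return fits * 1e-9 * HOURS_PER_YEAR


def mean_years_between_failures(fits: float) -> float:
    """Mean time between failures in years, for a constant failure rate."""
    return 1.0 / fits_to_failures_per_year(fits)


total_fits = 2788  # board total quoted in the heading above
print(f"{fits_to_failures_per_year(total_fits):.4f} failures/year")
print(f"{mean_years_between_failures(total_fits):.1f} years between failures")
```

Under these assumptions, 2,788 FITS corresponds to roughly 0.024 failures per year, or about one failure in 41 years per board.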





### **Breakdown of end-effect options**

- Blind failure
  - <u>Blind</u>
  - Blind synchronous (only synchronous request sent)

merged

- Blind asynchronous (never used)
- Link mode
  - Blind sent external
  - Blind generate external
- False dumps
  - Asynchronous false dump in local
  - <u>Asynchronous false dump</u>
  - False dump synchronous
  - False dump
- Maintenance

→ 1 in 1,000 years

→ 1 in a year

→ 1 in 10 years

→ 1 in a year







### **Blind failures** Total: 15.5 FITS

#### **Contributing components:**

- OSC1, Oscillator; Parameter change: **7.8 FITS (α 100%) x2**

#### **Previously in this category:**

1. IC32, Artix-7 FPGA; Short: **11 FITS (α 100%)**
   - ONLY with additional factors (see comment); 1st order failure: blind sync

The FPGA would not meet its timing requirements, which could potentially lead to a blind failure.

A short between the SFP RX/TX pins (which are next to each other) gives blind sync; an additional short between the pulse to the TDU and GND gives blind async.



### **Blind Sync – missing asynchronous request** Total: 39.5 FITS

#### **Contributing components:**

1. IC19, MOSFET Drivers; open, short: 2 x 1.4 = 2.8 FITS
2. IC17, Optocoupler; diode/transistor stuck closed/short, short from diode: 2.3 FITS
3. IC29, Buffer; open, short, stuck low: **0.7 FITS**
4. IC32, Artix-7 FPGA; short: **11 FITS (α 100%)**

This applies only if local mode is considered, and is probably not a real contribution, since the feedback to the TSU is presumably checked.

Also contributing: two resistors, a capacitor and an AND gate (all below 1 FITS).



### Blind Link mode Total: 36.1 FITS

#### **Contributing components:**

1. IC32 Artix-7 FPGA; Stuck low: **11 FITS (α 100%)**
2. IC38 Spartan-7 FPGA; Stuck low: **11 FITS (α 100%)**
3. IC56 Single Buffer Gate; Short, open, stuck low: 0.7 FITS (α 50%)
4. IC20 RS-485 Transceiver; Open, stuck low: 0.4 FITS (α 27%)
5. IC21 RS-485 Transceiver; Open: **0.24 FITS (α 17%)**
6. R142 Resistor; Open: **0.22 FITS (α 30%)**



### **False dump – asynchronous** Total: 45.4 FITS

#### **Contributors:**

1. IC32 Artix-7 FPGA; open, stuck high: **22 FITS (α 100%)**
2. ZD1 TVS Diode; short: **3.4 FITS (α 80%)**
3. IC17 Optocoupler; diode or transistor stuck open: 2.9 FITS (α 50%)
4. LM1 Elbow socket; open, poor contact, short: 2 FITS (α 100%)
5. D9 High-speed switching diodes; short: **1.8 FITS (α 80%)**

Remaining: gates, diodes, buffer, resistors, capacitor and ferrite bead contribute less than 1 FITS each.

#### All contributors (aside from the FPGA) are on the CP Front Panel IO page.
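As a quick consistency check on this slide, the listed contributors can be summed and compared with the quoted total. This sketch assumes the per-component figures are the contributions counted in the 45.4 FITS total (the slide does not say whether α is already applied); the variable names are illustrative.

```python
import math

# Contributions as listed on the slide, in FITS (assumed to be counted as-is in the total)
listed_fits = {
    "IC32 Artix-7 FPGA": 22.0,
    "ZD1 TVS diode": 3.4,
    "IC17 optocoupler": 2.9,
    "LM1 elbow socket": 2.0,
    "D9 switching diodes": 1.8,
}
quoted_total_fits = 45.4

listed_sum = sum(listed_fits.values())
remainder = quoted_total_fits - listed_sum

# Each remaining component contributes strictly less than 1 FITS, so the
# number of such components must exceed the remainder expressed in FITS.
min_remaining_components = math.floor(remainder) + 1

print(f"listed: {listed_sum:.1f} FITS, remainder: {remainder:.1f} FITS")
print(f"at least {min_remaining_components} minor contributors")
```

Under that assumption, the five listed parts account for 32.1 FITS, leaving about 13.3 FITS to the minor contributors.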



### **False dump – synchronous** Total: 668 FITS

#### Group contributors to synchronous false beam dumps

Figure: component categories contributing to the false beam dumps.



Top contributors: OSC5 (Monitor FPGA), T15 Dual N-Channel Transistor (Power Management), IC28 FPGA (Monitor FPGA), OSC1 (ARTIX\_7\_FPGA\_IO), IC42, IC43 and IC44 (Power Management), and the ICTX1 and ICRX1 translators.



## Behaviour under short of decoupling capacitors?

"In many failure modes (shorts between VCC and GND for example), we would only have a false dump sync, without the asynchronous dump request. This would happen in both A and B paths simultaneously, because they are shared on the same board. The BIS post mortem analysis checks that each synchronous dump is followed by an asynchronous dump request (arriving with a longer latency). So I think it could potentially happen once, but we wouldn't be able to refill the machine (or even access the CIBDS) afterwards."

- The uQDS project showed that decoupling capacitor shorts (even partial and intermittent ones) can cause uncontrolled outputs at FPGA level
  - Is it clear how the CIBDS FPGA will behave under such circumstances? The documentation does not give a clear answer (https://docs.amd.com/v/u/en-US/ug483_7Series_PCB)
  - uQDS uses an IGLOO FPGA, for which brown-out detection is required (https://onlinedocs.microchip.com/pr/GUID-2952C8AA-A592-489E-8058-3FD06065EDDB-en-US-2/index.html?GUID-F5331690-5A67-4B65-8B6F-4EBF72A08AD6)





### Maintenance Total: 980 FITS





### Conclusions

#### Takeaways for the global model

- Blind failures: 15.5 FITS
  - Attributed to a single failure mode of an oscillator
- False beam dumps:
  - Asynchronous: 45.4 FITS
  - Synchronous: 668 FITS
- Link mode malfunction: 36.1 FITS
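To feed these rates into a global model, one common approach is to treat each category as a constant-rate (exponential) process. The sketch below makes that assumption, which the slides do not state, and computes the probability of at least one event over an operating period; the 10-year period and all names are illustrative.

```python
import math

HOURS_PER_YEAR = 8760  # continuous operation assumed

# Takeaway rates from the list above, in FITS (failures per 1e9 hours)
takeaway_fits = {
    "blind failure": 15.5,
    "false dump (asynchronous)": 45.4,
    "false dump (synchronous)": 668.0,
    "link mode malfunction": 36.1,
}


def prob_at_least_one(fits: float, years: float) -> float:
    """P(at least one event in `years`) for a constant failure rate: 1 - exp(-lambda * t)."""
    rate_per_hour = fits * 1e-9
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x
    return -math.expm1(-rate_per_hour * years * HOURS_PER_YEAR)


for name, fits in takeaway_fits.items():
    print(f"{name}: {prob_at_least_one(fits, years=10):.2e}")
```

For rates this small the result is close to the simple product of rate and time, so the exponential form mainly matters for the larger categories or longer periods.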




