# **Data Processing on FPGAs in Space**




Thematic CERN School of Computing 2024

Peter Hinderberger, Technical University of Munich (TUM)

June 12, 2024





# **Field-Programmable Gate Arrays**

Quick Recap

- Principle:
  - Network of Registers, Look-up Tables (LUTs), RAM blocks, Digital Signal Processors (DSPs), Buffers, Clock Circuitry, Transceivers, etc.
  - → Wire them flexibly
  - → Hardware Description Language (HDL), High-Level Synthesis (HLS)
- Advantages: flexible, inherently parallel, high number of I/Os, power-efficient
- Drawbacks:
  - Low clock frequencies: O(100) MHz vs. O(1) GHz on a modern CPU
  - High programming complexity
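To see why a lower clock need not mean lower throughput, here is a rough back-of-the-envelope sketch. The operation counts per cycle are purely illustrative assumptions (not measurements of any device); only the clock orders of magnitude come from the slide.

```python
def throughput_ops_per_s(clock_hz: float, ops_per_cycle: int) -> float:
    """Peak throughput = clock frequency x operations completed per cycle."""
    return clock_hz * ops_per_cycle


# Illustrative assumptions only:
# - FPGA: 100 MHz clock, 1000 arithmetic units working in parallel
# - CPU:  1 GHz clock, one core with 8 SIMD lanes
fpga = throughput_ops_per_s(clock_hz=100e6, ops_per_cycle=1000)
cpu = throughput_ops_per_s(clock_hz=1e9, ops_per_cycle=8)

print(f"FPGA: {fpga:.1e} ops/s, CPU core: {cpu:.1e} ops/s, ratio: {fpga / cpu:.1f}")
```

Under these assumed numbers the parallel fabric more than compensates for the tenfold slower clock; real designs are limited by available resources, routing and I/O.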








ORIGINS Excellence Cluster



# **Data Processing on Small Satellites**

Can't we shoot that into space?


- Limited Power
- Limited Volume
- Limited Transmission
- Need for Reconfigurability
- Radiation Environment

  - Several Mitigation Techniques: Triple Modular Redundancy (TMR), hardened components, scrubbing, ...
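As an illustration of the TMR idea mentioned above, the following minimal sketch models a bitwise 2-out-of-3 majority voter in software. On an FPGA the three copies would be replicated logic or registers; the function name and the test pattern are hypothetical.

```python
def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-out-of-3 majority vote over three redundant copies."""
    return (a & b) | (a & c) | (b & c)


golden = 0b1011_0010
upset = golden ^ 0b0000_1000  # a single-event upset flips one bit in one copy

voted = tmr_vote(golden, upset, golden)
print(voted == golden)  # the voter masks the single corrupted copy
```

The voter only masks an error as long as at most one copy is wrong for a given bit, which is why TMR is typically combined with scrubbing to repair corrupted copies before a second upset accumulates.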

Figure: FPGA architecture (Input/Output Blocks, Logic Blocks)

From: https://www.logic-fruit.com/blog/fpga/fpga-design-architecture-and-applications/

- Example: Energetic Particle Detector (EPD) aboard ESA's Solar Orbiter Mission
  - Trigger and Data Acquisition for several scientific instruments
  - Central data processing, including a softcore processor on the Instrument Control Unit (ICU)

→ Reduce Data Bandwidth

cf. Rodríguez-Pacheco et al., Astronomy & Astrophysics 642, A7 (2020)

- Other Applications:
  - Remote sensing, hyperspectral imaging
  - Communications
  - Cryptography
  - Radar

. . .



From: https://en.wikipedia.org/wiki/Solar\_Orbiter




# A practical example

Need to make it compact

- Detector: 3D Scintillating Fiber Matrix as Tracking Calorimeter
- Platform: NanoSat
- Objective:
  - Measure Antiproton Flux + Antiproton-to-Proton Ratio
- Requirements:
  - Lower the bandwidth as much as possible
  - Keep as much information as possible








### ➔ "Compress"

### **Questions?**







# BACKUP

## **Compression Techniques**

Every sense of the word

Diagram: taxonomy of "Compression" with three top-level branches: Lossless, Filtering, Lossy. Labels shown in the diagram: Trigger; Entropy Coding; Transformation-Based; **Event Filter**; Hough-ANS; Fourier; Wavelet; Online-Analysis and Background Reduction; Latent-Space/**Dimensionality Reduction**; Zero-Suppression; Classical Approaches; **Neural Nets**; Threshold; PCA/SVD; Length + Address Coding

+ Combinations
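One of the simplest techniques named above, threshold-based zero-suppression with address coding, can be sketched as follows. This is a minimal software model under assumed inputs (the sample frame and the threshold are made up), not the scheme used on any particular instrument.

```python
from typing import List, Tuple


def zero_suppress(samples: List[int], threshold: int) -> List[Tuple[int, int]]:
    """Keep only (address, value) pairs whose value exceeds the threshold."""
    return [(addr, val) for addr, val in enumerate(samples) if val > threshold]


def restore(pairs: List[Tuple[int, int]], length: int, fill: int = 0) -> List[int]:
    """Rebuild a full frame; suppressed channels are set to `fill`."""
    frame = [fill] * length
    for addr, val in pairs:
        frame[addr] = val
    return frame


# Hypothetical 16-channel frame: mostly pedestal noise, three real hits.
frame = [0, 1, 0, 0, 37, 52, 2, 0, 0, 0, 0, 19, 0, 1, 0, 0]
hits = zero_suppress(frame, threshold=5)

print(f"{len(frame)} samples -> {len(hits)} (address, value) pairs: {hits}")
```

Values at or below the threshold are discarded, so the reconstruction is exact only for the channels that were kept; the saving depends on how sparse the frames are, since each surviving sample now also carries its address.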



# Kintex<sup>®</sup> UltraScale<sup>™</sup> FPGAs

|                            | Device Name                       | KU025 <sup>(1)</sup> | KU035     | KU040     | KU060     | KU085     | KU095             | KU115                |
|----------------------------|-----------------------------------|----------------------|-----------|-----------|-----------|-----------|-------------------|----------------------|
| Logic Resources            | System Logic Cells (K)            | 318                  | 444       | 530       | 726       | 1,088     | 1,176             | 1,451                |
|                            | CLB Flip-Flops                    | 290,880              | 406,256   | 484,800   | 663,360   | 995,040   | 1,075,200         | 1,326,720            |
|                            | CLB LUTs                          | 145,440              | 203,128   | 242,400   | 331,680   | 497,520   | 537,600           | 663,360              |
| Memory Resources           | Maximum Distributed RAM (Kb)      | 4,230                | 5,908     | 7,050     | 9,180     | 13,770    | 4,800             | 18,360               |
|                            | Block RAM/FIFO w/ECC (36Kb each)  | 360                  | 540       | 600       | 1,080     | 1,620     | 1,680             | 2,160                |
|                            | Block RAM/FIFO (18Kb each)        | 720                  | 1,080     | 1,200     | 2,160     | 3,240     | 3,360             | 4,320                |
|                            | Total Block RAM (Mb)              | 12.7                 | 19.0      | 21.1      | 38.0      | 56.9      | 59.1              | 75.9                 |
| Clock Resources            | CMT (1 MMCM, 2 PLLs)              | 6                    | 10        | 10        | 12        | 22        | 16                | 24                   |
|                            | I/O DLL                           | 24                   | 40        | 40        | 48        | 56        | 64                | 64                   |
| I/O Resources              | Maximum Single-Ended HP I/Os      | 208                  | 416       | 416       | 520       | 572       | 650               | 676                  |
|                            | Maximum Differential HP I/O Pairs | 96                   | 192       | 192       | 240       | 264       | 288               | 312                  |
|                            | Maximum Single-Ended HR I/Os      | 104                  | 104       | 104       | 104       | 104       | 52                | 156                  |
|                            | Maximum Differential HR I/O Pairs | 48                   | 48        | 48        | 48        | 56        | 24                | 72                   |
| Integrated IP<br>Resources | DSP Slices                        | 1,152                | 1,700     | 1,920     | 2,760     | 4,100     | 768               | 5,520                |
|                            | System Monitor                    | 1                    | 1         | 1         | 1         | 2         | 1                 | 2                    |
|                            | PCIe <sup>®</sup> Gen1/2/3        | 1                    | 2         | 3         | 3         | 4         | 4                 | 6                    |
|                            | Interlaken                        | 0                    | 0         | 0         | 0         | 0         | 2                 | 0                    |
|                            | 100G Ethernet                     | 0                    | 0         | 0         | 0         | 0         | 2                 | 0                    |
|                            | 16.3Gb/s Transceivers (GTH/GTY)   | 12                   | 16        | 20        | 32        | 56        | 64 <sup>(2)</sup> | 64                   |
| Speed Grades               | Commercial                        | -1                   | -1        | -1        | -1        | -1        | -1                | -1                   |
|                            | Extended                          | -2                   | -2 -3     | -2 -3     | -2 -3     | -2 -3     | -2                | -2 -3                |
|                            | Industrial                        | -1 -2                | -1 -1L -2 | -1 -1L -2 | -1 -1L -2 | -1 -1L -2 | -1 -2             | -1 -1L -2            |

# Example

7-Tap FIR-Filter with parallel implementation





From: https://vhdlwhiz.com/part-2-finite-impulse-response-fir-filters/
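A behavioural sketch of such a filter is given below, assuming a direct-form structure: a 7-entry shift register, seven multipliers that all fire in the same cycle, and an adder tree. The coefficients are hypothetical, and on an FPGA each multiply would typically map to a DSP slice; here the parallelism is only modelled, not executed concurrently.

```python
from typing import List, Sequence


def fir_parallel(samples: Sequence[int], taps: Sequence[int]) -> List[int]:
    """Direct-form FIR: y[n] = sum_k taps[k] * x[n-k], with x[n] = 0 for n < 0."""
    delay_line = [0] * len(taps)  # shift register, one entry per tap
    out = []
    for x in samples:  # one loop iteration models one clock cycle
        delay_line = [x] + delay_line[:-1]  # clock in the new sample
        products = [h * d for h, d in zip(taps, delay_line)]  # all multipliers at once
        out.append(sum(products))  # adder tree
    return out


TAPS = [1, 2, 3, 4, 3, 2, 1]  # hypothetical symmetric coefficients

impulse = [1] + [0] * 7
print(fir_parallel(impulse, TAPS))  # the impulse response reproduces the taps
```

In the fully parallel form one output sample is produced per clock cycle at the cost of one multiplier per tap; a serial implementation would reuse a single multiplier over seven cycles instead.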



# **Particle Physics Processing on Small Satellites**

Example from our own research



- 1024 Fibers in 64 Readout-ASICs
- Signal-to-Background-Ratio: 10<sup>-7</sup> to 10<sup>-9</sup>
- Event Rates O(10<sup>5</sup>)
- Bragg-Curve Spectroscopy + Event-Topology
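The numbers above already indicate why on-board filtering is needed. The following rough estimate rests on loudly stated assumptions: the event rate is taken as 10<sup>5</sup> per second (the slide gives no unit), the signal-to-background ratio is read as signal events per background event, and 12 bits per fiber per event is a hypothetical readout width.

```python
SECONDS_PER_DAY = 86_400


def signal_events_per_day(event_rate_hz: float, signal_to_background: float) -> float:
    """Expected signal events per day for a given total rate and S/B ratio."""
    return event_rate_hz * SECONDS_PER_DAY * signal_to_background


def raw_rate_bits_per_s(channels: int, bits_per_channel: int, event_rate_hz: float) -> float:
    """Data rate if every channel were read out for every event."""
    return channels * bits_per_channel * event_rate_hz


EVENT_RATE_HZ = 1e5    # assumption: O(10^5) interpreted as events per second
BITS_PER_CHANNEL = 12  # hypothetical readout width, not from the slides
N_FIBERS = 1024        # from the slide

for sb in (1e-7, 1e-9):
    n = signal_events_per_day(EVENT_RATE_HZ, sb)
    print(f"S/B = {sb:.0e}: about {n:.1f} signal events per day")

raw = raw_rate_bits_per_s(N_FIBERS, BITS_PER_CHANNEL, EVENT_RATE_HZ)
print(f"Unfiltered readout: about {raw / 1e9:.2f} Gbit/s")
```

Under these assumptions only a handful to a few hundred events per day are of interest, while the unfiltered stream would be on the order of a gigabit per second, far beyond what a small satellite can downlink.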



Figure: Annihilating Antiproton in the fiber detector (dimension label: 32 fibers)

# The Antiproton Flux In Space Mission

Event Topologies



