6–10 Oct 2025
Rethymno, Crete, Greece
Europe/Athens timezone

A Versatile Readout System for Front-End ASICs with HLS-based Hardware-Accelerated Processing Capabilities

10 Oct 2025, 09:00
16m
AQUILLES, Aquila

Oral Programmable Logic, Design and Verification Tools and Methods Logic

Speaker

Valerio Pagliarino (INFN Torino and Politecnico di Torino)

Description

This paper presents a versatile readout system for particle-detector front-end ASICs based on the AMD Zynq UltraScale+ System-on-Chip family. The system is suitable both for extensive laboratory characterization and for data acquisition at testbeam facilities. Scripting the test procedure at the software level reduces the firmware development effort and maximizes the reusability of the system across different devices under test (DUTs). At the same time, integration with High-Level Synthesis (HLS) flows allows demanding pipelined and parallel processing algorithms to be deployed in hardware, offloading the ARM processor. The presentation describes the architecture of the system along with the experimental results obtained in different use cases.

Summary (500 words)

This paper presents a versatile readout system for mixed-signal front-end ASICs that leverages the capabilities of modern System-on-Chips through a software-oriented approach.

The system is based on an AMD Zynq UltraScale+ XCZU9EG SoC connected to the ASIC digital interfaces, complemented by a 5 GS/s arbitrary waveform generator for emulating realistic signals from particle detectors and by a Tektronix LPD64 low-profile digitizer (8 GHz bandwidth, 25 GS/s sample rate) for analog signal acquisition. These devices are synchronized to a single clock timebase and interconnected through a 1/10 Gbps network.

The system is designed to maximize reusability across different DUTs thanks to device-independent and test-independent firmware and software components. These provide 4 parallel 128-bit DMA FIFOs that stream the incoming data from the ASIC data links to the embedded multi-core ARM CPU at up to 1920 MBps. In addition, 16 GPIO buses and 4 highly customizable I2C/SPI master controllers are available for managing the slow control of a variety of ASICs. A logic analyzer is also embedded in the FPGA. Coded in SystemVerilog, these firmware blocks are managed by a unified C++/Python library running on the processing-system side of the SoC.
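As a rough sanity check on the FIFO figures, the sketch below computes the word clock needed to sustain a given byte rate when one full-width word is transferred per cycle. The function name is hypothetical. The abstract does not say whether 1 MB means 10^6 bytes or whether the 1920 MBps figure refers to one FIFO or to all four together; assuming 10^6 bytes and a single 128-bit FIFO, the figure would correspond to a 120 MHz word clock.

```cpp
#include <cassert>

// Hypothetical helper, not part of the system's library.
// Returns the clock frequency (Hz) needed to sustain `bytes_per_second`
// when exactly one word of `width_bits` is transferred on every cycle.
inline double required_clock_hz(double bytes_per_second, unsigned width_bits) {
    assert(width_bits > 0 && width_bits % 8 == 0);  // whole bytes only
    const double bytes_per_word = width_bits / 8.0;
    return bytes_per_second / bytes_per_word;
}
```

Under the stated assumptions, `required_clock_hz(1920e6, 128)` evaluates to `120e6`.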

Thanks to this approach, when a new ASIC must be tested, the firmware development flow is reduced to the design of a small chip-dependent component that handles the deserialisation of the serial links and the electrical interfacing of the signals with the available control buses and data FIFOs. The main effort therefore moves to the software side and consists of developing a test procedure as a Python Jupyter notebook, forked from a base project. Taking advantage of the unified library, this interface loads custom waveforms into the detector emulator, probes analog signals, performs control operations and receives the data from the ASIC, all in an integrated environment that includes the Python scientific libraries and the CERN ROOT framework. A large variety of RS232 and Ethernet laboratory equipment, including clock generators, climatic chambers and moving stages, can be controlled from the Python test script.

A High-Level Synthesis flow allows complex signal-processing, image-processing or clustering algorithms to be deployed on the programmable logic. Starting from a data-processing algorithm coded in C++, parallelisable functions are identified, optimized and annotated with specific pragma directives. The AMD Vitis HLS tool then translates these functions into a packaged Verilog IP that communicates with the main CPU using memory-mapped AXI resources. In this way, pipelined and parallel calculations can be executed efficiently, offloading the ARM processor.
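To make the flow more concrete, here is a minimal sketch of what a pragma-annotated C++ function might look like before it is handed to Vitis HLS. The kernel itself (baseline subtraction with zero suppression), its name and its parameters are illustrative assumptions and are not taken from the system described here. The `INTERFACE` and `PIPELINE` pragmas are standard Vitis HLS directives; an ordinary C++ compiler ignores them, so the same function can be checked in software first.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical kernel: subtract a baseline from each sample and zero every
// result below `threshold`. Returns the number of samples that were kept.
std::uint32_t suppress_baseline(const std::int16_t* in, std::int16_t* out,
                                std::uint32_t n, std::int16_t baseline,
                                std::int16_t threshold) {
// Data buffers are reached through memory-mapped AXI master ports.
#pragma HLS INTERFACE m_axi port=in offset=slave bundle=gmem0
#pragma HLS INTERFACE m_axi port=out offset=slave bundle=gmem1
// Scalar arguments and the return value sit in AXI4-Lite registers,
// which the CPU writes and reads to control the block.
#pragma HLS INTERFACE s_axilite port=n
#pragma HLS INTERFACE s_axilite port=baseline
#pragma HLS INTERFACE s_axilite port=threshold
#pragma HLS INTERFACE s_axilite port=return
    assert(in != nullptr && out != nullptr);
    std::uint32_t kept = 0;
    for (std::uint32_t i = 0; i < n; ++i) {
// Ask the tool to start one loop iteration per clock cycle.
#pragma HLS PIPELINE II=1
        const std::int16_t v = static_cast<std::int16_t>(in[i] - baseline);
        const bool keep = v >= threshold;
        out[i] = keep ? v : static_cast<std::int16_t>(0);
        kept += keep ? 1u : 0u;
    }
    return kept;
}
```

For example, with samples {100, 105, 130, 99}, a baseline of 100 and a threshold of 10, only the third sample survives (as 30) and the function returns 1.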

This integrated test system has so far been used for the readout of the ALCOR 32-channel ASIC and of the MADPIX monolithic sensor at the DESY testbeam facility. In the latter case, the SoC was also interfaced with the AIDA apparatus to synchronize the sensor acquisition with the Mimosa tracker.

Finally, a use case involving the HLS acceleration of a post-processing algorithm is currently being tested: X-ray tomographic image reconstruction from the Arcadia MD3 sensor. In this case, the filtered-backprojection (FBP) algorithm has been accelerated in firmware.
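For readers unfamiliar with FBP, the following is a plain C++ reference sketch of a textbook parallel-beam version: each projection is convolved with a spatial-domain Ram-Lak (ramp) kernel and then smeared back across the image with linear interpolation. It is not the authors' firmware implementation. The geometry (parallel beam, unit detector spacing, angles evenly spread over 180 degrees) and all names are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Sinogram = std::vector<std::vector<double>>;  // [angle][detector bin]

constexpr double kPi = 3.14159265358979323846;

// Discrete Ram-Lak kernel for unit detector spacing.
inline double ram_lak(int n) {
    if (n == 0) return 0.25;
    if (n % 2 == 0) return 0.0;
    const double d = static_cast<double>(n);
    return -1.0 / (kPi * kPi * d * d);
}

// Filtering step: direct convolution of one projection with the ramp kernel.
std::vector<double> filter_projection(const std::vector<double>& p) {
    const int m = static_cast<int>(p.size());
    std::vector<double> q(p.size(), 0.0);
    for (int s = 0; s < m; ++s) {
        double acc = 0.0;
        for (int k = 0; k < m; ++k) acc += p[k] * ram_lak(s - k);
        q[s] = acc;
    }
    return q;
}

// Reconstructs a size x size image (row-major) from the sinogram.
std::vector<double> fbp(const Sinogram& sino, int size) {
    assert(size > 0);
    std::vector<double> image(static_cast<std::size_t>(size) * size, 0.0);
    const std::size_t n_angles = sino.size();
    if (n_angles == 0) return image;
    const double img_c = (size - 1) / 2.0;  // image centre in pixels
    for (std::size_t a = 0; a < n_angles; ++a) {
        const std::vector<double> q = filter_projection(sino[a]);
        const int m = static_cast<int>(q.size());
        const double det_c = (m - 1) / 2.0;  // detector centre in bins
        const double theta = kPi * static_cast<double>(a) / n_angles;
        const double c = std::cos(theta), s = std::sin(theta);
        for (int y = 0; y < size; ++y) {
            for (int x = 0; x < size; ++x) {
                // Detector coordinate that this pixel projects onto.
                const double t = (x - img_c) * c + (y - img_c) * s + det_c;
                const int i0 = static_cast<int>(std::floor(t));
                if (i0 < 0 || i0 + 1 >= m) continue;  // outside the detector
                const double w = t - i0;
                image[static_cast<std::size_t>(y) * size + x] +=
                    (1.0 - w) * q[i0] + w * q[i0 + 1];
            }
        }
    }
    for (double& v : image) v *= kPi / static_cast<double>(n_angles);
    return image;
}
```

The two inner pixel loops are independent across pixels, which is the kind of structure that pipelining and unrolling directives can exploit in an HLS flow.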

Authors

Alberto Bortone (Universita e INFN Torino (IT))
Angelo Rivetti (Universita e INFN Torino (IT))
Davide Falchieri (Universita e INFN, Bologna (IT))
Ms Emanuela Petrini (University of Turin)
Manuel Dionisio Da Rocha Rolo
Stefan Cristi Zugravel (INFN Torino (IT), DET Politecnico di Torino (IT))
Valerio Pagliarino (INFN Torino and Politecnico di Torino)
