25–29 May 2026
Chulalongkorn University
Asia/Bangkok timezone

Dynamic Timeslice Building for CBM Using UCX

Not scheduled
1m

Poster Presentation Track 2 - Online and real-time computing Poster

Speaker

Jan de Cuveland (Goethe University Frankfurt (DE))

Description

The CBM experiment at GSI/FAIR will investigate QCD matter at high baryon densities with a free-streaming, self-triggered detector readout delivering time-stamped data on approximately 5000 input links. Designed for aggregate data rates exceeding 1 TB/s, the First-level Event Selector (FLES) system performs timeslice building, aggregating these streams into overlapping processing intervals for online event reconstruction.

Years of production experience with the original Flesnet software stack at the mCBM FAIR Phase-0 experiment revealed limitations in the monolithic run concept, particularly regarding resilience against detector malfunctions and external failures. These challenges motivated a complete rewrite of the timeslice building infrastructure.

The new system introduces a central manager architecture that enables dynamic load balancing and fault tolerance across the FLES HPC cluster. Key innovations include wall-time-driven operation with opportunistic timeouts, dynamic calculation of timeslice components with flexible overlap, dynamic buffer management for improved memory efficiency, and support for live scaling of build nodes during data acquisition. Communication between senders, builders, and the central manager uses UCX (Unified Communication X), which provides efficient RDMA transport over InfiniBand while remaining flexible enough to support alternative network technologies.
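As an illustration of wall-time-driven timeslices with a configurable overlap, the following minimal Python sketch computes the time window of a timeslice and the set of timeslices that a given timestamp falls into. It is not taken from the FLES software: the names `TimesliceInterval`, `timeslice_interval`, and `timeslices_containing`, the nanosecond units, and the assumption that every timeslice consists of a fixed-length core interval followed by an overlap region are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimesliceInterval:
    """Hypothetical description of one timeslice window (all times in ns)."""
    index: int
    core_start_ns: int  # start of the core interval (inclusive)
    core_end_ns: int    # end of the core interval (exclusive)
    end_ns: int         # end including the overlap region (exclusive)


def timeslice_interval(index, run_start_ns, core_ns, overlap_ns):
    """Return the wall-time window covered by timeslice `index`.

    Assumption: consecutive core intervals are contiguous, and each
    timeslice extends `overlap_ns` into the core of its successor.
    """
    if index < 0:
        raise ValueError("index must be non-negative")
    if core_ns <= 0 or overlap_ns < 0:
        raise ValueError("core_ns must be positive and overlap_ns non-negative")
    core_start = run_start_ns + index * core_ns
    core_end = core_start + core_ns
    return TimesliceInterval(index, core_start, core_end, core_end + overlap_ns)


def timeslices_containing(t_ns, run_start_ns, core_ns, overlap_ns):
    """Return the indices of all timeslices whose window contains `t_ns`."""
    if core_ns <= 0 or overlap_ns < 0:
        raise ValueError("core_ns must be positive and overlap_ns non-negative")
    rel = t_ns - run_start_ns
    if rel < 0:
        return []  # timestamp lies before the start of the run
    last = rel // core_ns  # timeslice whose core interval contains t_ns
    first = max(0, (rel - overlap_ns) // core_ns)  # earliest overlap that still reaches t_ns
    return list(range(first, last + 1))
```

With a core length of 100 ns and an overlap of 20 ns, a timestamp at 110 ns belongs to timeslices 0 and 1, whereas a timestamp at 120 ns belongs only to timeslice 1. Because the window is computed from the index and the run parameters alone, the overlap can in principle be changed without any per-sender bookkeeping.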

The system gracefully handles non-ideal conditions: failing senders are bypassed after short timeouts, failing builders are automatically excluded from scheduling, and the manager can be restarted during active runs. Initial deployment in development setups demonstrates the system's operational readiness for SIS100 commissioning.
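The failure handling described above can be illustrated with a small Python sketch. It is a simplified model under stated assumptions, not the actual manager implementation: the classes `Builder` and `Manager`, the least-pending-first assignment policy, and the helper `collect_components` are hypothetical names and choices introduced only for this example, and no UCX communication is modelled.

```python
from dataclasses import dataclass


@dataclass
class Builder:
    """Hypothetical bookkeeping entry for one build node."""
    name: str
    pending: int = 0      # timeslices assigned but not yet completed
    healthy: bool = True  # set to False once the node is considered failed


class Manager:
    """Toy central manager that assigns timeslices to healthy builders."""

    def __init__(self):
        self.builders = {}

    def add_builder(self, name):
        # Builders may be added while assignment is already in progress.
        self.builders[name] = Builder(name)

    def mark_failed(self, name):
        # A failed builder stays registered but is skipped by assign().
        self.builders[name].healthy = False

    def complete(self, name):
        # Called when a builder reports a finished timeslice.
        self.builders[name].pending -= 1

    def assign(self, timeslice_index):
        """Return the name of the chosen builder, or None if none is healthy."""
        candidates = [b for b in self.builders.values() if b.healthy]
        if not candidates:
            return None
        # Assumed policy: fewest pending timeslices first, ties broken by name.
        chosen = min(candidates, key=lambda b: (b.pending, b.name))
        chosen.pending += 1
        return chosen.name


def collect_components(arrival_s, deadline_s):
    """Split senders into those that delivered by the deadline and those bypassed.

    `arrival_s` maps a sender name to its arrival time in seconds,
    or to None if the sender never delivered its component.
    """
    received = sorted(s for s, t in arrival_s.items()
                      if t is not None and t <= deadline_s)
    bypassed = sorted(s for s in arrival_s if s not in received)
    return received, bypassed
```

In this model, a builder marked as failed simply stops receiving new timeslices, and a builder added during operation is preferred at first because it has no pending work. A sender that misses the deadline is listed as bypassed, so the timeslice can be completed without it.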

This work is supported by BMFTR (05P24RF3).

Author

Jan de Cuveland (Goethe University Frankfurt (DE))

Co-authors

Dirk Hutter (Goethe University Frankfurt (DE))
Volker Lindenstruth (Goethe University Frankfurt (DE))
