Description
The High-Luminosity LHC will vastly increase both the volume and complexity of data to be processed within the CMS software framework (CMSSW), pushing computational throughput to its limits. Efficient use of accelerator hardware, especially GPUs, will be central to sustaining reconstruction and analysis performance under these conditions. Among the most impactful design choices for GPU-accelerated workloads is data layout, as memory-access patterns strongly influence the achievable level of coalesced reads and overall hardware utilization. Structure-of-Arrays (SoA) layouts naturally align with these requirements thanks to their contiguous, field-wise organization.
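As a minimal illustration of why field-wise layout matters, the sketch below contrasts an Array-of-Structures record with an SoA container and measures the byte distance between the same field of two neighbouring elements. The type and field names (`HitAoS`, `HitsSoA`, `detId`) are hypothetical and are not taken from CMSSW; the generated CMSSW SoA classes are considerably richer than this.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Array-of-Structures: one record per hit, fields interleaved in memory.
// (Hypothetical type, for illustration only.)
struct HitAoS {
  float x, y, z;
  int detId;
};

// Structure-of-Arrays: one contiguous array per field.
// (Hypothetical type, for illustration only.)
struct HitsSoA {
  std::vector<float> x, y, z;
  std::vector<int> detId;
  explicit HitsSoA(std::size_t n) : x(n), y(n), z(n), detId(n) {}
};

// Byte distance between field x of element 0 and field x of element 1.
inline std::ptrdiff_t strideAoS(const std::vector<HitAoS>& hits) {
  assert(hits.size() >= 2);
  return reinterpret_cast<const char*>(&hits[1].x) -
         reinterpret_cast<const char*>(&hits[0].x);
}

inline std::ptrdiff_t strideSoA(const HitsSoA& hits) {
  assert(hits.x.size() >= 2);
  return reinterpret_cast<const char*>(&hits.x[1]) -
         reinterpret_cast<const char*>(&hits.x[0]);
}
```

In the AoS case, neighbouring `x` values are a whole record apart, so GPU threads with consecutive indices that each read `x` touch strided addresses. In the SoA case the same threads touch adjacent 4-byte words, which is the access pattern that coalescing requires.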
In this work, we present a generic and extensible SoA backend based on the Boost Preprocessor library, enabling highly portable and strongly typed data representations. The new system introduces MultiView, a mechanism that groups multiple SoA collections with identical schemas into a single logical entity. This abstraction removes the need for costly data reshaping, streamlines inter-module communication, and simplifies the design of downstream algorithms. A key outcome of this design is seamless interoperability with ML frameworks like PyTorch and SOFIE: SoA structures can be directly exposed as Tensors without transformation or memory copies, enabling fast heterogeneous inference workflows where machine-learning models operate natively on CMSSW event data.
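The grouping idea can be sketched roughly as follows, under stated assumptions: `TrackView` and `TrackMultiView` are hypothetical names with a made-up two-column schema, and the code is not the CMSSW MultiView interface. It only shows how several non-owning views with the same schema might be addressed through one global index without copying or reshaping the underlying columns.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Non-owning view of one SoA collection (hypothetical two-column schema).
struct TrackView {
  const float* pt;
  const float* eta;
  std::size_t size;
};

// Hypothetical stand-in for the MultiView idea: several views sharing one
// schema, exposed as a single logical collection. No element data is copied.
class TrackMultiView {
public:
  void add(const TrackView& v) {
    views_.push_back(v);
    offsets_.push_back(total_);  // global index of this view's first element
    total_ += v.size;
  }

  std::size_t size() const { return total_; }

  float pt(std::size_t i) const { return *ptAddress(i); }

  float eta(std::size_t i) const {
    const Pos p = locate(i);
    return views_[p.collection].eta[p.local];
  }

  // Address of the underlying storage, to show that access is zero-copy.
  const float* ptAddress(std::size_t i) const {
    const Pos p = locate(i);
    return views_[p.collection].pt + p.local;
  }

private:
  struct Pos {
    std::size_t collection;
    std::size_t local;
  };

  // Map a global index to (collection, local index) with a linear scan.
  Pos locate(std::size_t i) const {
    if (i >= total_) throw std::out_of_range("TrackMultiView index");
    std::size_t c = views_.size() - 1;
    while (offsets_[c] > i) --c;  // offsets_[0] == 0, so this terminates
    return Pos{c, i - offsets_[c]};
  }

  std::vector<TrackView> views_;
  std::vector<std::size_t> offsets_;
  std::size_t total_ = 0;
};
```

Because each column is already a contiguous array described by a pointer and a length, the same information is sufficient for a tensor wrapper over existing memory; LibTorch's `torch::from_blob`, for example, builds a tensor on a caller-provided buffer without copying it. Whether the backend described here uses that particular call is not stated in the abstract.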
Beyond in-memory layout optimization, we also investigate integration of NVIDIA GPUDirect Storage (GDS) to establish a direct, high-bandwidth I/O path between GPU memory and local or remote storage. By relieving the CPU of data-movement responsibilities, GDS has the potential to reduce latency and improve performance in I/O-bound workflows, an increasingly relevant challenge as CMS moves toward HL-LHC data rates.