28th Conference on Computing in High Energy and Nuclear Physics (CHEP 2026)

Name: 28th Conference on Computing in High Energy and Nuclear Physics (CHEP 2026)
Start: 2026-05-25T08:00:00+07:00
End: 2026-05-29T14:00:00+07:00
Location: Chulalongkorn University

25–29 May 2026

Asia/Bangkok timezone

ODGDFS: A High-Performance User-Space Object Storage Engine for HEP Data Challenges

27 May 2026, 13:45

18m

Oral Presentation, Track 1 - Data and metadata organization, management and access

Speaker: zhuo meng (Institute of High Energy Physics)

High Energy Physics (HEP) currently faces increasingly severe data storage challenges. Next-generation particle collider experiments are expected to generate unprecedented data volumes and acquisition rates, demanding sustained I/O with sub-millisecond latency and PB/s-level throughput. Traditional kernel-based file systems, burdened by context switching, interrupt handling, and heavy metadata overhead, struggle to exploit the full performance of emerging NVMe SSD hardware and become a critical bottleneck in experimental data processing and analysis pipelines.

To address this, we present ODGDFS, a user-space object storage engine optimized for HEP data access patterns, originating from the JwanFS project at IHEP. Built upon the SPDK Blobstore, the system minimizes software stack overhead by completely bypassing the OS kernel and employing lock-less polling I/O with a lightweight metadata architecture. Its core innovations include:

Flat Metadata Organization: A custom superblock-backed metadata scheme uses in-memory hash indexing to locate objects in O(1) time, eliminating the multi-level directory lookups of traditional file systems;

Zero-Copy Tail Cache: A tail cache aggregation mechanism optimizes the small asynchronous writes common in HEP experiments, significantly reducing write amplification while boosting sequential write throughput;

Stream Decoupling & Lazy Loading: Data and index streams are logically decoupled and loaded lazily, keeping memory and CPU utilization efficient while supporting thousands of concurrent data volumes.
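
As an illustration of the first two ideas, the following is a minimal Python sketch and not the ODGDFS implementation. It models a flat hash index that maps an object name directly to an extent, and a tail buffer that aggregates small appends into block-sized device writes. The names `ToyObjectStore`, `Extent`, `BLOCK_SIZE` and `device_writes` are hypothetical, the 4096-byte block size is an assumption, and the "device" is an in-memory byte array rather than an SPDK blob. The sketch copies data freely, so it does not model the zero-copy aspect, polling I/O, or stream decoupling.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 4096  # assumed device write granularity in bytes (hypothetical)


@dataclass
class Extent:
    """Location of one object in the logical byte stream."""
    offset: int
    length: int


@dataclass
class ToyObjectStore:
    """Toy model: flat hash index plus a tail buffer for small appends."""
    device: bytearray = field(default_factory=bytearray)  # stands in for the blob
    index: dict = field(default_factory=dict)             # object name -> Extent
    tail: bytearray = field(default_factory=bytearray)    # bytes not yet flushed
    device_writes: int = 0                                 # device writes issued

    def put(self, name: str, payload: bytes) -> None:
        # The logical offset counts flushed bytes and bytes waiting in the tail.
        offset = len(self.device) + len(self.tail)
        self.index[name] = Extent(offset, len(payload))
        self.tail += payload
        # Flush only whole blocks; the partial remainder stays buffered.
        full = len(self.tail) // BLOCK_SIZE * BLOCK_SIZE
        if full:
            self._write(self.tail[:full])
            del self.tail[:full]

    def flush(self) -> None:
        """Force out the partial tail, for example when closing a volume."""
        if self.tail:
            self._write(self.tail)
            self.tail.clear()

    def get(self, name: str) -> bytes:
        ext = self.index[name]  # one hash lookup, no directory traversal
        end = ext.offset + ext.length
        flushed = len(self.device)
        data = b""
        if ext.offset < flushed:
            # Part (or all) of the object has already reached the device.
            data = bytes(self.device[ext.offset:min(end, flushed)])
        if end > flushed:
            # The rest is still sitting in the tail buffer.
            start = max(ext.offset, flushed) - flushed
            data += bytes(self.tail[start:end - flushed])
        return data

    def _write(self, chunk) -> None:
        self.device += chunk
        self.device_writes += 1
```

In this toy model, storing one hundred 100-byte objects issues two block-aligned device writes instead of one hundred small ones, leaves the remaining bytes in the tail until `flush()` is called, and still serves every object through a single dictionary lookup, including an object that straddles the boundary between flushed data and the tail.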

Preliminary benchmarks based on rigorous stress testing have confirmed the system's stability and correctness under high-concurrency simulated workloads. We expect that, on typical HEP workloads, ODGDFS will deliver significant improvements in I/O throughput with stable low latency, providing a scalable and efficient storage solution for managing massive datasets in future large-scale experimental data centers.

Author: zhuo meng (Institute of High Energy Physics)

Co-authors: LI Haibo lihaibo; Dr Yaodong CHENG (Institute of High Energy Physics, Chinese Academy of Sciences); Yuanming Tang (IHEP); Dr Yujiang BI (Institute of High Energy Physics, Chinese Academy of Sciences); 隗立畅 weilc (IHEP)
