28th Conference on Computing in High Energy and Nuclear Physics (CHEP 2026)

Name: 28th Conference on Computing in High Energy and Nuclear Physics (CHEP 2026)
Start: 2026-05-25T08:00:00+07:00
End: 2026-05-29T14:00:00+07:00
Location: Chulalongkorn University

25–29 May 2026

Asia/Bangkok timezone

IPVLAN Container Manager: A High-Performance, Fault-Tolerant Orchestration Framework for Distributed Scientific Computing

Not scheduled

18m

Contribution type: Poster Presentation
Track: Track 7 - Computing infrastructure and sustainability (Poster)

Speaker: Dr Geonmo Ryu (Korea Institute of Science & Technology Information (KR))

In small-scale scientific infrastructures, typically consisting of 3–7 nodes, industry-standard orchestrators such as Kubernetes often introduce an "operational gap" because their control planes are resource-heavy relative to the size of the cluster. Furthermore, traditional overlay networks such as VXLAN introduce significant latency and CPU overhead, which hinders the performance of data-intensive distributed scientific computing. Managing Podman containers manually via the CLI or systemd is frequently error-prone and lacks the centralized management logic required for high availability (HA). To address these challenges, this study introduces the "IPVLAN Container Manager," a specialized orchestration framework designed to automate the deployment of HA containers while maintaining native-level network performance. The system was implemented on a 3-node cluster that manages critical services such as dCache Frontend, HTCondor-CE, and site-BDII/APEL. By using IPVLAN in L2 or L3 mode, the framework allows containers to communicate directly through the host's physical NIC, bypassing bridge-and-NAT overhead to provide the deterministic, low-latency environment that scientific workloads require. The software uses Podman Quadlet as a declarative engine to convert simplified container definitions into systemd unit files for standardized lifecycle management. For reliability, it integrates with Pacemaker and Corosync to monitor container health and trigger automated failover across nodes. To ensure data integrity, STONITH fencing and DRBD-based replication were implemented to prevent split-brain scenarios and maintain strict consistency. This architecture enabled the successful deployment of a CMS Tier-2 site with minimal network overhead, providing a highly predictable environment for data-heavy research while maintaining a migration path for a future transition to Kubernetes.
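The step from a simplified container definition to Quadlet files can be illustrated with a minimal sketch. It is not the framework's actual code: the `ContainerDef` fields, the function names, the network name `sci-ipvlan`, the NIC name `eth0`, the image name, and the addresses (taken from the 192.0.2.0/24 documentation range) are all assumptions made for illustration. The Quadlet keys used here (`Driver=`, `Options=`, `Subnet=`, `Gateway=`, `Network=`, `IP=`) should be checked against the `podman-systemd.unit` man page of the installed Podman version.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class ContainerDef:
    """Hypothetical simplified container definition (field names are illustrative)."""
    name: str
    image: str
    ip: str
    network: str = "sci-ipvlan"


def render_network_unit(name, parent, subnet, gateway=None, mode="l2"):
    """Render the text of a Quadlet .network file for an IPVLAN network."""
    if mode not in ("l2", "l3"):
        raise ValueError(f"unsupported IPVLAN mode: {mode}")
    ipaddress.ip_network(subnet)  # raises ValueError on a malformed subnet
    lines = [
        "[Network]",
        f"NetworkName={name}",
        "Driver=ipvlan",
        f"Options=parent={parent}",  # host NIC the containers attach to
        f"Options=mode={mode}",
        f"Subnet={subnet}",
    ]
    if gateway is not None:
        lines.append(f"Gateway={gateway}")
    return "\n".join(lines) + "\n"


def render_container_unit(defn, subnet):
    """Render the text of a Quadlet .container file with a static IPVLAN address."""
    if ipaddress.ip_address(defn.ip) not in ipaddress.ip_network(subnet):
        raise ValueError(f"{defn.ip} is outside {subnet}")
    lines = [
        "[Unit]",
        f"Description={defn.name} container",
        "",
        "[Container]",
        f"ContainerName={defn.name}",
        f"Image={defn.image}",
        f"Network={defn.network}.network",
        f"IP={defn.ip}",
        "",
        # Assumption: no [Install] section, so that a cluster manager rather
        # than systemd's boot-time targets decides where the unit is started.
        "[Service]",
        "Restart=on-failure",
    ]
    return "\n".join(lines) + "\n"


SUBNET = "192.0.2.0/24"  # documentation range, not a real site subnet
ce = ContainerDef(
    name="htcondor-ce",
    image="registry.example.org/htcondor-ce:latest",  # hypothetical image
    ip="192.0.2.21",
)
net_unit = render_network_unit("sci-ipvlan", "eth0", SUBNET, gateway="192.0.2.1")
ctr_unit = render_container_unit(ce, SUBNET)
print(net_unit)
print(ctr_unit)
```

Saved as `sci-ipvlan.network` and `htcondor-ce.container` in a Quadlet search directory, files like these would be turned into systemd services by the Quadlet generator on `systemctl daemon-reload`. How the resulting service is registered with Pacemaker is outside this sketch.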

Author: Dr Geonmo Ryu (Korea Institute of Science & Technology Information (KR))

Presentation materials:

IPVLANContainerManagerForSmallCluster.pdf

IPVLANContainerManagerForSmallCluster.pptx
