Speaker
Description
In small-scale scientific infrastructures, typically consisting of 3–7 nodes, industry-standard orchestrators such as Kubernetes often introduce an "operational gap" due to their resource-heavy control planes. Traditional overlay networks such as VXLAN add encapsulation latency and CPU overhead, which hinders data-intensive distributed scientific computing. Managing Podman containers manually through the CLI or systemd is error-prone and lacks the centralized management logic required for high availability (HA). To address these challenges, this study introduces the "IPVLAN Container Manager," a specialized orchestration framework that automates the deployment of HA containers while maintaining native-level network performance. The system was implemented on a 3-node cluster managing critical services such as dCache Frontend, HTCondor-CE, and site-BDII/APEL. Using IPVLAN in L2 or L3 mode, the framework lets containers communicate directly through the host's physical NIC, bypassing bridge-and-NAT overhead and providing the deterministic, low-latency environment that scientific workloads require. The software uses Podman Quadlet as a declarative engine that converts simplified container definitions into systemd unit files for standardized lifecycle management. For reliability, it integrates with Pacemaker and Corosync, which monitor container health and trigger automated failover across nodes. To ensure data integrity, STONITH fencing and DRBD-based replication were implemented to prevent split-brain scenarios and maintain strict consistency. This architecture enabled the successful deployment of a CMS Tier-2 site with minimal network overhead, providing a predictable environment for data-heavy research while preserving a migration path to Kubernetes.
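As an illustration of the Quadlet step described above, the following minimal sketch renders a simplified service definition into a Quadlet `.network` file for an IPVLAN network and a matching `.container` file. It is not the IPVLAN Container Manager's actual code or schema: the `IpvlanNetwork` and `ServiceDef` classes, the function names, the NIC name, the addresses, and the image reference are all hypothetical. The Quadlet keys used (`Driver=`, `Options=`, `Subnet=`, `Gateway=`, `Network=`, `IP=`) should be checked against the installed Podman version. The `[Install]` section is left out on the assumption that Pacemaker, not systemd's boot targets, starts and stops the generated unit.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class IpvlanNetwork:
    """Hypothetical description of an IPVLAN network on a host NIC."""
    name: str
    parent_nic: str
    subnet: str
    gateway: str
    mode: str = "l2"  # "l2" or "l3"


@dataclass(frozen=True)
class ServiceDef:
    """Hypothetical simplified container definition."""
    name: str
    image: str
    ip: str


def render_network_unit(net: IpvlanNetwork) -> str:
    """Render the text of a Quadlet <name>.network file."""
    if net.mode not in ("l2", "l3"):
        raise ValueError(f"unsupported IPVLAN mode: {net.mode}")
    lines = [
        "[Network]",
        f"NetworkName={net.name}",
        "Driver=ipvlan",
        f"Subnet={net.subnet}",
        f"Gateway={net.gateway}",
        # Driver options are passed through to `podman network create --opt`.
        f"Options=parent={net.parent_nic}",
        f"Options=mode={net.mode}",
    ]
    return "\n".join(lines) + "\n"


def render_container_unit(svc: ServiceDef, net: IpvlanNetwork) -> str:
    """Render the text of a Quadlet <name>.container file with a static IP."""
    if ipaddress.ip_address(svc.ip) not in ipaddress.ip_network(net.subnet):
        raise ValueError(f"{svc.ip} is outside {net.subnet}")
    lines = [
        "[Unit]",
        f"Description=HA container {svc.name}",
        "",
        "[Container]",
        f"ContainerName={svc.name}",
        f"Image={svc.image}",
        # Refers to the Quadlet network file rendered above.
        f"Network={net.name}.network",
        f"IP={svc.ip}",
        # No [Install] section: assumed to be started by the cluster manager.
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    net = IpvlanNetwork("sci-ipvlan", "eth0", "192.0.2.0/24", "192.0.2.1")
    svc = ServiceDef("htcondor-ce", "registry.example.org/htcondor-ce:latest", "192.0.2.10")
    print(render_network_unit(net))
    print(render_container_unit(svc, net))
```

In a setup like the one described, the rendered files would be placed in a Quadlet search directory (for example `/etc/containers/systemd/`) on each node, and the resulting systemd service could then be registered as a Pacemaker resource. Keeping the static address on the container definition means the service keeps the same IP regardless of which node runs it after failover.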