Description
Efficient wide-area data transfers are vital for LHC and multi-site scientific workflows, but host-level configuration (network, storage, and CPU/memory resources) often constrains end-to-end performance. We present the results of a WLCG mini-capability challenge focused on host optimization using modern systems (RHEL 9, 25+ Gbps NICs, NVMe/SSD storage) across eight ATLAS and CMS production sites: FNAL, UCSD, UNL, BNL, AGLT2, MWT2, NET2, and Vanderbilt. Our approach integrates ESnet Fasterdata best practices for network tuning (TCP buffers, packet pacing, NIC offloads, ring buffers), storage optimization (I/O scheduler, queue depth, NUMA affinity), and automated state management using a new script, fasterdata-tuning.sh (JSON-based save/diff/restore).
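The JSON-based save/diff idea can be illustrated with a minimal Python sketch. This is not the fasterdata-tuning.sh implementation, which is a shell script whose internals are not described here. The function names (read_sysctl, save_state, diff_states, dump_state) and the key list are hypothetical and chosen only for illustration; the sysctl names themselves are standard Linux kernel parameters.

```python
import json
from pathlib import Path

# Illustrative key list (assumption): standard Linux sysctls related to TCP
# buffers and pacing, not necessarily the set handled by fasterdata-tuning.sh.
KEYS = [
    "net.core.rmem_max",
    "net.core.wmem_max",
    "net.ipv4.tcp_rmem",
    "net.ipv4.tcp_wmem",
    "net.core.default_qdisc",
]


def read_sysctl(key, root="/proc/sys"):
    """Read one sysctl value from procfs; return None if it is unavailable."""
    path = Path(root) / key.replace(".", "/")
    try:
        # Collapse tabs/newlines so multi-field values compare cleanly.
        return " ".join(path.read_text().split())
    except OSError:
        return None


def save_state(keys=KEYS, reader=read_sysctl):
    """Capture the current value of each key as a plain dict."""
    return {key: reader(key) for key in keys}


def diff_states(saved, current):
    """Return only the keys whose values differ between two snapshots."""
    changed = {}
    for key in sorted(set(saved) | set(current)):
        if saved.get(key) != current.get(key):
            changed[key] = {"saved": saved.get(key), "current": current.get(key)}
    return changed


def dump_state(state):
    """Serialize a snapshot as stable, human-readable JSON."""
    return json.dumps(state, indent=2, sort_keys=True)
```

A snapshot taken before tuning can be written out with dump_state(save_state()) and later compared against a fresh snapshot with diff_states. A restore step would write the saved values back, which requires root privileges and is deliberately left out of this sketch.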
We conduct controlled baseline and tuned experiments, including synchronized global configuration sweeps, using representative transfer protocols (XRootD, HTTPS), diagnostic tools (perfSONAR, iperf3), and storage benchmarks (fio). Key metrics include transfer throughput, completion time, host CPU utilization, %iowait, and error/retransmit rates, analyzed with statistical confidence. The study prioritizes reproducibility and evaluates operational compatibility with dCache, XRootD, EOS, and ongoing production workloads.
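One way to attach statistical confidence to a repeated measurement such as throughput is a confidence interval on the mean. The sketch below assumes a normal approximation and is not the study's actual analysis method, which is not specified here; the function name mean_ci is hypothetical. For a small number of runs, a Student t interval would be more appropriate.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_ci(samples, confidence=0.95):
    """Return (mean, lower, upper) using a normal-approximation interval.

    Assumption: samples are independent repeated measurements of the same
    quantity, for example throughput in Gbps from repeated transfer runs.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to estimate spread")
    m = mean(samples)
    # Two-sided critical value, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * stdev(samples) / sqrt(n)
    return m, m - half_width, m + half_width
```

Calling mean_ci separately on baseline and tuned runs gives two intervals; clearly separated intervals suggest that a tuning effect is larger than run-to-run noise, although a formal comparison would use a two-sample test.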
Outcomes will provide concrete, site-level tuning recommendations and an objective assessment of host optimization as a potential WLCG best practice. Results, methodology, site experiences, and deployment advice will be presented at CHEP 2026.