25–29 May 2026
Chulalongkorn University
Asia/Bangkok timezone

Distributed Performance Testing of High-Speed Scientific Networks in Preparation for Exabyte Scale Workflows

27 May 2026, 16:51
18m
Chulalongkorn University

Oral Presentation Track 7 - Computing infrastructure and sustainability

Speaker

Lael Verace (University of Wisconsin-Madison (US))

Description

The next generation of scientific experiments, particularly in high energy and nuclear physics, will produce unprecedented data volumes, pushing scientific computing infrastructures to rely on terabit-scale networks for rapid, reliable data movement between globally distributed facilities. In parallel, advances in artificial intelligence continue to increase data transfer requirements between sites. Evaluating whether these networks can meet future demands requires performance measurements that go beyond single-host tests. While existing tools effectively measure maximum throughput between individual servers, they cannot assess end-to-end performance across entire computing centers, which depends on coordinated, parallel traffic from multiple nodes.

To address this limitation, we present a distributed load-generation method designed for large-scale, site-level network evaluation. Our approach dynamically deploys lightweight software containers across multiple computing nodes, each capable of generating, coordinating, and monitoring high volumes of synthetic network traffic based on user-specified parameters. This enables realistic, scalable testing that reflects the distributed data movement patterns expected across a wide range of upcoming scientific projects.
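As a rough illustration only, the idea of turning user-specified parameters into per-node traffic assignments can be sketched as below. The abstract does not describe the framework's interfaces, so every name here (`FlowPlan`, `plan_load`, `aggregate_gbps`, the `worker` node names, and the 400 Gb/s target) is hypothetical, and the even split across nodes is an assumption rather than the authors' scheduling policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowPlan:
    node: str          # hypothetical worker node name
    flows: int         # parallel flows the node's container should open
    rate_gbps: float   # target sending rate assigned to this node


def plan_load(nodes: list[str], total_gbps: float, flows_per_node: int) -> list[FlowPlan]:
    """Split a site-level target rate evenly across worker nodes (assumed policy)."""
    if not nodes:
        raise ValueError("at least one node is required")
    if total_gbps <= 0 or flows_per_node <= 0:
        raise ValueError("rate and flow count must be positive")
    per_node = total_gbps / len(nodes)
    return [FlowPlan(node, flows_per_node, per_node) for node in nodes]


def aggregate_gbps(measured: dict[str, float]) -> float:
    """Sum per-node measured rates into a single site-level figure."""
    return sum(measured.values())


# Hypothetical test: 8 nodes sharing a 400 Gb/s site-level target.
plan = plan_load([f"worker{i:02d}" for i in range(8)], total_gbps=400.0, flows_per_node=4)
for p in plan:
    print(f"{p.node}: {p.flows} flows at {p.rate_gbps:.1f} Gb/s")
```

In a sketch like this, each plan entry would be handed to one container, and the per-node rates reported back would be summed with `aggregate_gbps` to give the site-level throughput that a single-host test cannot show.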

We outline the system architecture, orchestration workflow, and mechanisms that allow the framework to scale to large numbers of parallel traffic sources. We demonstrate that this method characterizes inter-site network performance more accurately than traditional single-host tools. Preliminary results using USCMS computing sites as a test case highlight its effectiveness in revealing bandwidth limitations, parallel-flow behavior, and other wide-area network characteristics that are not observable with existing measurement approaches.

Authors

Lael Verace (University of Wisconsin-Madison (US))
Syed Asif Shah (Fermilab (FNAL))
Oliver Gutsche (Fermi National Accelerator Lab (US))
Philip Demar (Fermi National Accelerator Lab (US))
Prof. Kevin Black (University of Wisconsin-Madison)
