Speaker
Mr
Laurence Dawson
(Vanderbilt University)
Description
Introducing changes to a working high-performance computing environment is typically
both necessary and risky. Testing these changes can be highly labor-intensive.
L-TEST supplies a framework that allows the testing of complex distributed systems
with reduced configuration. It reduces setting up a test to implementing the specific
tasks for that test. L-TEST handles three jobs that must be performed for any
distributed test: task communication to move tasks to execution nodes, generation of
reproducible stochastic distributions of tasks, and collection of test results. Tasks
are communicated via a dynamic and configurable set of storage systems. These storage
systems can be reused for result collection, or a parallel set of systems may be set
up for the results. The task generation framework supplies a basic set of stochastic
generators along with framework code for calling these generators. The full workload
of tasks is generated by aggregating multiple generator instances, in order to allow
complex configuration of tasks. Although L-TEST does not restrict the tester to the
following cases, this paper identifies several use cases that are of particular
interest. The development of the L-STORE distributed file-system required testing for
both correctness and performance. This paper describes how L-TEST was used to test
both. Read and write performance data and integrity data were reported to separate
communicators and analyzed separately. The performance configuration of L-TEST was
also utilized, almost unchanged, to test a parallel file-system introduced to the
ACCRE parallel cluster. In addition to testing the performance and integrity of
file-systems, we describe how L-TEST can test the effect of planned changes on
several characteristics of a cluster supercomputer; these include network bandwidth
and latency, and the task scheduling system for submission of jobs to the cluster.
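
The task generation idea described above, reproducible stochastic generators whose outputs are aggregated into one workload, can be illustrated with a minimal sketch. This is not L-TEST's actual code or API: the names Task, poisson_tasks, build_workload, and example_workload, the exponential inter-arrival and size distributions, and all rates and sizes are assumptions chosen only for illustration. Each generator instance owns a seeded random stream, so the same seeds always yield the same workload.

```python
import heapq
import random
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass(frozen=True, order=True)
class Task:
    """Hypothetical task record; ordering is by start_time first."""
    start_time: float  # seconds from the start of the test
    kind: str          # e.g. "read" or "write"
    size_bytes: int


def poisson_tasks(kind: str, rate_per_s: float, mean_size: int,
                  seed: int, duration_s: float) -> Iterator[Task]:
    """Yield tasks with exponential inter-arrival times and sizes.

    A private random.Random(seed) makes the stream reproducible and
    independent of any other generator instance.
    """
    rng = random.Random(seed)
    t = 0.0
    while True:
        t += rng.expovariate(rate_per_s)
        if t >= duration_s:
            return
        size = max(1, int(rng.expovariate(1.0 / mean_size)))
        yield Task(t, kind, size)


def build_workload(generators: Iterable[Iterator[Task]]) -> List[Task]:
    """Aggregate several time-ordered generators into one ordered workload."""
    return list(heapq.merge(*generators))


def example_workload(base_seed: int = 42) -> List[Task]:
    """Combine a frequent small-read stream with a rarer large-write stream."""
    readers = poisson_tasks("read", 5.0, 1_000_000, base_seed, 10.0)
    writers = poisson_tasks("write", 1.0, 4_000_000, base_seed + 1, 10.0)
    return build_workload([readers, writers])
```

Calling example_workload twice with the same base seed returns identical task lists, which is the property that lets a test run be repeated after a configuration change and compared against an earlier run.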
Primary author
Mr
Laurence Dawson
(Vanderbilt University)
Co-authors
Prof.
Alan Tackett
(Vanderbilt University)
Prof.
Paul Sheldon
(Vanderbilt University)