Speaker
A. Martin
(QUEEN MARY, UNIVERSITY OF LONDON)
Description
We describe our experience in building a cost-efficient High Throughput Cluster (HTC)
using commodity hardware and free software within a university environment.
Our HTC has a modular system architecture and is designed to be upgradable.
The current, second-phase configuration consists of 344 processors and 20 TB of
RAID storage.
To install and upgrade software rapidly, we have developed
automatic remote system installation and configuration tools that deploy standard
software configurations on individual machines. To manage machines efficiently, we
have written a custom cluster configuration database. This database is used to track
all hardware components in the cluster, the network and power distribution, and the
software configuration. Access to this database and the cluster performance and
monitoring systems is provided by a web portal, which allows efficient remote
management in our low-manpower environment.
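
As a rough illustration of the kind of record such a configuration database might hold, the following is a minimal sketch using SQLite from the Python standard library. The table names, column names, and helper functions are hypothetical and chosen for illustration only; the abstract does not describe the actual schema or implementation.

```python
import sqlite3

# Hypothetical schema: one row per machine, with its software profile,
# network attachment and power feed, plus one row per hardware component.
SCHEMA = """
CREATE TABLE machine (
    hostname         TEXT PRIMARY KEY,
    software_profile TEXT NOT NULL,
    switch_port      TEXT NOT NULL,
    pdu              TEXT NOT NULL,
    pdu_outlet       INTEGER NOT NULL
);
CREATE TABLE component (
    serial   TEXT PRIMARY KEY,
    kind     TEXT NOT NULL,
    hostname TEXT NOT NULL REFERENCES machine(hostname)
);
"""


def open_db(path=":memory:"):
    """Open a database at the given path and create the tables."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_machine(conn, hostname, profile, switch_port, pdu, outlet, components):
    """Register a machine and its components, given as (serial, kind) pairs."""
    conn.execute(
        "INSERT INTO machine VALUES (?, ?, ?, ?, ?)",
        (hostname, profile, switch_port, pdu, outlet),
    )
    conn.executemany(
        "INSERT INTO component VALUES (?, ?, ?)",
        [(serial, kind, hostname) for serial, kind in components],
    )
    conn.commit()


def machines_on_pdu(conn, pdu):
    """List the hostnames fed by one power distribution unit."""
    rows = conn.execute(
        "SELECT hostname FROM machine WHERE pdu = ? ORDER BY hostname", (pdu,)
    )
    return [row[0] for row in rows]
```

With records of this shape, a query such as machines_on_pdu(conn, "pdu-a") would answer which machines are affected before a power feed is taken down for maintenance.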
We describe the performance of our system under a mixed load of scalar and parallel
tasks and discuss possible future improvements.
Primary author
A. Martin
(QUEEN MARY, UNIVERSITY OF LONDON)