In modern physics experiments, data analysis needs considerable computing capacity. The computing resources of a single site are often limited, whereas distributed computing is often inexpensive and flexible. While several large-scale grid solutions exist, for example DIRAC (Distributed Infrastructure with Remote Agent Control), few schemes are devoted to solving the problem at small scale. For cases where lightweight grid computing is more desirable, our project provides a performant solution for connecting and managing the distributed computing resources of a small group. The components are all freely available as official Debian packages.
In our project, an interior gateway routing protocol was deployed over a mesh overlay network spanning several remote sites, so that servers can communicate with each other securely and reliably. The Slurm workload manager distributed computing tasks to idle servers. A unified storage view was formed using NFS (Network File System) version 4.2. We measured the aggregate computing power for pleasingly parallel workloads and found it to be comparable to the sum of the computing power of the individual servers. As a result, the pooled computing resource is more scalable and flexible than the resources of a single site.
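The comparison between the aggregate computing power and the sum of the individual contributions can be expressed as a scaling efficiency. The following is a minimal sketch of that arithmetic only; the function name `scaling_efficiency`, the throughput unit, and all numbers are hypothetical placeholders and are not measurements from this work.

```python
def scaling_efficiency(aggregate_rate, individual_rates):
    """Return measured pooled throughput divided by the sum of per-server throughputs.

    A value close to 1.0 means the pooled system performs about as well as
    the individual servers added together.
    """
    total = sum(individual_rates)
    if total <= 0:
        raise ValueError("individual rates must sum to a positive value")
    return aggregate_rate / total


# Hypothetical placeholder values (jobs per hour), NOT results from this work.
individual_rates = [100.0, 80.0, 120.0]  # each server measured on its own
aggregate_rate = 285.0                   # all servers measured together

efficiency = scaling_efficiency(aggregate_rate, individual_rates)
print(f"scaling efficiency: {efficiency:.2f}")
```

For pleasingly parallel workloads, where tasks do not communicate with each other, an efficiency near 1.0 is what one would hope to see, since the main overheads are scheduling and shared-storage access.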