Description
The IceCube Neutrino Observatory is a neutrino detector located at the South Pole. Here we present our experience using HTCondor to run IceCube’s GPU simulation workloads on the Titan supercomputer. Titan is a large supercomputer geared toward High Performance Computing (HPC). Several factors make it challenging to use Titan for IceCube’s High Throughput Computing (HTC) workloads: (1) Titan is designed for MPI applications, (2) Titan’s scheduling policies heavily favor very large resource reservations, (3) Titan compute nodes run a customized version of Linux, and (4) Titan compute nodes cannot access outside networks. In contrast, IceCube’s simulation workloads consist of large numbers of relatively small, independent jobs that are intended to run in standard Linux environments and may require connectivity to public networks. We describe how we used the HTCondor batch scheduler within Singularity containers to provide an HTC-friendly interface to Titan suitable for IceCube’s GPU workloads.