Speaker
Brian Paul Bockelman
(University of Nebraska (US))
Description
HTCondor is a well-known platform for distributed high-throughput computing and often resembles the Swiss Army knife of computing - there's a bit of something for everyone. With a user manual weighing in at about 1,100 printed pages, it's no wonder that sysadmins can overlook some of the most exciting features.
This presentation is dedicated to uncovering the hidden gems for running HTCondor as a batch system - useful features that are well-hidden, under-appreciated, or very recently added. This broad overview will cover worker node resource management, scripting, monitoring, deployment, and debugging.
Primary author
Brian Paul Bockelman
(University of Nebraska (US))