Description
The CMS experiment requires massive computational resources to efficiently process the large amounts of detector data, and these demands are expected to grow as the LHC enters the high-luminosity era. GPUs will therefore play a crucial role in accelerating these workloads. The approach currently used in the CMS software (CMSSW) submits individual tasks to the GPU execution queues, accumulating launch overhead and consuming more CPU resources as more tasks are submitted.
CUDA and HIP graphs are a different task submission approach, in which a set of GPU operations is grouped together and connected by dependencies to form a directed task graph that can later be executed as many times as required. This provides performance advantages over submitting individual tasks: a graph execution submits all GPU operations at once, reducing the launch overhead and freeing up CPU resources for other tasks.
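As an illustration of the capture-and-replay pattern described above, the sketch below records two kernel launches into a CUDA graph and then replays the whole graph once per "event". The `scale` kernel is a hypothetical stand-in for a real reconstruction step, and the three-argument `cudaGraphInstantiate` assumes the CUDA 12 runtime API; this is a minimal sketch, not CMSSW code.

```cuda
#include <cuda_runtime.h>

// Hypothetical trivial kernel standing in for a reconstruction step.
__global__ void scale(float* data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int n = 1 << 20;
    float* d;
    cudaMalloc(&d, n * sizeof(float));

    cudaStream_t stream;
    cudaStreamCreate(&stream);

    // Capture a sequence of kernel launches into a graph instead of
    // submitting each launch to the GPU queue individually.
    cudaGraph_t graph;
    cudaStreamBeginCapture(stream, cudaStreamCaptureModeGlobal);
    scale<<<(n + 255) / 256, 256, 0, stream>>>(d, 2.0f, n);
    scale<<<(n + 255) / 256, 256, 0, stream>>>(d, 0.5f, n);
    cudaStreamEndCapture(stream, &graph);

    // Instantiate once, then launch the whole graph repeatedly:
    // one launch call per event instead of one per kernel.
    cudaGraphExec_t exec;
    cudaGraphInstantiate(&exec, graph, 0);  // CUDA 12 signature (assumption)
    for (int event = 0; event < 100; ++event)
        cudaGraphLaunch(exec, stream);
    cudaStreamSynchronize(stream);

    cudaGraphExecDestroy(exec);
    cudaGraphDestroy(graph);
    cudaStreamDestroy(stream);
    cudaFree(d);
    return 0;
}
```

HIP exposes an analogous API (`hipStreamBeginCapture`, `hipGraphLaunch`), which is what makes a portable abstraction in alpaka feasible.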
A set of realistic tests that simulate different aspects of the CMS software was developed to measure the impact of using graphs, and the results were evaluated on different NVIDIA and AMD GPUs. Based on these results, work is ongoing to implement support for task graphs in alpaka, a performance-portable parallel programming framework used in the CMS software, to ensure efficient task submission and scheduling across different hardware architectures.
Experiment context, if any: CMS experiment