Description
The ability to strong-scale is crucial for Lattice QCD simulations, and it has been a key development goal since the creation of the QUDA library for Lattice QCD on NVIDIA GPUs. Technologies such as NVLink and GPUDirect RDMA enable fast intra-node and inter-node data transfer, and QUDA makes extensive use of both. However, API overheads and the necessary synchronizations between GPU and CPU increasingly limit strong scaling with MPI communication. Fine-grained, GPU-centric communication provides a way out: it removes these bottlenecks entirely by moving communication into the GPU kernels themselves. We will discuss the techniques QUDA implements to achieve the best possible scaling with MPI, as well as novel improvements that use NVSHMEM for GPU-centric communication. Finally, we will show scaling results on x86 and POWER systems.
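
To illustrate the idea of GPU-centric communication, the following is a minimal, hypothetical sketch (not QUDA's actual implementation) of a one-dimensional halo exchange in which the GPU kernel itself pushes boundary data into a neighboring process's buffer via NVSHMEM, so no host-side MPI call or CPU/GPU synchronization is needed between the compute kernel and the data transfer. Buffer names and sizes are illustrative assumptions.

    // Hypothetical NVSHMEM halo-exchange sketch (CUDA C++), not QUDA code.
    #include <cuda_runtime.h>
    #include <nvshmem.h>
    #include <nvshmemx.h>

    // Each thread writes one boundary element directly into the halo
    // region of the neighboring PE's symmetric buffer.
    __global__ void push_halo(float *field, int local_sites, int halo_sites,
                              int neighbor_pe) {
      int i = blockIdx.x * blockDim.x + threadIdx.x;
      if (i < halo_sites) {
        nvshmem_float_p(field + local_sites + i,             // remote halo slot i
                        field[local_sites - halo_sites + i], // local boundary value
                        neighbor_pe);
      }
    }

    int main() {
      nvshmem_init();
      int mype = nvshmem_my_pe();
      int npes = nvshmem_n_pes();
      int neighbor = (mype + 1) % npes;  // simple ring of PEs (illustrative)

      const int local_sites = 1 << 20, halo_sites = 1024;
      // Symmetric allocation: same size on every PE, remotely accessible.
      float *field =
          (float *)nvshmem_malloc((local_sites + halo_sites) * sizeof(float));

      cudaStream_t stream;
      cudaStreamCreate(&stream);

      push_halo<<<(halo_sites + 255) / 256, 256, 0, stream>>>(
          field, local_sites, halo_sites, neighbor);

      // Complete all device-initiated puts and synchronize all PEs on the
      // stream before the next kernel consumes the halo -- still no host MPI.
      nvshmemx_barrier_all_on_stream(stream);
      cudaStreamSynchronize(stream);

      nvshmem_free(field);
      nvshmem_finalize();
      return 0;
    }

In a host-driven MPI variant, the boundary data would first be packed by a kernel, the CPU would wait for that kernel, post MPI_Isend/MPI_Irecv, wait again, and then launch the consumer kernel; the GPU-initiated put above collapses those steps into the kernel itself, which is the bottleneck removal the talk refers to.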