The 38th International Symposium on Lattice Field Theory

Name: The 38th International Symposium on Lattice Field Theory
Start: 2021-07-26T00:00:00-04:00
End: 2021-07-30T18:00:00-04:00
26–30 Jul 2021

US/Eastern timezone

Help desk

lattice2021-registration@mit.edu

D9: Use tensor cores to accelerate math intensive kernels in QUDA

28 Jul 2021, 15:00

Poster (Software development and Machines)

Jiqun Tu (NVIDIA Corporation)

We will present our recent efforts on using tensor cores, which are available on NVIDIA GPUs starting from the Volta architecture, to speed up the math-intensive kernels in QUDA. A lightweight abstraction of the CUDA PTX matrix multiply-add (MMA) instruction is added in order to stage data efficiently through the different layers of GPU memory. Specifically, the efforts include:

Use tensor cores to accelerate the fifth-dimension DWF operators in the multi-splitting preconditioned conjugate gradient algorithm, utilizing the HMMA tensor core instruction;
Use tensor cores to accelerate the dense matrix multiplications in the setup steps of multigrid;
Use tensor cores to accelerate the math-intensive multi-BLAS kernels;
Use the double-precision DMMA instruction to accelerate the contraction workflow.
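
The common thread in the items above is a fixed-shape multiply-accumulate, D = A·B + C, applied tile by tile to a larger dense product. The following is a minimal CPU-side sketch of that decomposition, not QUDA's code and not actual tensor core instructions. The names `kTile`, `mma_tile` and `tiled_gemm` are hypothetical, the 4x4x4 tile shape is chosen only for illustration, and the float inputs with a double accumulator merely stand in for the "narrow inputs, wider accumulator" pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tile edge, for illustration only. Real MMA instructions fix
// their own shapes per data type and GPU architecture.
constexpr std::size_t kTile = 4;

// One tile-level multiply-accumulate: C_tile += A_tile * B_tile.
// All tiles are row-major views into larger matrices with leading
// dimensions lda, ldb, ldc. Inputs are float, the accumulator is double.
void mma_tile(const float* a, std::size_t lda,
              const float* b, std::size_t ldb,
              double* c, std::size_t ldc) {
  for (std::size_t i = 0; i < kTile; ++i) {
    for (std::size_t j = 0; j < kTile; ++j) {
      double acc = c[i * ldc + j];
      for (std::size_t k = 0; k < kTile; ++k) {
        acc += static_cast<double>(a[i * lda + k]) *
               static_cast<double>(b[k * ldb + j]);
      }
      c[i * ldc + j] = acc;
    }
  }
}

// C (m x n) = A (m x k) * B (k x n), all row-major.
// m, n and k must be multiples of kTile in this sketch.
std::vector<double> tiled_gemm(const std::vector<float>& a,
                               const std::vector<float>& b,
                               std::size_t m, std::size_t n, std::size_t k) {
  assert(m % kTile == 0 && n % kTile == 0 && k % kTile == 0);
  assert(a.size() == m * k && b.size() == k * n);
  std::vector<double> c(m * n, 0.0);
  for (std::size_t i = 0; i < m; i += kTile) {      // tile row of C
    for (std::size_t j = 0; j < n; j += kTile) {    // tile column of C
      for (std::size_t p = 0; p < k; p += kTile) {  // reduction over tiles
        mma_tile(&a[i * k + p], k, &b[p * n + j], n, &c[i * n + j], n);
      }
    }
  }
  return c;
}
```

Calling `tiled_gemm(a, b, m, n, k)` with row-major `a` and `b` returns the row-major product. On a GPU, each `mma_tile` call would correspond to one hardware MMA operation on data already staged into registers, which is the step the abstraction described in the abstract is meant to manage.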

Jiqun Tu (NVIDIA Corporation), Evan Weinberg, Kate Clark (NVIDIA), Mathias Wagner (NVIDIA)

poster.pdf
