Speaker
Meifeng Lin
(Brookhaven National Laboratory (US))
Description
OpenMP has long been the programming model of choice for shared-memory parallelism on multi-/many-core CPUs. Recent additions to the OpenMP standard also support offloading computations to accelerators such as GPUs. In principle, this allows a single code base written with OpenMP directives to run on both CPU-only and CPU+GPU platforms. We evaluate the OpenMP offloading features in the context of GridMini, a set of mini-benchmarks based on the Grid C++ lattice QCD library. We will discuss our experience porting GridMini to NVIDIA, AMD, and Intel GPUs using OpenMP. Preliminary benchmark results will also be presented.
Author
Meifeng Lin
(Brookhaven National Laboratory (US))