Sep 24 – 27, 2019
CERN
Europe/Zurich timezone

Applying Vectorization to Lattice QCD Calculations

Sep 25, 2019, 11:15 AM
15m
80/1-001 - Globe of Science and Innovation - 1st Floor (CERN)

Speakers

Shun Xu (Computer Network Information Center, Chinese Academy of Sciences)
Zhong Jin (Computer Network Information Center, Chinese Academy of Sciences)

Description

Lattice QCD is a fundamental non-perturbative approach to solving quantum chromodynamics (QCD), the theory of quarks and gluons. QCD is solved as a lattice gauge theory formulated on a grid, or lattice, of points in space and time. The SU(3) operations and the D-slash operator in high dimensions are typical data-intensive tasks. In recent years the SIMD capabilities of Intel processors have improved greatly, and wide AVX-512 SIMD units in particular are now readily available. Although SIMD parallelism has been studied and applied to lattice QCD, two basic problems have not been solved well. The first is that vectorization depends strongly on the byte length of the SIMD implementation, which leads to poor portability. The second is the question of which data-parallel algorithm is optimal for lattice QCD applications.
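For concreteness, the basic SU(3) operation mentioned above can be written out as a gauge-link matrix acting on a color vector at one lattice site. This is the standard textbook form, not a formula taken from the talk; the symbols U, ψ and χ are notation assumed here for illustration.

```latex
\chi_a(x) = \sum_{b=1}^{3} U_{ab}(x)\,\psi_b(x), \qquad a = 1, 2, 3,
\qquad U(x) \in \mathrm{SU}(3),\ \psi(x) \in \mathbb{C}^3 .
% Cost per site: 9 complex multiplications and 6 complex additions,
% i.e. 9 \times 6 + 6 \times 2 = 66 real floating-point operations.
```

The same small kernel is repeated independently at every site, which is what makes it a natural candidate for SIMD lanes.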

In this work, we have studied the SIMD speedup of data-parallel computation for lattice QCD applications and present a unified vectorization model. The goal is to improve computational performance without loss of portability. We also discuss potential data parallelism in lattice QCD calculations. The tests were run on Intel processors, including Intel KNL, the Intel Xeon Gold Skylake processor, and the current Intel Xeon Gold Cascade Lake processor. The parallel efficiency measured in the tests agrees well with the theoretical expectation that performance improves as the SIMD byte length increases. This work also draws a comparison with the SIMD optimization of lattice QCD on the TaihuLight supercomputer, which ranked first in the Top500 list from June 2016 until November 2017. The talk will report the related experimental results and theoretical analysis.
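The abstract does not spell out the unified vectorization model, so the following is only a minimal sketch of one common way to keep SIMD code independent of the vector width: the lane count W is a template parameter, and each lane holds the same color component from a different lattice site. All names here (Lanes, CLanes, set_lane, cmadd, su3_mul) are hypothetical and chosen for illustration; the sketch relies on compiler auto-vectorization of fixed-length loops rather than on explicit intrinsics.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// W is the number of lattice sites processed together (the SIMD lane count).
// Illustrative choices: 4 doubles fill a 256-bit register, 8 fill a 512-bit one.
template <std::size_t W>
struct Lanes {
    std::array<double, W> v{};  // zero-initialized lanes
};

// A complex value per lane, stored as separate real and imaginary lane arrays.
template <std::size_t W>
struct CLanes {
    Lanes<W> re, im;
};

// Write one complex number into a single lane (i.e. for a single site).
template <std::size_t W>
void set_lane(CLanes<W>& z, std::size_t lane, double re, double im) {
    assert(lane < W);
    z.re.v[lane] = re;
    z.im.v[lane] = im;
}

// acc += a * b in every lane. The trip count W is a compile-time constant,
// so an optimizing compiler can map this loop onto SIMD instructions.
template <std::size_t W>
inline void cmadd(CLanes<W>& acc, const CLanes<W>& a, const CLanes<W>& b) {
    for (std::size_t l = 0; l < W; ++l) {
        acc.re.v[l] += a.re.v[l] * b.re.v[l] - a.im.v[l] * b.im.v[l];
        acc.im.v[l] += a.re.v[l] * b.im.v[l] + a.im.v[l] * b.re.v[l];
    }
}

template <std::size_t W>
using ColorVec = std::array<CLanes<W>, 3>;

template <std::size_t W>
using SU3 = std::array<std::array<CLanes<W>, 3>, 3>;

// chi = U * psi for W sites at once: a 3x3 complex matrix times a 3-vector.
template <std::size_t W>
ColorVec<W> su3_mul(const SU3<W>& U, const ColorVec<W>& psi) {
    ColorVec<W> chi{};  // all lanes start at zero
    for (std::size_t a = 0; a < 3; ++a) {
        for (std::size_t b = 0; b < 3; ++b) {
            cmadd(chi[a], U[a][b], psi[b]);
        }
    }
    return chi;
}
```

Under this layout, moving from a 256-bit to a 512-bit target means instantiating `su3_mul<8>` instead of `su3_mul<4>` with the kernel source unchanged. Whether the compiler actually emits the wider instructions depends on the optimization and target flags used, which is an assumption of this sketch rather than a guarantee.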
