29 November 2021 to 3 December 2021
Virtual and IBS Science Culture Center, Daejeon, South Korea
Asia/Seoul timezone

Speeding up lattice QCD simulation on Arm architectures with multigrid algorithm

Contribution ID: 715
Not scheduled
20m
Raspberry (Gather.Town)

Poster
Track 3: Computations in Theoretical Physics: Techniques and Methods
Posters: Raspberry

Speaker

Dr Wei Sun (Institute of High Energy Physics, Chinese Academy of Sciences)

Description

Lattice quantum chromodynamics (lattice QCD) is a non-perturbative, first-principles formulation of QCD that can be systematically improved, and it is one of the most important high-performance computing applications in high energy physics. Lattice QCD research has benefited enormously from advances in computer hardware and algorithms, and particle physicists can now study the strong interaction at the physical pion mass, with smaller lattice spacings and larger volumes. At the same time, with the emergence of supercomputers and clusters based on Arm architectures, it is necessary to implement these algorithms and run simulations on such machines to make the most of limited computing resources. In this work, we implemented and ported the multigrid (MG) algorithm, the state-of-the-art method for solving the sparse linear Dirac equation $Dx=b$ on the lattice, to a cluster with Arm-based processors for clover fermions. We then use the MG algorithm in the distillation method to compute perambulators and obtain a ~3.6x speedup over the conventional BiCGStab algorithm on a gauge ensemble with lattice volume $16^3\times 128$ and pion mass $m_\pi=350$ MeV. With this development, we can now use the Arm cluster more efficiently, and we expect further speedup as the pion mass approaches its physical value.
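For readers unfamiliar with distillation, the link between perambulators and the linear solves $Dx=b$ can be written out explicitly. The block below uses the standard definition from the distillation literature; the notation ($v_j(t)$, $e_\beta$, $\tau$, $N_v$, $N_t$) is an assumption introduced here for illustration and does not appear in the abstract itself.

```latex
% Assumed notation (not from the abstract):
%   v_j(t)   : j-th eigenvector of the gauge-covariant Laplacian on time slice t
%   e_\beta  : unit vector in spin space, \beta = 1,...,4
%   D        : clover Dirac operator
\begin{align}
  D\, x^{(j,\beta,t)} &= b^{(j,\beta,t)},
  \qquad b^{(j,\beta,t)} = v_j(t) \otimes e_\beta , \\
  \tau^{ij}_{\alpha\beta}(t',t)
    &= v_i^\dagger(t')\, x^{(j,\beta,t)}_{\alpha}(t')
     = v_i^\dagger(t')\, \bigl[D^{-1}\bigr]_{\alpha\beta}(t',t)\, v_j(t) .
\end{align}
% One solve is needed per source index (j, \beta, t), i.e. N_v \times 4 \times N_t
% right-hand sides per gauge configuration, all with the same operator D.
```

Because every right-hand side shares the same operator $D$ on a given configuration, any solver setup that depends only on $D$ can be reused for all of them.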

Significance

The distillation method is one of the most important methods for computing disconnected diagrams and is widely used in lattice QCD, but it requires a large amount of computing time. For a $16^3\times 128$ lattice with $m_\pi=350$ MeV and 70 Laplace eigenvectors, one needs to solve the sparse Dirac equation $Dx=b$ $128\times 4\times 70 = 35840$ times on every gauge configuration (one solve per time slice, spin component, and eigenvector). This is very time-consuming with conventional BiCGStab-like algorithms, but it is well suited to the multigrid algorithm, because the same setup subspace can be reused across all of these solves. We obtain a ~3.6x speedup over BiCGStab, even including the MG setup time. This significantly reduces the simulation time on Arm-based processors; we expect larger gains as we approach the physical pion mass, which will in turn accelerate physics research.
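The solve count and the effect of amortizing the MG setup over many right-hand sides can be sketched with a simple cost model. This is an illustration only: the function names and all timing values below are hypothetical placeholders, not measurements from this work, and the model assumes a constant per-solve cost for each algorithm.

```python
def num_solves(n_t: int, n_spin: int, n_vec: int) -> int:
    """Number of Dirac solves per gauge configuration in distillation:
    one per source time slice, spin component and Laplace eigenvector."""
    return n_t * n_spin * n_vec


def amortized_speedup(n_solves: int, t_bicgstab: float,
                      t_mg_solve: float, t_mg_setup: float) -> float:
    """Speedup of MG over BiCGStab when the one-off MG setup cost is
    shared by n_solves right-hand sides (simple cost model, an assumption)."""
    total_bicgstab = n_solves * t_bicgstab
    total_mg = t_mg_setup + n_solves * t_mg_solve
    return total_bicgstab / total_mg


# Lattice parameters quoted in the text: 128 time slices, 4 spins, 70 eigenvectors.
n = num_solves(n_t=128, n_spin=4, n_vec=70)
print("solves per configuration:", n)

# Hypothetical timings in arbitrary units (NOT measured values).
T_BICGSTAB, T_MG_SOLVE, T_MG_SETUP = 1.0, 0.5, 100.0

for count in (1, 100, n):
    s = amortized_speedup(count, T_BICGSTAB, T_MG_SOLVE, T_MG_SETUP)
    print(f"{count:6d} solves -> speedup {s:.3f}")
```

In this model the speedup approaches the per-solve ratio `t_bicgstab / t_mg_solve` as the number of solves grows, which is why a workload with tens of thousands of right-hand sides per configuration can absorb the setup cost.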

Speaker time zone: Compatible with Asia

Primary author

Dr Wei Sun (Institute of High Energy Physics, Chinese Academy of Sciences)

Co-author

Dr Yujiang Bi (Institute of High Energy Physics, Chinese Academy of Sciences)
