Sep 24 – 27, 2019
CERN
Europe/Zurich timezone

Optimizing Beyond Vectorization and Parallelization: a Case Study on QMCPACK

Sep 25, 2019, 9:45 AM
30m
80/1-001 - Globe of Science and Innovation - 1st Floor (CERN)


Speakers

Cédric Valensi (UVSQ / ECR)
William Jalby (UVSQ / ECR)

Description

QMCPACK, a scalable quantum Monte Carlo (QMC) package, has been highly optimized for the latest high-end microprocessors: arrays and loops have been restructured to achieve high vectorization ratios, parallelism is easily and efficiently exploited thanks to the Monte Carlo nature of the algorithm, and considerable attention has been paid to using highly tuned MKL libraries. Identifying further optimization opportunities and techniques in such a code is challenging. In this talk, we report that performance gains of around 15% can be obtained by using tools that provide non-standard views of code behavior: for example, performing a detailed assessment of code quality beyond standard vectorization, accurately analyzing the performance impact of data accesses, and automatically exploring multiple parallel configurations. This improvement translates directly into energy savings and increased productivity for QMC, which consumes a significant fraction of leadership computing resources such as ALCF's Theta KNL cluster. We also present the tools used and how they provided key insights for improving QMCPACK performance.
