8–12 Sept 2025
Hamburg, Germany
Europe/Berlin timezone

Parallel reconstruction profiling on multiple Hygon GPUs for ptychography in HEPS

Not scheduled
30m
Hamburg, Germany

Poster
Track 1: Computing Technology for Physics Research
Poster session with coffee break

Speaker

Lei Wang (Institute of High Energy Physics)

Description

When porting Hepsptycho, a ptychography reconstruction program originally built on multiple Nvidia GPUs and MPI, to the Hygon DCU architecture, we found that the reconstructed object and probe were incorrect, whereas the same code produces correct results on Nvidia GPUs. We profiled the ePIE algorithm using NVIDIA Nsight Systems and Hygon's HIP-compatible profiler (Hipprof). After each batch or iteration completes, the GPUs communicate to share the object and probe information: the slave GPUs send their reconstructed results back to GPU 0 using the Reduce or AllReduce function. With the Nvidia CUDA toolkit, this communication executes successfully. On Hygon, DCU 0 encounters a memory corruption error during synchronization, likely due to race conditions when updating the object/probe buffers. We present the profiling results and describe how we fixed this bug. We also show the computational speedup obtained with other HPC techniques to improve reconstruction performance on multiple GPUs. This work is implemented within the DAISY framework of the Institute of High Energy Physics (IHEP).
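
The synchronization pattern described above can be illustrated with a minimal CPU-only sketch. It is not the Hepsptycho code and not the fix reported in this contribution, which the abstract does not detail. It assumes that each worker (standing in for one GPU/DCU) writes its update into a private buffer, and that rank 0 sums the buffers only after a barrier guarantees all writes have finished. The names `run_iteration`, `NUM_WORKERS` and `OBJECT_SIZE` are hypothetical, and Python threads stand in for MPI ranks.

```python
import threading

NUM_WORKERS = 4   # hypothetical number of devices
OBJECT_SIZE = 8   # hypothetical length of the flattened object array


def run_iteration(num_workers=NUM_WORKERS, size=OBJECT_SIZE):
    """Simulate one reduce-to-rank-0 step with explicit synchronization."""
    # One private buffer per worker, so no two workers write the same memory.
    local_updates = [[0j] * size for _ in range(num_workers)]
    reduced = [0j] * size
    barrier = threading.Barrier(num_workers)

    def worker(rank):
        # Stand-in for the per-batch object update computed on one device.
        for i in range(size):
            local_updates[rank][i] = complex(rank + 1, i)
        # Nobody reads the buffers until every worker has finished writing.
        barrier.wait()
        if rank == 0:
            # Reduce step: rank 0 sums the contributions of all workers.
            for i in range(size):
                reduced[i] = sum(local_updates[r][i] for r in range(num_workers))
        # Nobody continues until the reduced buffer is complete.
        barrier.wait()

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return reduced
```

Without the first barrier, rank 0 could start summing while another worker is still writing its buffer, which is the kind of race condition suspected in the description above.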

Significance

This work presents a distributed computing method for ptychography reconstruction running on Chinese-made GPUs.

Author

Lei Wang (Institute of High Energy Physics)

Co-authors

FU Shiyuan fusy
Dr Hao-Kai Sun (IHEP, CAS)
Jianli Liu (IHEP)
Dr Yaodong CHENG (Institute of High Energy Physics, Chinese Academy of Sciences)
Yu Hu
yangyang mu (The Institute of High Energy Physics of the Chinese Academy of Sciences)
齐法制 qifazhi
