Recent progress in parallel hardware architectures, with deeper
vector pipelines and many-core technologies, brings opportunities for
HEP experiments to take advantage of SIMD and SIMT computing models.
Launched in 2013, the GeantV project studies performance gains in
propagating multiple particles in parallel, improving instruction
throughput and data locality in HEP event simulation.
One of the challenges in developing highly parallel and efficient detector
simulation is minimizing the number of conditional branches, and hence
thread divergence, during the particle transportation process.
Due to the complexity of geometry description and physics algorithms
of a typical HEP application, performance analysis is indispensable
in identifying factors limiting parallel execution.
In this report, we will present design considerations and computing
performance of GeantV physics models on coprocessors (Intel Xeon Phi
and NVIDIA GPUs) as well as on mainstream CPUs.
As the characteristics of these platforms are very different, it is
essential to collect profiling data with a variety of tools and to
analyze hardware-specific metrics and their derivatives in order
to evaluate and tune the performance.
We will also show how the performance of parallelized physics models
factorizes from the rest of GeantV event simulation.
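
To make the branch-minimization challenge concrete, the following is a minimal
C++ sketch, not GeantV code. It assumes a hypothetical structure-of-arrays track
container (`TrackSoA`) and a made-up rule that shortens the step of low-energy
tracks; the names `limit_step_branchy`, `limit_step_masked`, `cut`, and
`short_step` are illustrative only. The first loop takes a data-dependent branch
per track, while the second computes a 0/1 mask and blends the two outcomes
arithmetically, a form that compilers can typically vectorize and that avoids
divergent control flow.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical structure-of-arrays track container (an assumption for this
// sketch, not the GeantV data model): one contiguous array per attribute.
struct TrackSoA {
    std::vector<double> energy;  // kinetic energy of each track
    std::vector<double> step;    // proposed step length of each track
};

// Branchy version: each track takes a data-dependent branch, which can
// hinder SIMD vectorization and causes thread divergence on SIMT hardware.
void limit_step_branchy(TrackSoA& t, double cut, double short_step) {
    for (std::size_t i = 0; i < t.energy.size(); ++i) {
        if (t.energy[i] < cut) {
            t.step[i] = short_step;
        }
    }
}

// Masked version: the comparison becomes a 0.0/1.0 mask and both outcomes
// are blended arithmetically, so every track executes the same instructions.
// Assumes all step values and short_step are finite.
void limit_step_masked(TrackSoA& t, double cut, double short_step) {
    const std::size_t n = t.energy.size();
    const double* energy = t.energy.data();
    double* step = t.step.data();
    for (std::size_t i = 0; i < n; ++i) {
        const double mask = static_cast<double>(energy[i] < cut);
        step[i] = mask * short_step + (1.0 - mask) * step[i];
    }
}
```

Both functions produce the same step lengths for finite inputs; whether the
masked form is actually faster depends on the compiler, the hardware, and how
often the condition holds, which is why profiling on each platform is needed.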
Primary Keyword: Parallelization
Secondary Keyword: High performance computing