Speaker
Description
SIMD acceleration can potentially boost application throughput by significant factors. However, achieving efficient SIMD vectorization for scalar code with complex data flow and branching logic goes well beyond breaking loop dependencies and relying on the compiler. Since the refactoring effort scales with the number of lines of code, it is important to understand what kind of performance gains can be expected in such complex cases. A couple of years ago, the GeantV R&D started a top-to-bottom vectorization approach to particle transport simulation. Percolating multiple data to algorithms was mandatory, since not all components offer natural internal vectorization capability. Vectorizing low-level algorithms such as position/direction geometry classifiers or field propagators was certainly necessary, but not sufficient to achieve relevant SIMD gains. Overheads for maintaining the concurrent vector data flow and for data copying had to be minimized. GeantV developed a framework that allows different categories of scalar and vectorized components to co-exist, dealing with data flow management and real-time heuristic optimizations. The paper will describe our approach to coordinating SIMD vectorization at the framework level, with a detailed quantitative analysis of the SIMD gain versus overheads, broken down by component in terms of geometry, physics and magnetic field propagation. The more general context of the GeantV work and goals for 2018 will also be presented.
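
As a rough illustration of how scalar and vectorized components might co-exist behind a common data flow, the following minimal C++ sketch gathers tracks into a structure-of-arrays basket, hands full baskets to a vector-friendly handler, and falls back to a scalar handler for the remainder. All names here (`Basket`, `Dispatcher`, `propagateScalar`, `propagateVector`) are hypothetical and are not GeantV interfaces, and the straight-line step `x + dx * step` is only a toy stand-in for a real propagator.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical structure-of-arrays container: each field is contiguous,
// so a loop over tracks is a candidate for compiler auto-vectorization.
struct Basket {
    std::vector<double> x, dx, step;
    std::size_t size() const { return x.size(); }
    void push(double x0, double dir, double s) {
        x.push_back(x0);
        dx.push_back(dir);
        step.push_back(s);
    }
    void clear() { x.clear(); dx.clear(); step.clear(); }
};

// Scalar handler: processes one track per call.
inline double propagateScalar(double x, double dx, double step) {
    return x + dx * step;
}

// Vector handler: processes a whole basket in one branch-free loop.
inline void propagateVector(const Basket& b, std::vector<double>& out) {
    out.resize(b.size());
    for (std::size_t i = 0; i < b.size(); ++i) {
        out[i] = b.x[i] + b.dx[i] * b.step[i];
    }
}

// Toy dispatcher: accumulates tracks and flushes through the vector handler
// once the basket reaches the threshold; leftovers take the scalar path.
class Dispatcher {
public:
    explicit Dispatcher(std::size_t threshold)
        : threshold_(threshold == 0 ? 1 : threshold) {}

    void add(double x, double dx, double step) {
        basket_.push(x, dx, step);
        if (basket_.size() >= threshold_) flushVector();
    }

    // Process whatever is left; too few tracks to justify a vector call.
    void finish() {
        for (std::size_t i = 0; i < basket_.size(); ++i) {
            results_.push_back(
                propagateScalar(basket_.x[i], basket_.dx[i], basket_.step[i]));
            ++scalarCalls_;
        }
        basket_.clear();
    }

    const std::vector<double>& results() const { return results_; }
    std::size_t vectorFlushes() const { return vectorFlushes_; }
    std::size_t scalarCalls() const { return scalarCalls_; }

private:
    void flushVector() {
        std::vector<double> out;
        propagateVector(basket_, out);
        results_.insert(results_.end(), out.begin(), out.end());
        ++vectorFlushes_;
        basket_.clear();
    }

    std::size_t threshold_;
    Basket basket_;
    std::vector<double> results_;
    std::size_t vectorFlushes_ = 0;
    std::size_t scalarCalls_ = 0;
};
```

In this sketch the copies into and out of the basket are exactly the kind of overhead the abstract says had to be minimized: the gain from the vector loop only pays off if gathering and scattering the data costs less than the time saved. The fixed threshold is the simplest possible flushing policy and stands in for the heuristic optimizations mentioned above.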