Description
In this presentation, we consider how a physics application may be restructured to take better advantage of vectorization and multithreading. For vectorization, we focus on the Matriplex concept, which is used to implement parallel Kalman filtering in mkFit, our collaboration's particle tracking R&D project. Drastic changes to data structures and loops were required to help the compiler find the SIMD opportunities in the algorithm. For multithreading, we examine how binning detector hits and tracks in an abstraction of the detector geometry enabled track candidates to be processed in bunches. We conclude by looking at how Intel VTune and Advisor, together with simple test codes, helped identify and resolve trouble spots that affected performance. The mkFit code is now part of the production software for CMS in LHC Run 3.
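To make the vectorization idea concrete, here is a minimal sketch of a Matriplex-style data layout, written under stated assumptions rather than taken from mkFit. The name `MiniPlex`, its members, and the `multiply` function are hypothetical. The sketch assumes that many small matrices of equal size are stored so that the same element (i, j) of all N matrices is contiguous in memory, which turns the innermost loop into a unit-stride loop over matrices that a compiler can map onto SIMD lanes.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical Matriplex-like container (illustrative, not the mkFit class).
// It holds N matrices of size D x D. Element (i, j) of matrix n lives at
// index (i * D + j) * N + n, so the matrix index n is the fastest-varying one.
template <typename T, std::size_t D, std::size_t N>
struct MiniPlex {
  std::array<T, D * D * N> store{};

  T& at(std::size_t i, std::size_t j, std::size_t n) {
    assert(i < D && j < D && n < N);
    return store[(i * D + j) * N + n];
  }

  const T& at(std::size_t i, std::size_t j, std::size_t n) const {
    assert(i < D && j < D && n < N);
    return store[(i * D + j) * N + n];
  }
};

// Computes c = a * b for all N matrix triples at once.
// The output c must not alias a or b.
template <typename T, std::size_t D, std::size_t N>
void multiply(const MiniPlex<T, D, N>& a,
              const MiniPlex<T, D, N>& b,
              MiniPlex<T, D, N>& c) {
  for (std::size_t i = 0; i < D; ++i) {
    for (std::size_t j = 0; j < D; ++j) {
      T* out = &c.store[(i * D + j) * N];
      for (std::size_t n = 0; n < N; ++n) {
        out[n] = T(0);
      }
      for (std::size_t k = 0; k < D; ++k) {
        const T* lhs = &a.store[(i * D + k) * N];
        const T* rhs = &b.store[(k * D + j) * N];
        // Unit-stride loop over the N matrices: the same scalar operation
        // is applied to every matrix, which is the SIMD-friendly part.
        for (std::size_t n = 0; n < N; ++n) {
          out[n] += lhs[n] * rhs[n];
        }
      }
    }
  }
}
```

In a sketch like this, N would typically be chosen as a multiple of the SIMD width of the target hardware (for example 8 or 16 single-precision values), and each of the N slots would hold the matrix belonging to a different track candidate. Whether the compiler actually vectorizes the inner loop depends on the optimization flags and the target architecture.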