In the era of the High-Luminosity Large Hadron Collider (HL-LHC), one of the most computationally challenging problems is expected to be finding and fitting particle tracks during event reconstruction. The algorithms currently in use at the LHC are based on Kalman filter techniques, which are known to be robust and provide good physics performance. Given the need for improved computational performance, we explore Kalman-filter-based methods for track finding and fitting that are specially adapted for many-core SIMD and SIMT architectures, since processors of this type are becoming increasingly dominant in high-performance hardware.
For both track fitting and track building, our adapted Kalman filter software has obtained significant parallel speedups on Intel Xeon, Intel Xeon Phi, and (to a limited degree) NVIDIA GPUs. Our prior reports, however, focused mainly on simulations of artificial events taking place inside an idealized barrel detector composed of concentric cylinders. In the current work, we shift focus to CMSSW-generated events taking place inside a geometrically accurate representation of the CMS-2017 tracker. To a large extent, the approaches that were previously developed for the idealized geometry have carried over to the more accurate case. For instance, groups of candidate tracks are still propagated to the average radius (or average axial distance) of the next detector layer; once the matching hits in that layer have been identified, candidate tracks are re-propagated to the exact hit locations and tested for viability. Special treatment is given to the overlap or transition region between barrel and endcaps, so that matching hits can be picked up from either area as required.
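The two-step matching described above (a coarse propagation to the layer's average radius, followed by re-propagation to each selected hit) can be illustrated with a minimal sketch. It is not the implementation discussed in this work: it assumes straight-line tracks in the transverse plane with no magnetic field, omits the Kalman update and all covariance handling, and replaces the viability test with a simple distance cut. The names `propagate_to_radius`, `match_hits`, `phi_window`, and `max_residual` are hypothetical and chosen only for illustration.

```python
import math


def propagate_to_radius(x0, y0, direction, radius):
    """Intersect a straight line starting at (x0, y0) with a circle of the
    given radius centred on the origin. Returns the forward intersection
    point, or None if the line never reaches that radius.
    Assumption: no magnetic field, so the track is a straight line."""
    dx, dy = math.cos(direction), math.sin(direction)
    b = x0 * dx + y0 * dy
    c = x0 * x0 + y0 * y0 - radius * radius
    disc = b * b - c
    if disc < 0.0:
        return None
    t = -b + math.sqrt(disc)  # larger root: the outward-going solution
    return (x0 + t * dx, y0 + t * dy)


def wrap_phi(dphi):
    """Map an angle difference into [-pi, pi)."""
    return (dphi + math.pi) % (2.0 * math.pi) - math.pi


def match_hits(candidates, hits, phi_window=0.05, max_residual=0.1):
    """Return (candidate index, hit index, residual) for every viable pair.

    candidates: list of (x0, y0, direction) tuples (hypothetical track state)
    hits:       list of (x, y) hit positions in one detector layer
    """
    # Step 1: propagate every candidate to the layer's average radius.
    avg_radius = sum(math.hypot(hx, hy) for hx, hy in hits) / len(hits)
    matches = []
    for ci, (x0, y0, direction) in enumerate(candidates):
        coarse = propagate_to_radius(x0, y0, direction, avg_radius)
        if coarse is None:
            continue
        phi_pred = math.atan2(coarse[1], coarse[0])
        for hi, (hx, hy) in enumerate(hits):
            # Coarse selection: keep hits inside an azimuthal window.
            if abs(wrap_phi(math.atan2(hy, hx) - phi_pred)) > phi_window:
                continue
            # Step 2: re-propagate to the exact radius of this hit.
            exact = propagate_to_radius(x0, y0, direction, math.hypot(hx, hy))
            if exact is None:
                continue
            # Stand-in viability test: plain distance cut, not a chi-square.
            residual = math.hypot(exact[0] - hx, exact[1] - hy)
            if residual < max_residual:
                matches.append((ci, hi, residual))
    return matches
```

Propagating to a single average radius lets a whole group of candidates share the same target surface in the first step, which is the property that makes the coarse step amenable to processing many candidates in lockstep; the per-hit re-propagation then restores the geometric accuracy that the averaging gave up.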
We summarize the key features of this software, including (1) the data structures and code constructs that facilitate vectorization and SIMT, and (2) the multiple levels of parallel loops that have been multithreaded using TBB. We demonstrate that, as compared to CMSSW, the present Kalman filter implementation is able to reconstruct events with comparable physics performance and generally better computational performance. We also discuss the current status of the software and the plans for its further development.