ITS GPU tracking
- General priorities:
- Focus on porting as much as possible to the device, extending the state of the art and minimising computation on the host.
- Optimisations via intelligent scheduling and multi-streaming can follow right after.
- Kernel-level optimisations to be investigated.
- Neighbour finding ported: (#13636)
- This gives our "first" non-deterministic effect, at the ~per-mille level: the ranking of the Cells is subject to concurrency.
- Deterministic mode now requires a <<<1,1>>> launch configuration to restore 1:1 agreement with the CPU.
- Will leave it as is for now; the overall impact and possible actions will be evaluated once the porting is finalised.
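A minimal host-side sketch of why the output order depends on concurrency, and why a single-thread launch restores the CPU ordering. This is not the O2 code: the `Neighbour` struct, the function names, and the use of a seeded shuffle to model the order in which threads reach the slot counter are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical stand-in for one neighbour entry; not the O2 data structure.
struct Neighbour {
  int cell;
  int next;
};
inline bool operator==(const Neighbour& a, const Neighbour& b)
{
  return a.cell == b.cell && a.next == b.next;
}
inline bool operator<(const Neighbour& a, const Neighbour& b)
{
  return a.cell != b.cell ? a.cell < b.cell : a.next < b.next;
}

// Fill the output by handing out slots from a running cursor, the way an
// atomic counter would on the device. `arrival` is the (modelled) order in
// which threads reach that counter.
std::vector<Neighbour> fillNeighbours(const std::vector<int>& arrival)
{
  std::vector<Neighbour> out(arrival.size());
  std::size_t cursor = 0;
  for (int cell : arrival) {
    out[cursor++] = Neighbour{cell, cell + 1};
  }
  return out;
}

// Single-thread launch: cells are visited strictly in index order.
std::vector<int> serialArrival(int nCells)
{
  assert(nCells >= 0);
  std::vector<int> order(static_cast<std::size_t>(nCells));
  std::iota(order.begin(), order.end(), 0);
  return order;
}

// Concurrent launch, modelled (assumption) as a seeded shuffle of the
// arrival order; on real hardware the order is set by the scheduler.
std::vector<int> concurrentArrival(int nCells, unsigned seed)
{
  std::vector<int> order = serialArrival(nCells);
  std::mt19937 rng(seed);
  std::shuffle(order.begin(), order.end(), rng);
  return order;
}
```

In this model the concurrent fill always produces the same set of entries as the serial one, only in a different order, so sorting the output after the fill would be one possible alternative to a single-thread launch. Whether that is cheaper in practice is left open here.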
- Cell finding ported: (#13653)
- No concurrency: deterministic out of the box.
- PbPb: already 1.5x faster than CPU with single-stream serialization and deterministic flags; further factors are expected from multi-streaming across the different layers.
- TODO:
- Tracklet finding to be ported back: work in progress right now.
- Reproducer for HIP bug on multi-threaded track fitting: no progress yet.
- Move more of the tricky track-finding steps to the GPU: no progress yet.
- Fix possible execution issues and known discrepancies when using gpu-reco-workflow: no progress yet; will start after the tracklet finding is ported.
DCAFitterGPU
- Deterministic approach using SMatrixGPU on the host, under a particular configuration: no progress yet.