Eric:
An investigation of CUDA launch strategies showed that calling the CUDA runtime from a single thread is a must for performance: kernel launches take 5x longer when issued from multiple threads, and host function launches even 100x longer. I'll prepare the full presentation for next week.
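The minutes do not say how launches would be funneled onto one thread. One common pattern, sketched here purely as an assumption, is a dedicated launcher thread fed by a queue. The names `SingleThreadLauncher` and `run_demo` are hypothetical, and a plain callable stands in for the real CUDA call so the sketch compiles without the CUDA toolkit.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <set>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical helper: any thread may submit work, but every task is
// executed on one dedicated launcher thread.
class SingleThreadLauncher {
public:
    SingleThreadLauncher() : worker_([this] { run(); }) {}

    ~SingleThreadLauncher() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        cv_.notify_one();
        worker_.join();  // the worker drains the queue before it exits
    }

    // Thread-safe: enqueue a task for the launcher thread.
    void submit(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
        }
        cv_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stopping and nothing left to run
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            // Assumption: in real code this would be the kernel launch or
            // host function launch issued through the CUDA runtime.
            task();
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> tasks_;
    bool stopping_ = false;
    std::thread worker_;  // declared last so the other members exist first
};

struct DemoResult {
    std::size_t executed;          // number of tasks that ran
    std::size_t distinct_threads;  // number of threads that ran them
};

// Several producer threads submit placeholder "launches"; the result
// records how many distinct threads actually executed them.
DemoResult run_demo(int producers, int tasks_per_producer) {
    assert(producers >= 0 && tasks_per_producer >= 0);
    std::set<std::thread::id> ids;  // only modified on the launcher thread
    std::size_t executed = 0;       // only modified on the launcher thread
    {
        SingleThreadLauncher launcher;
        std::vector<std::thread> threads;
        for (int p = 0; p < producers; ++p) {
            threads.emplace_back([&] {
                for (int i = 0; i < tasks_per_producer; ++i) {
                    launcher.submit([&] {
                        ids.insert(std::this_thread::get_id());
                        ++executed;
                    });
                }
            });
        }
        for (auto& t : threads) t.join();
    }  // launcher destructor runs the remaining tasks and joins
    return {executed, ids.size()};
}
```

Calling `run_demo(4, 100)` submits work from four producer threads while all tasks execute on the single launcher thread. Whether a queue like this reproduces the measured speedup is not something the sketch can show; it only illustrates the threading structure.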
Oliver:
Summer Student
- ALICE summer student has arrived: Milla Bramsted
- She is working on benchmarking SoA code on GPUs.
- We will track her project in this Google Doc.
- She will add CUDA kernels to this repo.
ALICE O2 CI-Pipelines on NGT Cluster
- A fork of the AliceO2 repo is now in the NextGenTriggers (NGT) GitHub organization.
- It has a GitHub action running the standalone benchmark on NGT GPUs (H100).
- It uses the builds of O2 and its dependencies from /cvmfs/alice.cern.ch/ (the pipeline takes about 7 minutes).
