-> With both fixes: Clusterizer now works in OpenCL
Lots of helper functions written for compile-time recursion, e.g.:
GPUd() static constexpr uint32_t factorial(const uint32_t n) { return (n == 0) || (n == 1) ? 1 : n * factorial(n - 1); }
Fine for CUDA / HIP, but OpenCL C++ prohibits recursion. Trivial to fix in C++20 with consteval, but OpenCL C++ is still based on C++17, so use templates instead for recursion: https://github.com/AliceO2Group/AliceO2/pull/14462
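Minimal sketch of the template approach, for illustration only (not the code from the PR; the name Factorial is hypothetical and the GPUd() macro is left out so the snippet stands alone). The recursion moves from function calls into template instantiation, so no recursive function is left for the OpenCL compiler to reject. It assumes n is known at compile time:

```cpp
#include <cstdint>

// Hypothetical helper, not taken from the PR: the recursion happens in
// template instantiation, so the compiler never sees a recursive function.
template <uint32_t N>
struct Factorial {
  static constexpr uint32_t value = N * Factorial<N - 1>::value;
};

// Base case ends the chain of instantiations.
template <>
struct Factorial<0> {
  static constexpr uint32_t value = 1;
};

static_assert(Factorial<0>::value == 1, "0! == 1");
static_assert(Factorial<5>::value == 120, "5! == 120");
```

Call sites change from factorial(n) to Factorial<n>::value, which only works where n is a constant expression.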
-> Waiting for feedback from TPC
PoCL rejects recursive functions, but only tells you where the function is called, not which recursive function it found. E.g. you get errors like this:
Recursion detected in function: '_ZNU3AS42o23gpu7GPUdEdx11fillClusterEffihffRU3AS4KNS0_23GPUCalibObjectsTemplateINS0_8ConstPtrEEEfff'
Made debugging the issue in TPCFastTransform very tedious.
Patched PoCL to demangle C++ symbols + print infringing function. So errors now look like this:
Recursion detected in function: 'o2::gpu::GPUdEdx::fillCluster(float, float, int, unsigned char, float, float, o2::gpu::GPUCalibObjectsTemplate<o2::gpu::ConstPtr> const&, float, float, float)'-> Infriging function: 'o2::gpu::MultivariatePolynomialParametersHelper::factorial(unsigned int)'
(Side note: LLVM demangler can't demangle OpenCL symbols because of address space qualifiers in the mangled name. They have the form 'U1AS1'. -> Hack: prune qualifiers before demangling...)
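Rough sketch of the pruning hack, for illustration only (not the actual PoCL patch; both function names are hypothetical). It assumes the qualifiers are Itanium vendor qualifiers of the form U<len>AS<n> and that dropping them leaves a valid mangled name. That holds for the symbol above but not in general, since removing a qualifier can shift substitution indices. The demangling step uses abi::__cxa_demangle from GCC/Clang's <cxxabi.h> rather than the LLVM demangler:

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Hypothetical helper: remove vendor qualifiers of the form U<len>AS<n>,
// e.g. "U3AS4", from a mangled name. This is a plain text scan, so an
// identifier that happens to contain such a pattern would be pruned as well.
std::string pruneAddressSpaceQualifiers(const std::string& mangled)
{
  std::string out;
  std::size_t i = 0;
  while (i < mangled.size()) {
    if (mangled[i] == 'U') {
      // Parse the length prefix that follows 'U'.
      std::size_t j = i + 1;
      std::size_t len = 0;
      while (j < mangled.size() && std::isdigit(static_cast<unsigned char>(mangled[j]))) {
        len = len * 10 + static_cast<std::size_t>(mangled[j] - '0');
        ++j;
      }
      // The qualifier name must be exactly len characters: "AS" plus digits.
      if (j > i + 1 && len > 2 && len <= mangled.size() - j && mangled.compare(j, 2, "AS") == 0) {
        bool digitsOnly = true;
        for (std::size_t k = j + 2; k < j + len; ++k) {
          digitsOnly = digitsOnly && std::isdigit(static_cast<unsigned char>(mangled[k]));
        }
        if (digitsOnly) {
          i = j + len; // skip the whole qualifier
          continue;
        }
      }
    }
    out += mangled[i++];
  }
  return out;
}

// Hypothetical helper: prune, then demangle; returns the input unchanged
// if the demangler rejects the pruned name.
std::string demangleOpenCLSymbol(const std::string& mangled)
{
  const std::string pruned = pruneAddressSpaceQualifiers(mangled);
  int status = 0;
  char* result = abi::__cxa_demangle(pruned.c_str(), nullptr, nullptr, &status);
  if (status != 0 || result == nullptr) {
    return mangled;
  }
  std::string demangled(result);
  std::free(result); // buffer is malloc'ed by __cxa_demangle
  return demangled;
}
```

Calling demangleOpenCLSymbol() on the mangled name from the error message above drops both U3AS4 qualifiers before the name reaches the demangler.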
Couple of other issues that make debugging harder in PoCL:
Passing -cl-opt-disable to the PoCL compiler -> crash in clusterizer kernel