Downsampling
Downsampled across the row and pad dimensions via histogram binning. Afterwards, downsampled the clusters along pT using a modified Tsallis distribution.
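A minimal sketch of what the row/pad step could look like, assuming "histogram binning" means capping the number of samples per (row, pad) bin. The function name `downsample_flat`, the bin counts, and the cap `max_per_bin` are hypothetical, not taken from the actual code. The pT step is not sketched, since the form of the modified Tsallis distribution is not given here.

```python
import numpy as np

def downsample_flat(row, pad, bins=(8, 8), max_per_bin=50, seed=0):
    """Return sorted indices of kept samples, at most max_per_bin per (row, pad) bin.

    Hypothetical helper: names and defaults are assumptions for illustration.
    """
    rng = np.random.default_rng(seed)
    row = np.asarray(row, dtype=float)
    pad = np.asarray(pad, dtype=float)

    # Equal-width bin edges spanning the data range in each dimension
    row_edges = np.linspace(row.min(), row.max(), bins[0] + 1)
    pad_edges = np.linspace(pad.min(), pad.max(), bins[1] + 1)

    # Bin index per sample; clip so the maximum value lands in the last bin
    i = np.clip(np.digitize(row, row_edges) - 1, 0, bins[0] - 1)
    j = np.clip(np.digitize(pad, pad_edges) - 1, 0, bins[1] - 1)
    flat = i * bins[1] + j  # single integer label per 2D bin

    keep = []
    for b in np.unique(flat):
        idx = np.flatnonzero(flat == b)
        if idx.size > max_per_bin:
            # Overpopulated bin: keep a random subset without replacement
            idx = rng.choice(idx, size=max_per_bin, replace=False)
        keep.append(idx)
    return np.sort(np.concatenate(keep))
```

The returned indices can be used to select the same rows from every training array, so features and labels stay aligned.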
After the downsampling, NN training suddenly started producing exploding gradients, for reasons not yet understood. There could be NaNs / infs in the training data -> investigating...
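One quick way to test the NaN / inf hypothesis is to count non-finite entries per input column. The sketch below assumes the training data can be loaded as a 2D NumPy array (samples x features); `report_non_finite` and the column names are hypothetical.

```python
import numpy as np

def report_non_finite(features, names=None):
    """Count NaN/inf entries per column and list the affected rows.

    Hypothetical helper: assumes a 2D array of shape (samples, features).
    """
    x = np.asarray(features, dtype=float)
    bad = ~np.isfinite(x)        # True where the value is NaN, +inf or -inf
    counts = bad.sum(axis=0)     # number of bad entries per column
    if names is None:
        names = [f"col{i}" for i in range(x.shape[1])]
    per_column = {n: int(c) for n, c in zip(names, counts) if c > 0}
    bad_rows = np.flatnonzero(bad.any(axis=1))  # rows with at least one bad value
    return per_column, bad_rows
```

An empty dictionary would rule out non-finite inputs and point towards something else, for example a changed value range of the inputs after the downsampling.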
Code implementation
The implementation in the GPU kernel code is now working, although so far only in the CPU version; the framework for running it on the GPU is still needed. It is written exactly like a usual GPU kernel, though, so this is very promising. An initial test on 1Ev 50kHz PbPb suggests that the NN is not a bottleneck. The current implementation uses one instance of the NN per sector, executed single-threaded. Data generation happens one instance at a time, so we don't need to accumulate the clusters first and then execute, which saves a significant amount of memory.
Clusters are currently not being published. The cause is not yet known, but this should be easy to fix...