Framework
- Added GPU timer to ONNX inference for profiling
- Added deconvolution flags to NN inference for exact matching with GPU CF
Physics
- Cluster attachment efficiency vs. fake rate for different network inputs and thresholds
- Attachment efficiency = (clusters correctly attached by NN / total clusters) * (clusters correctly attached by NN / clusters correctly attached by GPU CF)
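As a quick cross-check of the definition above, a minimal sketch in Python. The function name, argument names, and the example counts are hypothetical and only illustrate the arithmetic; they are not measured values.

```python
def attachment_efficiency(n_correct_nn: int, n_total: int, n_correct_cf: int) -> float:
    """Attachment efficiency as defined in the notes:

    (correctly attached NN / total) * (correctly attached NN / correctly attached GPU CF)
    """
    if n_total <= 0 or n_correct_cf <= 0:
        raise ValueError("denominator counts must be positive")
    return (n_correct_nn / n_total) * (n_correct_nn / n_correct_cf)


# Illustrative counts only (assumption, not taken from the plots).
eff = attachment_efficiency(n_correct_nn=900, n_total=1000, n_correct_cf=1000)
print(f"attachment efficiency = {eff:.3f}")
```

Note that the second factor exceeds 1 whenever the network attaches more clusters correctly than the GPU CF does, so the product is not bounded by 1.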

- Network outperforms GPU CF for all thresholds and input sizes. The choice of threshold is determined by the number of correctly attached clusters (next plot)
- Significant benefits from using 3D networks
- Number of correctly attached clusters

- Threshold choice for classification network:
- <= 0.01: Almost no loss in number of correctly attached clusters
- > 0.01 and < 0.1: At most 5% loss of correctly attached clusters, but up to 18% fewer total clusters (see next plot)
- Number of total clusters
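The threshold trade-off described above can be sketched as a simple scan. This assumes that a cluster is rejected when its classification score is below the threshold; `threshold_scan`, the scores, and the labels are hypothetical toy inputs, not the real data.

```python
def threshold_scan(scores, is_correct, thresholds):
    """For each threshold return a tuple of
    (threshold, fraction of all clusters removed, fraction of correctly attached clusters lost).
    """
    n_total = len(scores)
    n_correct = sum(is_correct)
    rows = []
    for t in thresholds:
        # Assumption: clusters with a score below the threshold are rejected.
        kept = [s >= t for s in scores]
        n_kept = sum(kept)
        n_correct_kept = sum(1 for k, c in zip(kept, is_correct) if k and c)
        rows.append((t, 1 - n_kept / n_total, 1 - n_correct_kept / n_correct))
    return rows


# Toy classification scores and truth labels (illustrative only).
scores = [0.001, 0.005, 0.05, 0.2, 0.6, 0.9, 0.95, 0.99]
is_correct = [False, False, True, False, True, True, True, True]

rows = threshold_scan(scores, is_correct, thresholds=[0.01, 0.1])
for t, removed, lost in rows:
    print(f"threshold {t}: {removed:.1%} of clusters removed, {lost:.1%} of correct clusters lost")
```

Running the same scan on the real network output would give the two curves (correctly attached clusters and total clusters) as a function of threshold.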

- CoG (pad) resolution as a function of occupancy for different network sizes (2 to 5 hidden layers; 16, 32, 64, 128 neurons per layer)

- More layers work better
- Network with 5 layers and 128 neurons per layer (L5, N128) performs poorly. Reason: overtraining, immediately visible in the logs. Validation loss goes up while training loss goes down or stays constant.
-> Improvement for the future: Save the network at the best training loss, at the best validation loss, and after all epochs are done.
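A minimal, framework-independent sketch of the proposed checkpointing. `CheckpointTracker` and the loss values are hypothetical; in practice `state` would be the network weights (for example a state dict) and the stored snapshots would be written to disk.

```python
import copy


class CheckpointTracker:
    """Keeps three snapshots: best training loss, best validation loss, last epoch."""

    def __init__(self):
        self.snapshots = {}  # name -> {"epoch", "loss", "state"}

    def _store(self, name, epoch, loss, state):
        # Deep copy so that later training steps do not modify the snapshot.
        self.snapshots[name] = {"epoch": epoch, "loss": loss, "state": copy.deepcopy(state)}

    def update(self, epoch, train_loss, val_loss, state):
        best_train = self.snapshots.get("best_train")
        if best_train is None or train_loss < best_train["loss"]:
            self._store("best_train", epoch, train_loss, state)
        best_val = self.snapshots.get("best_val")
        if best_val is None or val_loss < best_val["loss"]:
            self._store("best_val", epoch, val_loss, state)
        # The last epoch is always stored, regardless of the loss.
        self._store("last", epoch, train_loss, state)


# Toy loss curves showing overtraining: validation loss rises after epoch 2.
train_losses = [1.0, 0.6, 0.4, 0.3, 0.25]
val_losses = [1.1, 0.7, 0.5, 0.6, 0.8]

tracker = CheckpointTracker()
for epoch, (tl, vl) in enumerate(zip(train_losses, val_losses)):
    weights = {"epoch_tag": epoch}  # stand-in for the real network weights
    tracker.update(epoch, tl, vl, weights)

for name, snap in tracker.snapshots.items():
    print(name, "epoch", snap["epoch"], "loss", snap["loss"])
```

With overtraining, the "best_val" snapshot comes from an earlier epoch than "best_train" and "last", which makes the effect easy to spot and to recover from.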
A cool thing to look at... and more
Neural network loss landscape:
- Take 1024 elements from the training data sample and perform PCA
- PCA finds the axes that maximise the variance of the data projected onto them, i.e. the most relevant axes to describe the data
- Take first two principal components and add them to the data as X_new = X + a*PCA1 + b*PCA2, where a and b are scale factors
- Choose a regular grid for a and b of arbitrary size; for each grid point, calculate the loss between the training data output (targets) and the network output for X_new
- Color = z = MSE loss; Smoothed with cubic splines, 400 grid points per direction


- Interpretation: In this two-dimensional slice the loss landscape is (almost) convex! This suggests that networks are (almost) guaranteed to land in the global optimum, which makes the method reliable.
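The scan described in the steps above can be sketched with NumPy. Everything specific here is an assumption for illustration: the random toy data with 8 features, the fixed linear function standing in for the trained network, and the 21 x 21 grid. The cubic-spline smoothing and the plotting are left out.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins (assumptions): 1024 samples with 8 features and a fixed linear "network".
X = rng.normal(size=(1024, 8)) * np.linspace(2.0, 0.5, 8)
w = rng.normal(size=8)


def network(x):
    """Placeholder for the trained network's forward pass."""
    return x @ w


Y = network(X)  # training targets; the toy network reproduces them exactly

# PCA via SVD of the centred data: the rows of Vt are the principal axes,
# ordered by decreasing explained variance.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
pc1, pc2 = Vt[0], Vt[1]

# Regular grid of scale factors a and b.
a_vals = np.linspace(-1.0, 1.0, 21)
b_vals = np.linspace(-1.0, 1.0, 21)
loss = np.empty((a_vals.size, b_vals.size))
for i, a in enumerate(a_vals):
    for j, b in enumerate(b_vals):
        X_new = X + a * pc1 + b * pc2  # shift every sample along the two axes
        loss[i, j] = np.mean((network(X_new) - Y) ** 2)  # MSE loss = z value

i_min, j_min = np.unravel_index(np.argmin(loss), loss.shape)
print("minimum at a =", a_vals[i_min], ", b =", b_vals[j_min])
```

For the linear toy model the surface is an exact quadratic in a and b, so it is convex by construction; the interesting case is replacing `network` with the real trained model and `Y` with the real targets.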
Addendum
