Framework

Physics

Cluster attachment efficiency vs. fake rate for different network inputs and thresholds
- Attachment efficiency = (correctly attached clusters NN / total clusters) * (correctly attached clusters NN / correctly attached clusters GPU CF)
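
As a minimal sketch, the definition above can be written as a small function. The function name, argument names and the counts in the example are hypothetical and only illustrate the arithmetic; they are not taken from the actual analysis code.

```python
def attachment_efficiency(n_correct_nn: int, n_total: int, n_correct_gpu_cf: int) -> float:
    """Attachment efficiency as defined in the notes:
    (correct NN / total) * (correct NN / correct GPU CF)."""
    if n_total <= 0 or n_correct_gpu_cf <= 0:
        raise ValueError("denominators must be positive")
    return (n_correct_nn / n_total) * (n_correct_nn / n_correct_gpu_cf)


# Hypothetical counts, for illustration only
eff = attachment_efficiency(n_correct_nn=900, n_total=1000, n_correct_gpu_cf=950)
print(f"attachment efficiency = {eff:.3f}")
```

The second factor normalises the network to the GPU CF, so a value above 1 in that factor means the network attaches more clusters correctly than the GPU CF does.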

The network outperforms the GPU CF at all thresholds and input sizes. The choice of threshold is determined by the number of correctly attached clusters (see next plot)
Significant benefits from using 3D networks
Number of correctly attached clusters

Threshold choice for classification network:
- <= 0.01: Almost no loss in number of correctly attached clusters
- >0.01 && <0.1: Loses at most 5% of correctly attached clusters, but can save 18% of total clusters (see next plot)
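
A rough sketch of how such a threshold could act on the classification output. It assumes (this is not stated in the notes) that the network gives one score in [0, 1] per cluster candidate and that candidates at or below the threshold are rejected. The function names and scores are hypothetical.

```python
def apply_threshold(scores, threshold):
    """Indices of cluster candidates whose score exceeds the threshold (assumed convention)."""
    return [i for i, s in enumerate(scores) if s > threshold]


def fraction_saved(scores, threshold):
    """Fraction of cluster candidates rejected by the threshold."""
    kept = apply_threshold(scores, threshold)
    return 1.0 - len(kept) / len(scores)


# Hypothetical network scores, for illustration only
scores = [0.001, 0.005, 0.05, 0.2, 0.9, 0.99, 0.7, 0.03]
for thr in (0.01, 0.1):
    print(f"threshold {thr}: {fraction_saved(scores, thr):.0%} of candidates rejected")
```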
Number of total clusters

CoG (pad) resolution as a function of occupancy for different network sizes (2 to 5 hidden layers; 16, 32, 64, 128 neurons per layer)

More layers work better
Network with L5 and N128 (5 hidden layers, 128 neurons per layer) -> Poor performance. Reason: overtraining, immediately visible in the logs: the validation loss goes up while the training loss goes down or remains constant.
-> Improvement for the future: Save the network at the best training loss, at the best validation loss, and after all epochs are done.
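
A minimal sketch of the checkpoint selection proposed above, operating on per-epoch loss lists. The function name and the loss values are hypothetical. A real training loop would instead write the weights to disk whenever a new minimum is reached.

```python
def select_checkpoints(train_losses, val_losses):
    """Return the epoch indices of the three networks worth keeping."""
    if not train_losses or len(train_losses) != len(val_losses):
        raise ValueError("need non-empty loss lists of equal length")
    epochs = range(len(train_losses))
    return {
        "best_train": min(epochs, key=train_losses.__getitem__),
        "best_val": min(epochs, key=val_losses.__getitem__),
        "final": len(train_losses) - 1,
    }


# Hypothetical losses showing overtraining: validation loss rises after epoch 2
train = [1.00, 0.60, 0.40, 0.30, 0.25, 0.22]
val = [1.05, 0.70, 0.55, 0.58, 0.66, 0.75]
checkpoints = select_checkpoints(train, val)
print(checkpoints)
```

In this toy case the best-validation checkpoint comes from an earlier epoch than the final network, which is the situation described for the L5, N128 network.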

A cool thing to look at... and more

Neural network loss landscape:

Take 1024 elements from the training data sample and perform a PCA
- PCA finds the axes that maximise the variance of the data projected onto them, i.e. the most relevant axes to describe the data
Take the first two principal components and add them to the data as X_new = X + a*PCA1 + b*PCA2, where a and b are scale factors
Choose a regular grid for a and b of arbitrary size and calculate the loss at each grid point from the training data output (target) and the network output
Color = z = MSE loss; smoothed with cubic splines, 400 grid points per direction

Interpretation: Within this 2D slice, the loss landscape is (almost) convex! This suggests that networks (almost) always land in the global optimum, which makes the method reliable.

Addendum