Description
Neural networks are a powerful tool for an ever-growing list of tasks. However, their enormous complexity often complicates developing theories of how these networks learn. In our recent work, inspired by the development of statistical mechanics, we have studied collective variables, specifically the von Neumann entropy and the trace of the empirical neural tangent kernel (NTK), as a way to explain how neural networks learn. We show that the entropy and trace of the NTK at the start of training can indicate the diversity of the training data and even predict the quality of the model after training. Further work applies these variables to better understand network dynamics, including the design of optimizers for improved training and the construction of better network architectures.
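
To make the quantities concrete, the following is a minimal sketch of how the empirical NTK of a small network, together with its trace and von Neumann entropy, could be computed in JAX. The two-layer MLP, the random input batch, and the trace-normalisation used to define the entropy are illustrative assumptions, not details fixed by the talk.

```python
import jax
import jax.numpy as jnp
from jax.flatten_util import ravel_pytree

# Hypothetical two-layer MLP with a scalar output; the talk does not fix an architecture.
def init_params(key, d_in=8, d_hidden=32):
    k1, k2 = jax.random.split(key)
    return {
        "w1": jax.random.normal(k1, (d_in, d_hidden)) / jnp.sqrt(d_in),
        "b1": jnp.zeros(d_hidden),
        "w2": jax.random.normal(k2, (d_hidden, 1)) / jnp.sqrt(d_hidden),
        "b2": jnp.zeros(1),
    }

def apply_fn(params, x):
    h = jnp.tanh(x @ params["w1"] + params["b1"])
    return (h @ params["w2"] + params["b2"]).squeeze(-1)

def empirical_ntk(params, x):
    """K[i, j] = <df(x_i)/dtheta, df(x_j)/dtheta> at the current parameters."""
    theta0, unravel = ravel_pytree(params)

    def f_single(theta, xi):
        return apply_fn(unravel(theta), xi[None])[0]

    # Per-example parameter gradients stacked into a (n_samples, n_params) Jacobian.
    jac = jax.vmap(jax.grad(f_single), in_axes=(None, 0))(theta0, x)
    return jac @ jac.T

def ntk_observables(kernel):
    """Trace and von Neumann entropy of the trace-normalised kernel."""
    eigvals = jnp.linalg.eigvalsh(kernel)
    trace = jnp.sum(eigvals)
    rho = jnp.clip(eigvals / trace, 1e-12, None)  # density-matrix-like normalisation
    entropy = -jnp.sum(rho * jnp.log(rho))
    return trace, entropy

key = jax.random.PRNGKey(0)
params = init_params(key)
x = jax.random.normal(key, (64, 8))  # stand-in for a batch of training data
trace, entropy = ntk_observables(empirical_ntk(params, x))
print(f"Tr(K) = {trace:.3f}, S = {entropy:.3f}")
```

Evaluated on a training batch at initialisation, these two numbers are the collective variables discussed in the abstract; normalising the kernel by its trace before taking the entropy is one common convention and is assumed here for illustration.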