Neural networks, and in recent years deep neural networks in particular, are attractive candidates for machine learning problems in high energy physics because they can act as universal approximators. With a properly defined objective function and sufficient training data, neural networks can approximate functions for which physicists lack sufficient insight to derive an analytic, closed-form solution. There are, however, a number of challenges that can prevent a neural network from achieving an adequate approximation of the desired function. Chief among them is that there is currently no fundamental understanding of the network size, in terms of both the number of layers and the number of nodes per layer, needed to approximate a given function. Networks that are too small are doomed to fail, while networks that are too large often struggle to converge to an acceptable solution or suffer from overtraining. To build some intuition, we performed a study of neural network approximations of functions known to be relevant to high energy physics, such as calculating the invariant mass from momentum four-vector components or calculating the momentum four-vector of a parent particle from the four-vectors of its decay products. We report the results of those studies and discuss possible future directions.
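
As a concrete example of the kind of target function involved, the invariant mass of a parent particle follows from the summed four-vectors of its decay products, m = sqrt((E1 + E2)^2 - |p1 + p2|^2) in natural units, so a network must implicitly learn both the vector sum and the nonlinear square root. The following minimal sketch (an assumed PyTorch setup; the architecture, sample sizes, and training details are illustrative choices, not those used in the study) shows how such an approximation experiment can be framed as a regression from the eight daughter four-vector components to the parent mass:

    # Illustrative sketch only: train a small fully connected network to learn
    # the two-body invariant mass m = sqrt((E1+E2)^2 - |p1+p2|^2)
    # from the eight four-vector components of the decay products.
    import numpy as np
    import torch
    import torch.nn as nn

    rng = np.random.default_rng(0)

    def make_batch(n):
        # Random massless daughters: E = |p|, so every input is physical.
        p = rng.normal(size=(n, 2, 3))
        e = np.linalg.norm(p, axis=2, keepdims=True)
        four = np.concatenate([e, p], axis=2).reshape(n, 8)  # (E,px,py,pz) x 2
        e_tot = four[:, 0] + four[:, 4]
        p_tot = four[:, 1:4] + four[:, 5:8]
        # Clamp at zero to guard against tiny negative values from round-off.
        m = np.sqrt(np.maximum(e_tot**2 - (p_tot**2).sum(axis=1), 0.0))
        return (torch.tensor(four, dtype=torch.float32),
                torch.tensor(m, dtype=torch.float32).unsqueeze(1))

    # Illustrative architecture: depth and width are exactly the quantities
    # such a study would vary.
    model = nn.Sequential(nn.Linear(8, 64), nn.ReLU(),
                          nn.Linear(64, 64), nn.ReLU(),
                          nn.Linear(64, 1))
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()

    for step in range(2000):
        x, y = make_batch(256)
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
        if step % 500 == 0:
            print(f"step {step}: mse = {loss.item():.4f}")

Scanning the depth and width of model while monitoring training and validation loss is the kind of experiment the abstract describes: under-sized networks plateau at a high residual error, while over-sized ones can converge slowly or overtrain.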