Speaker
Description
We propose a novel neural architecture that enforces an upper bound on the Lipschitz constant of the network by constraining the norm of its gradient with respect to the inputs. This architecture has been used to develop new algorithms for the LHCb trigger that come with robustness guarantees and with powerful inductive biases, since the network can be made monotonic in any subset of its features. A new and interesting direction is the use of this architecture to estimate the Wasserstein metric (also known as the Earth Mover’s Distance) in optimal transport via the Kantorovich-Rubinstein duality. In this talk, I will describe how such architectures can be leveraged to develop new clustering algorithms based on the Energy Mover’s Distance. Clustering with optimal transport generalizes all previous well-known clustering algorithms in HEP (anti-kt, Cambridge-Aachen, etc.) to arbitrary geometries and offers new flexibility in dealing with effects such as pile-up and unconventional topologies. I will also discuss in detail how this flexibility can be used to develop new algorithms that are better suited to the Electron-Ion Collider setting than conventional ones.
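To make the idea of a Lipschitz-bounded, monotonic network concrete, here is a minimal NumPy sketch. It is not the architecture from the talk. It assumes one possible construction: each weight matrix is rescaled so that its induced infinity-norm stays below a bound, a pairwise sorting activation (which is 1-Lipschitz) is used between layers, and a linear term `lam * x_i` is added for every feature that should be monotonic. The class name `LipschitzMonotonicNet`, its parameters, and the choice of norm are all hypothetical, and the weights are random rather than trained.

```python
import numpy as np


def normalize_inf(W, bound=1.0):
    """Rescale W so its induced infinity-norm (max absolute row sum) is <= bound."""
    norm = np.abs(W).sum(axis=1).max()
    return W if norm <= bound else W * (bound / norm)


def group_sort(h):
    """Sort activations within consecutive pairs; a 1-Lipschitz activation."""
    pairs = h.reshape(*h.shape[:-1], -1, 2)
    return np.sort(pairs, axis=-1).reshape(h.shape)


class LipschitzMonotonicNet:
    """Illustrative network (hypothetical name), untrained random weights."""

    def __init__(self, n_in, width=16, lam=1.0, monotonic=(), seed=0):
        assert width % 2 == 0, "pairwise sorting needs an even width"
        rng = np.random.default_rng(seed)
        self.lam = lam
        self.monotonic = np.array(list(monotonic), dtype=int)
        shapes = [(width, n_in), (width, width), (1, width)]
        self.weights = [rng.normal(size=s) for s in shapes]
        self.biases = [rng.normal(size=s[0]) for s in shapes]

    def core(self, x):
        """Scalar output with Lipschitz constant <= lam w.r.t. the infinity-norm."""
        h = np.atleast_2d(x).astype(float)
        last = len(self.weights) - 1
        for i, (W, b) in enumerate(zip(self.weights, self.biases)):
            # The first layer carries the overall bound lam; the rest are 1-Lipschitz.
            W = normalize_inf(W, self.lam if i == 0 else 1.0)
            h = h @ W.T + b
            if i < last:
                h = group_sort(h)
        return h[:, 0]

    def __call__(self, x):
        """Add lam * x_i for each monotonic feature so the output cannot decrease in it."""
        x = np.atleast_2d(x).astype(float)
        return self.core(x) + self.lam * x[:, self.monotonic].sum(axis=1)
```

Because the constrained part can change by at most `lam` times the shift in any single feature, the added linear term outweighs any decrease, which is what makes the output non-decreasing in the selected features under these assumptions.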
References
NeurIPS physical sciences submission for the original architecture and its application to the LHCb trigger: https://ml4physicalsciences.github.io/2021/files/NeurIPS_ML4PS_2021_86.pdf
N.B.: This NeurIPS submission does not refer to any clustering or EMD-related applications.
Significance
This work proposes using a novel neural architecture for clustering and for computing jet observables in the Energy Mover’s Distance framework. It reproduces conventional clustering algorithms and generalizes them to new ones that are better suited to, for example, the Electron-Ion Collider.
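For reference, the Kantorovich-Rubinstein duality named in the description can be written in its standard textbook form for two probability distributions. The symbols $P$, $Q$, and $f$ are introduced here for illustration and do not come from the abstract.

```latex
W_1(P, Q) \;=\; \sup_{\mathrm{Lip}(f) \le 1}
  \Big( \mathbb{E}_{x \sim P}\big[f(x)\big] - \mathbb{E}_{y \sim Q}\big[f(y)\big] \Big)
```

Restricting the supremum to functions representable by a network whose Lipschitz constant is bounded by 1 gives a lower bound on $W_1$. This is presumably what lets a Lipschitz-bounded architecture act as an estimator of the distance.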