Anomaly Detection in the Copula Space

Tommaso Dorigo (Universita e INFN, Padova (IT))


An unsupervised learning tool that searches for localized, overdense regions in the copula space of a multidimensional feature space is discussed. The algorithm, named RanBox, exists in two versions: the first searches repeatedly in random subspaces (typically of 8 to 12 dimensions) of the feature space, while the second (RanBoxIter) iteratively adds dimensions to the searched space. Gradient descent is used to locate the multi-dimensional interval (the "box") that maximizes a suitable test statistic proportional to the significance of the observed data in the box. Applications to UCI datasets from fundamental physics and from fraud detection are discussed.
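
As a rough illustration of the ingredients named above, the following Python sketch maps features to the copula space with a rank transform, picks a random subspace, and compares the number of events in a box with the number expected if the data were uniform there. It is a minimal sketch under stated assumptions, not the RanBox implementation: the function names (`to_copula_space`, `box_ratio`), the toy Gaussian data, the fixed box boundaries, and the simple observed-over-expected ratio are all hypothetical choices made for the example, and the gradient-descent optimization of the box is omitted.

```python
import numpy as np

def to_copula_space(X):
    """Map each column to (0, 1) through its empirical rank.

    Assumes continuous features; ties are broken by sort order.
    """
    n = X.shape[0]
    ranks = np.argsort(np.argsort(X, axis=0), axis=0)  # ranks 0..n-1 per column
    return (ranks + 0.5) / n

def box_ratio(U, dims, lo, hi):
    """Count events inside the box [lo, hi] in the subspace `dims`.

    Returns the observed count, the count expected for uniform data
    (total events times box volume), and their ratio. The ratio is an
    illustrative stand-in for a test statistic, not the one used by RanBox.
    """
    sub = U[:, dims]
    inside = np.all((sub >= lo) & (sub <= hi), axis=1)
    n_obs = int(inside.sum())
    n_exp = U.shape[0] * float(np.prod(hi - lo))
    return n_obs, n_exp, n_obs / n_exp

# Toy data (assumption): 1000 events with 5 independent Gaussian features.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
U = to_copula_space(X)

# Random 2-dimensional subspace and a fixed box with side 0.4 in each dimension.
dims = rng.choice(X.shape[1], size=2, replace=False)
lo = np.array([0.2, 0.2])
hi = np.array([0.6, 0.6])

n_obs, n_exp, ratio = box_ratio(U, dims, lo, hi)
print(f"subspace={dims}, observed={n_obs}, expected={n_exp:.1f}, ratio={ratio:.2f}")
```

In a full search, the box boundaries `lo` and `hi` would be varied to maximize the chosen statistic, and the procedure would be repeated over many random subspaces.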

