ALICE, one of the four large experiments at the CERN LHC, is a detector dedicated to heavy-ion physics. At high interaction rates, the pile-up of multiple events requires advanced multidimensional data analysis methods.
Machine learning (ML) has become popular in multidimensional data analysis in recent years. Compared with the simple, low-dimensional analytical approaches used in the past, machine learning models are more difficult to interpret, and their uncertainties are more difficult to evaluate. On the other hand, oversimplifying the analysis and reducing its dimensionality leads to explanations that are more complex or simply wrong.
Our goal was to provide a tool for dealing with N-dimensional problems: to simplify data analysis in many (optimally all relevant) dimensions, to fit and visualize N-dimensional functions including their uncertainties and biases, to validate assumptions and approximations, and to define multidimensional "invariant" functions/alarms.
RootInteractive is a general-purpose tool for multidimensional statistical analysis. We use a declarative programming paradigm, in which the structure and elements of a program express the logic of a computation without describing its control flow. This approach makes the tool easy to use for domain experts, students and educators. RootInteractive provides functions for interactive, easily configurable visualization of unbinned and binned data, interactive N-dimensional histogramming/projection, and derived aggregate information extraction on the server (Python/C++) and on the client (JavaScript). We support client/server applications using Jupyter, and we can also create stand-alone client-side applications/dashboards.
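As an illustration of N-dimensional histogramming followed by projection, the minimal sketch below bins rows of three attributes and then sums over one axis. It uses only the Python standard library and is not the RootInteractive API; the names `histogram_nd` and `project`, the binning scheme, and the random test data are all assumptions made for this example.

```python
from collections import Counter
import random


def histogram_nd(rows, edges):
    """Bin rows (tuples of floats) into a sparse N-dimensional histogram.

    edges holds one (low, high, nbins) triple per dimension.
    Returns a Counter mapping bin-index tuples to entry counts.
    Rows with any value outside [low, high) are skipped.
    """
    counts = Counter()
    for row in rows:
        index = []
        for value, (low, high, nbins) in zip(row, edges):
            if not (low <= value < high):
                break  # out of range in this dimension: drop the row
            bin_number = int((value - low) / (high - low) * nbins)
            index.append(min(bin_number, nbins - 1))  # guard against rounding
        else:
            counts[tuple(index)] += 1
    return counts


def project(counts, keep_axes):
    """Sum an N-dimensional histogram over all axes not listed in keep_axes."""
    projected = Counter()
    for index, n in counts.items():
        projected[tuple(index[axis] for axis in keep_axes)] += n
    return projected


# Hypothetical data: 10,000 entries with three attributes in [0, 1).
rng = random.Random(0)
rows = [(rng.random(), rng.random(), rng.random()) for _ in range(10_000)]
edges = [(0.0, 1.0, 4)] * 3

hist = histogram_nd(rows, edges)  # 4 x 4 x 4 bins
proj = project(hist, (0, 2))      # 2D projection onto axes 0 and 2
```

Because the histogram is stored sparsely, only occupied bins take memory, which matters when the number of dimensions grows. Projections onto any subset of axes can then be derived from the same aggregate without revisiting the unbinned entries.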
Using a combination of lossy and lossless data compression, datasets with, for example, O(10^7) entries x O(10-50) attributes can be analyzed interactively in the stand-alone browser application at O(500 MB). By applying a suitable representative down-sampling, by a factor of O(10^-2) to O(10^-3), and subsequent re-weighting or pre-aggregation on the server or batch farm, the effective monthly/annual ALICE statistics can be analyzed interactively in many dimensions for calibration/reconstruction validation/QA/QC or statistical/physical analysis.
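The down-sampling and re-weighting step can be sketched as follows, under the assumption that each entry is kept with a known probability and then carries the inverse of that probability as its weight. The function `downsample`, the uniform keep probability of 10^-2, and the Gaussian test data are hypothetical choices for this example and are not taken from RootInteractive.

```python
import random


def downsample(rows, keep_probability, rng):
    """Keep each row with probability keep_probability(row).

    Each kept row is returned with weight 1 / probability, so that
    weighted sums over the sample estimate sums over the full data.
    """
    sample = []
    for row in rows:
        p = keep_probability(row)
        if rng.random() < p:
            sample.append((row, 1.0 / p))
    return sample


# Hypothetical data: 100,000 entries of a single attribute.
rng = random.Random(42)
rows = [rng.gauss(0.0, 1.0) for _ in range(100_000)]

# Uniform down-sampling by a factor of 10^-2 (an assumed choice).
sample = downsample(rows, lambda row: 0.01, rng)

# Weighted estimates computed from the sample only.
estimated_total = sum(weight for _, weight in sample)
sample_mean = sum(x * weight for x, weight in sample) / estimated_total

# Reference value computed from the full data, for comparison.
full_mean = sum(rows) / len(rows)
```

Passing the keep probability as a function allows non-uniform sampling, for example keeping sparsely populated regions with a higher probability, while the inverse-probability weights keep the weighted totals unbiased.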