Speaker
M.G. Pia
(INFN Genova)
Description
A Toolkit for Statistical Data Analysis has recently been released.
With this new software system, an ample set of sophisticated
algorithms for the comparison of data distributions (goodness-of-fit
tests) is available to the High Energy Physics community in an open
source product for the first time. The statistical algorithms
implemented belong to two sets, for the comparison of binned and
unbinned distributions respectively; they include the Chi-squared
Test, the Kolmogorov-Smirnov Test, the Kuiper Test, the Goodman Test,
the Anderson-Darling Test, the Fisz-Cramer-von Mises Test and the
Tiku Test.
Since the Toolkit offers users a wide choice of algorithms, it is
important to evaluate them comparatively and to estimate their power,
in order to guide users in selecting the most appropriate algorithm
for a given use case.
We present a study of the power of a variety of mathematical
algorithms implemented in the Toolkit. The study is performed by
evaluating the behaviour of the various tests in a set of
well-identified use cases relevant to data analysis applications. To
our knowledge, no such comparative study of the power of
goodness-of-fit algorithms has been performed previously.
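As an illustration only (this is not the Toolkit's code or API), the
notion of power used here can be sketched as a small Monte Carlo
experiment: draw many pairs of samples from two known distributions,
apply a test to each pair, and record the fraction of pairs for which
the test rejects the hypothesis that both samples share a parent
distribution. The sketch below does this for the two-sample
Kolmogorov-Smirnov test on unbinned data. The function names
(`ks_statistic`, `estimate_power`), the Gaussian test distributions,
the sample size and the 5% significance level are all assumptions
chosen for the example, and the rejection threshold uses the standard
large-sample approximation.

```python
import math
import random


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov distance: max |F_a(x) - F_b(x)|."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Step both empirical CDFs past every value equal to x (handles ties).
        while i < len(a) and a[i] == x:
            i += 1
        while j < len(b) and b[j] == x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def estimate_power(draw_a, draw_b, n, trials, alpha=0.05, seed=1):
    """Fraction of trials in which the KS test rejects at level alpha.

    draw_a and draw_b are hypothetical callables taking a random.Random
    instance and returning one value; both samples have size n.
    """
    rng = random.Random(seed)
    # Large-sample critical value: c(alpha) * sqrt((n + m) / (n * m)), with n == m.
    critical = math.sqrt(-0.5 * math.log(alpha / 2.0)) * math.sqrt(2.0 / n)
    rejections = 0
    for _ in range(trials):
        a = [draw_a(rng) for _ in range(n)]
        b = [draw_b(rng) for _ in range(n)]
        if ks_statistic(a, b) > critical:
            rejections += 1
    return rejections / trials


def unit_gauss(rng):
    return rng.gauss(0.0, 1.0)


def shifted_gauss(rng):
    return rng.gauss(1.0, 1.0)


# Same parent distribution: the rejection rate estimates the test's size.
print("size :", estimate_power(unit_gauss, unit_gauss, n=100, trials=200))
# Mean shifted by one standard deviation: the rejection rate estimates the power.
print("power:", estimate_power(unit_gauss, shifted_gauss, n=100, trials=200))
```

Repeating the same loop with a different test statistic, or with pairs
of distributions that differ in shape rather than location, is one way
such a comparison between tests could be organised.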
Primary authors
A. Pfeiffer
(CERN)
A. Ribon
(CERN)
B. Mascialino
(INFN Genova)
M.G. Pia
(INFN Genova)
S. Donadio
(INFN Genova)
S. Guatelli
(INFN Genova, Italy)