27 September 2004 to 1 October 2004
Interlaken, Switzerland
Europe/Zurich timezone

A statistical toolkit for data analysis

30 Sep 2004, 17:30
20m
Jungfrau (Interlaken, Switzerland)

Oral presentation, Track 2 - Event Processing

Speaker

M.G. Pia (INFN Genova)

Description

Statistical methods play a significant role throughout the life cycle of HEP experiments and are an essential component of physics analysis. We present a project in progress to develop an object-oriented software toolkit for statistical data analysis. In particular, the Statistical Comparison component of the toolkit provides algorithms for comparing data distributions in a variety of use cases typical of HEP experiments, such as regression testing (in various phases of the software life cycle), validation of simulation through comparison to experimental data, comparison of expected versus reconstructed distributions, and comparison of data from different sources, for example different sets of experimental data, or experimental with respect to theoretical distributions. The toolkit contains a variety of goodness-of-fit tests, from chi-squared and Kolmogorov-Smirnov to lesser-known but generally much more powerful tests such as Anderson-Darling, Cramer-von Mises, Kuiper, Tiku, etc. Thanks to its component-based design and its use of the standard AIDA interfaces, the tool can be used by other data analysis systems or integrated into experimental software frameworks. We present the architecture of the system, the statistical methods implemented, and some results of its application to the comparison of Geant4 simulations with experimental data.
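
To make the idea of comparing two data distributions concrete, the following is a minimal sketch of two of the tests named above, the two-sample Kolmogorov-Smirnov and Kuiper statistics, computed from empirical cumulative distribution functions. It is not the toolkit's code or API: the function names `ecdf` and `ks_and_kuiper` and the sample values are hypothetical, chosen for illustration only, and the sketch returns the raw test statistics without converting them to p-values.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return a function x -> fraction of sample values <= x."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n


def ks_and_kuiper(sample_a, sample_b):
    """Return (D, V): two-sample Kolmogorov-Smirnov and Kuiper statistics.

    Hypothetical helper for illustration; not part of any toolkit API.
    """
    f_a, f_b = ecdf(sample_a), ecdf(sample_b)
    # The difference of two step functions only changes at observed values,
    # so evaluating it at every data point is enough to find its extremes.
    diffs = [f_a(x) - f_b(x) for x in set(sample_a) | set(sample_b)]
    d_plus = max(max(diffs), 0.0)    # largest excess of F_a over F_b
    d_minus = max(-min(diffs), 0.0)  # largest excess of F_b over F_a
    # KS uses the larger one-sided distance; Kuiper uses their sum.
    return max(d_plus, d_minus), d_plus + d_minus


# Made-up values standing in for a simulated and a measured distribution.
simulated = [0.1, 0.4, 0.7, 0.9]
measured = [0.2, 0.5, 0.6, 1.2]
d, v = ks_and_kuiper(simulated, measured)
print(d, v)  # prints "0.25 0.5"
```

Because the Kuiper statistic adds the two one-sided distances rather than taking the larger of them, it reacts to discrepancies in either direction, which is one reason such tests can be more sensitive than Kolmogorov-Smirnov in some cases.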

Primary authors

A. Pfeiffer (CERN); A. Ribon (CERN); B. Mascialino (INFN Genova); M.G. Pia (INFN Genova); P. Viarengo (IST, Genova, Italy); S. Donadio (INFN Genova); S. Guatelli (INFN Genova, Italy)