ACAT 2008

Name: ACAT 2008
Start: 2008-11-03T08:00:00+01:00
End: 2008-11-07T18:00:00+01:00
Location: Ettore Majorana Foundation and Centre for Scientific Culture

3–7 Nov 2008

Support

acat2008@cern.ch

A Numeric Comparison of Feature Selection Algorithms for Supervised Learning

5 Nov 2008, 14:00

25m

Ettore Majorana Foundation and Centre for Scientific Culture

Via Guarnotta, 26 - 91016 ERICE (Sicily) - Italy Tel: +39-0923-869133 Fax: +39-0923-869226 E-mail: hq@ccsem.infn.it

Parallel Talk | 2. Data Analysis | Data Analysis - Algorithms and Tools

Speaker: Dr Giulio Palombo (University of Milan - Bicocca)

Datasets in modern High Energy Physics (HEP) experiments are often described by dozens or even hundreds of input variables (features). Reducing the full feature set to a subset that retains as much of the information in the data as possible is therefore an important task in the analysis of HEP data. We compare various feature selection algorithms for supervised learning on several datasets, including the imaging gamma-ray Cherenkov telescope (MAGIC) data available from the UCI repository. We use the classifiers and feature selection methods implemented in StatPatternRecognition (SPR), a free, open-source C++ statistical package developed in the HEP community (http://sourceforge.net/projects/statpatrec/). For each dataset, we select a powerful classifier and estimate its accuracy on the feature subsets produced by each feature selection algorithm. Where possible, we also estimate the CPU time needed for the feature subset selection. We compare the results of this analysis with those previously published for the same datasets using other statistical packages such as R and Weka. We show that the most accurate, yet slowest, method is a wrapper algorithm known as generalized sequential forward selection ("Add N Remove R"), as implemented in SPR.
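To illustrate the idea behind an "Add N Remove R" wrapper search, here is a minimal Python sketch. It is not SPR's implementation, and the abstract does not specify SPR's exact procedure. The function name `add_n_remove_r`, its parameters, and `toy_score` are all hypothetical. In a real wrapper method the score would be the chosen classifier's accuracy on held-out data for the given feature subset; here a small hand-made score stands in for it so the example runs on its own.

```python
from typing import Callable, FrozenSet, List

# Hypothetical stand-in for "validation accuracy of a classifier trained
# on this feature subset"; higher is better.
Score = Callable[[FrozenSet[int]], float]


def add_n_remove_r(n_features: int, score: Score, target_size: int,
                   n_add: int = 2, n_remove: int = 1) -> List[int]:
    """Greedy search: add n_add features one by one, then drop n_remove."""
    if not 0 <= n_remove < n_add:
        raise ValueError("need 0 <= n_remove < n_add so the subset grows")
    if not 1 <= target_size <= n_features:
        raise ValueError("target_size must be between 1 and n_features")

    everything = frozenset(range(n_features))
    selected: FrozenSet[int] = frozenset()
    while True:
        # Forward phase: each step adds the single feature that gives the
        # best score when joined with the current subset.
        for _ in range(n_add):
            candidates = sorted(everything - selected)
            best = max(candidates, key=lambda f: score(selected | {f}))
            selected = selected | {best}
            if len(selected) == target_size:
                return sorted(selected)
        # Backward phase: each step drops the feature whose removal
        # leaves the best-scoring subset.
        for _ in range(n_remove):
            worst = max(sorted(selected), key=lambda f: score(selected - {f}))
            selected = selected - {worst}


# Toy score (an assumption for illustration only): each feature has an
# individual gain, and features 0 and 1 are redundant with each other.
GAINS = [0.30, 0.28, 0.10, 0.05]


def toy_score(subset: FrozenSet[int]) -> float:
    total = sum(GAINS[f] for f in sorted(subset))
    if {0, 1} <= subset:
        total -= 0.25  # penalty for carrying two redundant features
    return total


print(add_n_remove_r(4, toy_score, target_size=3))  # -> [0, 2, 3]
```

Every call to `score` in a real wrapper search means training and evaluating a classifier, which is consistent with the abstract's remark that this kind of method is accurate but slow.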

Author: Dr Giulio Palombo (University of Milan - Bicocca)

Co-author: Dr Ilya Narsky (California Institute of Technology)

Slides

ACAT-08-palombo.pdf
