Speaker
Dr. Ricardo Vilalta (University of Houston)
Description
Advances in statistical learning have placed at our disposal a rich set of
classification algorithms (e.g., neural networks, decision trees, Bayesian
classifiers, support vector machines), but little or no guidance on how to
select the analysis technique most appropriate for the task at hand. In this paper we
present a new approach for the automatic selection of predictive models based on the
characteristics of the data under analysis. Depending on the particular data
distribution, our methodology may choose a learning algorithm able to
delineate complex decision boundaries over the variable space (at the cost of an
inevitably high variance), or a less complex algorithm that
delineates coarse decision boundaries (with a desirably low variance, at the cost of
higher bias). Our experimental analysis aims to identify the stop1 signal at an
energy of 1.96 TeV. The problem is inherently difficult because background events
exist whose signatures are identical to those of the signal. We report results using
several metrics (e.g., accuracy, efficiency) and compare the performance of our
methodology to that of a model produced by a domain expert who manually separates
signal events from background events.
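
The trade-off described above can be illustrated with a minimal sketch. This is not the methodology presented in the paper: the abstract does not specify how data characteristics drive the selection, so the sketch substitutes a plain hold-out comparison between one coarse model (nearest centroid) and one flexible model (1-nearest-neighbour). All function names, the synthetic two-dimensional events, and the reading of "efficiency" as the fraction of true signal events retained are assumptions made for illustration only.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of events classified correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def signal_efficiency(y_true, y_pred):
    """Fraction of true signal events (label 1) kept as signal (assumed definition)."""
    n_signal = sum(1 for t in y_true if t == 1)
    kept = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return kept / n_signal if n_signal else 0.0


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit_centroid(X, y):
    """Low-complexity model: one centroid per class, giving a coarse boundary."""
    centroids = {}
    for label in set(y):
        pts = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(c) / len(pts) for c in zip(*pts)]
    return lambda x: min(centroids, key=lambda lab: sq_dist(x, centroids[lab]))


def fit_1nn(X, y):
    """High-complexity model: 1-nearest-neighbour, boundary follows every training point."""
    data = list(zip(X, y))
    return lambda x: min(data, key=lambda d: sq_dist(x, d[0]))[1]


def select_model(X_train, y_train, X_val, y_val):
    """Pick the candidate with the best hold-out accuracy (stand-in selection rule)."""
    candidates = {"centroid": fit_centroid, "1nn": fit_1nn}
    scores = {}
    for name, fit in candidates.items():
        predict = fit(X_train, y_train)
        scores[name] = accuracy(y_val, [predict(x) for x in X_val])
    return max(scores, key=scores.get), scores


def make_events(n, rng):
    """Synthetic events: overlapping Gaussian blobs for background (0) and signal (1)."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        centre = 1.0 if label else 0.0
        X.append([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)])
        y.append(label)
    return X, y


rng = random.Random(0)  # fixed seed so the run is reproducible
X_train, y_train = make_events(200, rng)
X_val, y_val = make_events(200, rng)
best, scores = select_model(X_train, y_train, X_val, y_val)
print(best, scores)
```

Which candidate wins depends on the data: with heavily overlapping classes the flexible model tends to fit noise, while with a genuinely intricate class boundary the coarse model tends to underfit.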
Author
Dr. Ricardo Vilalta (University of Houston)
Co-authors
Dr. Paul Padley (Rice University)
Dr. Pedrame Bargassa (Rice University)