Mar 21 – 27, 2009
Europe/Prague timezone

The Effect of the Fragmentation Problem in Decision Tree Learning Applied to the Search for Single Top Quark Production

Mar 24, 2009, 8:00 AM


Prague Congress Centre 5. května 65, 140 00 Prague 4, Czech Republic
Board: Tuesday 045
poster Event Processing Poster session


Roberto Valerio (Cinvestav Unidad Guadalajara)


Decision tree learning is a suitable approach to classification because it partitions the input (variable) space into regions of class-uniform events while providing a structure amenable to interpretation, unlike other methods such as neural networks. An inherent limitation of decision tree learning, however, is the progressive loss of statistical support for the final classifier as clusters of single-class events are split at every partition, known as the fragmentation problem. We describe a software system that measures the degree of fragmentation caused by a decision tree learner on every event cluster. Clusters are found by decomposing the data with a technique known as spectral clustering. Each cluster is analyzed in terms of the number and type of partitions induced by the decision tree. Our domain of application is the search for single top quark production, a challenging problem due to large backgrounds (similar to W+jets and tt̄ events), low-energy signals, and a low number of jets. The output of the machine-learning software tool is a series of statistics describing the degree of classification error attributed to the fragmentation problem.
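As an illustration of what measuring fragmentation per cluster could look like, here is a minimal sketch. It is not the authors' tool or metric. It assumes each event already carries a cluster label (for example from scikit-learn's `SpectralClustering.fit_predict`) and the id of the tree leaf it lands in (for example from `DecisionTreeClassifier.apply`). The function name `fragmentation_by_cluster` and the statistics it reports are hypothetical choices made for this example.

```python
from collections import Counter, defaultdict
from math import log2


def fragmentation_by_cluster(cluster_ids, leaf_ids):
    """Summarize how a tree's leaves split each cluster of events.

    cluster_ids[i] is the cluster of event i and leaf_ids[i] is the tree
    leaf that event i lands in. Both are assumed to be computed elsewhere;
    the statistics below are illustrative, not the abstract's own metric.
    """
    if len(cluster_ids) != len(leaf_ids):
        raise ValueError("cluster_ids and leaf_ids must have equal length")

    # For every cluster, count how many of its events fall in each leaf.
    leaves_per_cluster = defaultdict(Counter)
    for cluster, leaf in zip(cluster_ids, leaf_ids):
        leaves_per_cluster[cluster][leaf] += 1

    stats = {}
    for cluster, leaf_counts in leaves_per_cluster.items():
        n_events = sum(leaf_counts.values())
        # Shannon entropy (bits) of the cluster's distribution over leaves:
        # 0 when the cluster stays in one leaf, larger when it is scattered.
        entropy = sum(
            -(count / n_events) * log2(count / n_events)
            for count in leaf_counts.values()
        )
        stats[cluster] = {
            "n_events": n_events,
            "n_leaves": len(leaf_counts),
            "largest_leaf_fraction": max(leaf_counts.values()) / n_events,
            "leaf_entropy_bits": entropy,
        }
    return stats
```

Under this sketch, a cluster whose events all reach the same leaf is unfragmented, while a cluster spread thinly over many leaves leaves each leaf with little statistical support. Relating these counts to classification error, as the abstract describes, would need the true class labels as well.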
Presentation type (oral | poster): Oral

Primary authors

Ricardo Vilalta (Department of Computer Science, University of Houston)
Roberto Valerio (Cinvestav Unidad Guadalajara)


Francisco Ocegueda-Hernandez (Department of Computer Science, University of Houston)
Gordon Watts (Department of Physics, University of Washington)
