Roberto Valerio (Cinvestav Unidad Guadalajara)
Decision tree learning constitutes a suitable approach to classification due to its ability to partition the input (variable) space into regions of class-uniform events, while providing a structure amenable to interpretation (as opposed to other methods such as neural networks). But an inherent limitation of decision tree learning is the progressive lessening of the statistical support of the final classifier as clusters of single-class events are split on every partition, a problem known as the fragmentation problem. We describe a software system that measures the degree of fragmentation caused by a decision tree learner on every event cluster. Clusters are found through a decomposition of the data using a technique known as Spectral Clustering. Each cluster is analyzed in terms of the number and type of partitions induced by the decision tree. Our domain of application lies on the search for single top quark production, a challenging problem due to large backgrounds (similar to W+jets and tt¯ events), low energetic signals, and low number of jets. The output of the machine-learning software tool consists of a series of statistics describing the degree of classification error attributed to the fragmentation problem.
|Presentation type (oral | poster)||Oral|