Category Trees

6 November 2018

Abstract

This paper presents an improved version of an earlier batch classifier and corrects a mistake in the earlier paper. Each category is represented by a classifier, where each classifier classifies its own subset of data rows, using batch input values to represent the centroid. Two important changes have been made. The first change is to use the category centroid as the desired category output. When a classifier represents more than one category, it creates a new layer and splits, so that each category is represented separately in the new layer. The second change, therefore, is to allow the classifier to branch to new levels when there is a split in the data, or when some data rows are incorrectly classified. Each layer can therefore branch like a tree, not to distinguish features, but to distinguish categories. The paper then suggests a further innovation: adding fixed value ranges, through bands, for each column or feature of the input dataset. When considering features, it is shown that some of the data can be classified directly through the fixed value ranges, while the rest can be classified using the classifier technique. Tests show that the method can classify a diverse set of benchmark datasets with results better than the state of the art. The paper also discusses a biological analogy with neurons and neuron links.
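The two-stage idea of classifying some rows directly from fixed value ranges and the rest by centroid can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not give the update rule or the branching procedure, so the sketch assumes a plain per-category mean as the centroid, a per-column (min, max) range as the band, and nearest-centroid matching as the fallback. The function names `centroids`, `bands` and `classify` are hypothetical, and the tree branching is omitted.

```python
from collections import defaultdict


def _group(rows, labels):
    """Collect the data rows belonging to each category."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[label].append(row)
    return groups


def centroids(rows, labels):
    """Assumed centroid: the mean feature vector of each category."""
    return {label: [sum(col) / len(col) for col in zip(*members)]
            for label, members in _group(rows, labels).items()}


def bands(rows, labels):
    """Assumed band: the (min, max) range of each column, per category."""
    return {label: [(min(col), max(col)) for col in zip(*members)]
            for label, members in _group(rows, labels).items()}


def classify(row, cents, rngs):
    # Direct route: a column value that lies inside exactly one category's band.
    for j, value in enumerate(row):
        hits = [c for c, r in rngs.items() if r[j][0] <= value <= r[j][1]]
        if len(hits) == 1:
            return hits[0]
    # Fallback: nearest centroid by squared Euclidean distance.
    return min(cents,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(row, cents[c])))
```

In this sketch a row is settled by the first column whose value falls in a single category's band, and only ambiguous or out-of-range rows reach the centroid comparison. A fuller version would add the branching step, creating a new layer wherever a node still covers more than one category.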
