An incremental linear-time learning algorithm for the Optimum-Path Forest classifier

We present a classification method with linear-time incremental capabilities based on the Optimum-Path Forest (OPF). The OPF treats instances as nodes of a fully connected training graph, where each edge is weighted by the distance between the feature vectors of the two nodes it connects. A minimum spanning tree is built on this graph, and every edge connecting instances of different classes is removed; the nodes at the ends of the removed edges become prototypes, that is, roots of trees called optimum-path trees. In this paper we describe a new algorithm that handles new instances incrementally by inserting them into one of the existing trees, substituting the prototype of a tree, or splitting a tree. This incremental method was tested for accuracy and running time against full retraining using the original OPF and the Differential Image Foresting Transform. As a result, our algorithm includes new instances in linear time while keeping accuracy similar to that of the original model, which runs in quadratic time.
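The prototype-selection step of the original OPF described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `find_prototypes`, the use of Euclidean distance, and the choice of Prim's algorithm for the minimum spanning tree are all assumptions made for the example. It covers only the prototype search of the original model and does not attempt the incremental insertion, substitution, or splitting cases.

```python
import math


def find_prototypes(features, labels):
    """Hypothetical helper: MST edges and prototype indices of a training set.

    Assumes Euclidean distance as the edge weight (an assumption; the
    abstract only says "distances") and a fully connected graph.
    """
    n = len(features)
    if n == 0:
        return [], set()

    in_tree = [False] * n
    best_cost = [math.inf] * n   # cheapest known edge linking each node to the tree
    best_parent = [-1] * n       # tree-side endpoint of that edge
    best_cost[0] = 0.0
    mst_edges = []

    # Prim's algorithm on the complete graph: O(n^2) distance evaluations.
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]),
                key=lambda i: best_cost[i])
        in_tree[u] = True
        if best_parent[u] != -1:
            mst_edges.append((best_parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(features[u], features[v])
                if d < best_cost[v]:
                    best_cost[v] = d
                    best_parent[v] = u

    # Both endpoints of an MST edge that joins different classes are prototypes.
    prototypes = set()
    for u, v in mst_edges:
        if labels[u] != labels[v]:
            prototypes.update((u, v))
    return mst_edges, prototypes


# Toy data: two classes laid out along a line.
features = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
labels = [0, 0, 0, 1, 1]
mst_edges, prototypes = find_prototypes(features, labels)
print(sorted(prototypes))
```

On this toy data the only MST edge that crosses classes joins the instances at indices 2 and 3, so those two become the prototypes. The pairwise distance scan is what makes building the model from scratch quadratic in the number of training instances, which is the cost the incremental method aims to avoid when a new instance arrives.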