Aggressive Sampling for Multi-class to Binary Reduction with
Applications to Text Classification
Neural Information Processing Systems (NeurIPS), 2017

Abstract
We address the problem of multiclass classification in the case where the number of classes is very large. We propose a multiclass to binary reduction strategy, in which we transform the original problem into a binary classification one over pairs of examples. We derive generalization bounds for the error of the classifier of pairs using local Rademacher complexity, and propose a double sampling strategy (in terms of examples and classes) that speeds up the training phase while maintaining a very low memory usage. Experiments are carried out for text classification on DMOZ and Wikipedia collections with up to 20,000 classes in order to show the efficiency of the proposed method.
