
Disagreement-based Active Learning in Online Settings

Abstract

We study online active learning for classifying streaming instances within the framework of statistical learning theory. At each time step, the learner either queries the label of the current instance or predicts the label based on previously observed examples. The objective is to minimize the number of queries while constraining the number of prediction errors over a horizon of length $T$. We develop a disagreement-based online learning algorithm for a general hypothesis space under the Tsybakov noise condition. We show that the proposed algorithm has a label complexity of $O(dT^{\frac{2-2\alpha}{2-\alpha}}\log^2 T)$ under a constraint of bounded regret in terms of classification errors, where $d$ is the VC dimension of the hypothesis space and $\alpha$ is the Tsybakov noise parameter. We further establish a matching (up to a poly-logarithmic factor) lower bound, demonstrating the order optimality of the proposed algorithm. We address the tradeoff between label complexity and regret and show that the algorithm can be modified to operate at a different point on the tradeoff curve.
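To make the query-or-predict rule concrete, the following is a minimal sketch of a disagreement-based online learner in a much simpler setting than the one the abstract covers: one-dimensional threshold classifiers with noiseless labels. It is not the paper's algorithm, which handles a general hypothesis space and Tsybakov noise. The function name `run_cal_thresholds`, the interval representation of the version space, and the assumption that the true threshold lies in (0, 1] are all illustrative choices.

```python
import random


def run_cal_thresholds(stream, true_threshold):
    """Disagreement-based online learning for h_t(x) = 1 if x >= t else 0.

    Assumes noiseless labels and a true threshold in (0, 1] (illustrative
    assumptions, not taken from the paper). Returns (queries, errors).
    """
    # Version space: all thresholds t in the half-open interval (lo, hi].
    lo, hi = 0.0, 1.0
    queries = 0
    errors = 0
    for x in stream:
        y = int(x >= true_threshold)  # label the oracle would return
        if lo < x < hi:
            # Surviving hypotheses disagree on x, so query its label.
            queries += 1
            if y == 1:
                hi = x  # label 1 implies t <= x
            else:
                lo = x  # label 0 implies t > x
        else:
            # All surviving hypotheses agree, so predict without querying.
            prediction = int(x >= hi)
            errors += int(prediction != y)
    return queries, errors


if __name__ == "__main__":
    rng = random.Random(0)
    stream = [rng.random() for _ in range(5000)]
    queries, errors = run_cal_thresholds(stream, true_threshold=0.37)
    print(f"queries={queries}, errors={errors}, horizon={len(stream)}")
```

In this noiseless toy case the learner never errs on the instances it predicts, because the true threshold always stays in the version space, and the disagreement interval shrinks as labels arrive. With label noise, a single queried label can no longer eliminate hypotheses outright, which is where the tradeoff between label complexity and regret described in the abstract comes in.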
