
Generalized Chernoff Sampling for Active Testing, Active Regression and Structured Bandit Algorithms

Abstract

Active learning and structured stochastic bandit problems are intimately related to the classical problem of sequential experimental design. This paper studies active learning and best-arm identification in structured bandit settings from the viewpoint of active sequential hypothesis testing, a framework initiated by Chernoff (1959). We obtain a novel sample complexity bound for Chernoff's original active testing procedure by uncovering non-asymptotic terms that reduce in significance as the allowed error probability $\delta \rightarrow 0$. Chernoff sampling was initially proposed for testing among finitely many hypotheses; we obtain its analogue for the case when the hypotheses belong to a compact space. This allows us to apply it directly to active learning and structured bandit problems, where the unknown parameter specifying the arm means is often assumed to be an element of Euclidean space. Empirically, we demonstrate the potential of our proposed approach for active learning of neural network models and in linear and non-linear bandit settings, where we observe that our general-purpose approach compares favorably to state-of-the-art methods.
