Bandit Theory meets Compressed Sensing for high dimensional Stochastic
Linear Bandit
International Conference on Artificial Intelligence and Statistics (AISTATS), 2012
Abstract
We consider a linear stochastic bandit problem where the dimension of the unknown parameter is larger than the sampling budget . In such cases, it is in general impossible to derive sub-linear regret bounds since usual linear bandit algorithms have a regret in . In this paper we assume that is sparse, i.e. has at most non-zero components, and that the space of arms is the unit ball for the norm. We combine ideas from Compressed Sensing and Bandit Theory and derive algorithms with regret bounds in .
View on arXivComments on this paper
