Bandit Theory meets Compressed Sensing for high dimensional Stochastic Linear Bandit

Abstract
We consider a linear stochastic bandit problem where the dimension $K$ of the unknown parameter $\theta$ is larger than the sampling budget $n$. In such cases, it is in general impossible to derive sub-linear regret bounds, since usual linear bandit algorithms have a regret in $O(K\sqrt{n})$. In this paper we assume that $\theta$ is $S$-sparse, i.e. has at most $S$ non-zero components, and that the space of arms is the unit ball for the $\|\cdot\|_2$ norm. We combine ideas from Compressed Sensing and Bandit Theory and derive algorithms with regret bounds in $O(S\sqrt{n})$.
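To make the setting concrete, the following is a minimal sketch of the problem itself, not of the paper's algorithm. It assumes Gaussian reward noise and a parameter normalised to unit Euclidean norm, neither of which is stated in the abstract, and all function names are hypothetical. It draws a sparse parameter, plays uniformly random arms from the unit sphere, and accumulates the pseudo-regret against the best arm, which on the unit ball is the parameter divided by its norm. With a dimension larger than the budget, this naive baseline incurs regret that grows roughly linearly in the budget, which is the difficulty the abstract describes.

```python
import math
import random


def make_sparse_parameter(dim, sparsity, rng):
    """Draw a vector with at most `sparsity` non-zero entries.

    Normalising to unit l2 norm is an assumption made for illustration.
    """
    theta = [0.0] * dim
    for i in rng.sample(range(dim), sparsity):
        theta[i] = rng.gauss(0.0, 1.0)
    norm = math.sqrt(sum(v * v for v in theta))
    return [v / norm for v in theta]


def random_unit_arm(dim, rng):
    """Draw an arm uniformly from the unit sphere (boundary of the l2 ball)."""
    g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(v * v for v in g))
    return [v / norm for v in g]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def pull(arm, theta, noise_std, rng):
    """Noisy linear reward; Gaussian noise is an assumption."""
    return dot(arm, theta) + rng.gauss(0.0, noise_std)


def uniform_exploration_regret(dim, sparsity, budget, noise_std=0.1, seed=0):
    """Pseudo-regret of a baseline that ignores rewards and plays random arms."""
    rng = random.Random(seed)
    theta = make_sparse_parameter(dim, sparsity, rng)
    # On the unit ball the best arm is theta / ||theta||, with mean reward ||theta||.
    best_mean = math.sqrt(dot(theta, theta))
    regret = 0.0
    for _ in range(budget):
        arm = random_unit_arm(dim, rng)
        pull(arm, theta, noise_std, rng)  # observed, but unused by this baseline
        regret += best_mean - dot(arm, theta)
    return regret


if __name__ == "__main__":
    # Dimension (200) deliberately exceeds the sampling budget (50).
    print(uniform_exploration_regret(dim=200, sparsity=5, budget=50))
```

A learning algorithm would replace the random arm choice with one that uses the observed rewards, for example to first locate the support of the sparse parameter and then exploit it.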