Linear Bandits on Uniformly Convex Sets

Abstract
Linear bandit algorithms yield $\tilde{\mathcal{O}}(n\sqrt{T})$ pseudo-regret bounds on compact convex action sets $\mathcal{K} \subset \mathbb{R}^n$, and two types of structural assumptions lead to better pseudo-regret bounds. When $\mathcal{K}$ is the simplex or an $\ell_p$ ball with $p \in (1,2]$, there exist bandit algorithms with $\tilde{\mathcal{O}}(\sqrt{nT})$ pseudo-regret bounds. Here, we derive bandit algorithms for some strongly convex sets beyond $\ell_p$ balls that enjoy pseudo-regret bounds of $\tilde{\mathcal{O}}(\sqrt{nT})$, which answers an open question from [BCB12, \S 5.5.]. Interestingly, when the action set is uniformly convex but not necessarily strongly convex, we obtain pseudo-regret bounds with a dimension dependency smaller than $\mathcal{O}(\sqrt{n})$. However, this comes at the expense of asymptotic rates in $T$ varying between $\tilde{\mathcal{O}}(\sqrt{T})$ and $\tilde{\mathcal{O}}(T)$.