Regret Analysis of the Anytime Optimally Confident UCB Algorithm

29 March 2016

Papers citing "Regret Analysis of the Anytime Optimally Confident UCB Algorithm"

Exploration in Deep Reinforcement Learning: From Single-Agent to Multiagent Domain
Jianye Hao, Tianpei Yang, Hongyao Tang, Chenjia Bai, Jinyi Liu, Zhaopeng Meng, Peng Liu, Zhen Wang
OffRL | 28 | 91 | 0 | 14 Sep 2021

KL-UCB-switch: optimal regret bounds for stochastic bandits from both a distribution-dependent and a distribution-free viewpoints
Aurélien Garivier, Hédi Hadiji, Pierre Menard, Gilles Stoltz
13 | 32 | 0 | 14 May 2018

Learning the distribution with largest mean: two bandit frameworks
E. Kaufmann, Aurélien Garivier
17 | 19 | 0 | 31 Jan 2017

On Bayesian index policies for sequential resource allocation
E. Kaufmann
23 | 84 | 0 | 06 Jan 2016