Adaptive KL-UCB based Bandit Algorithms for Markovian and i.i.d. Settings
IEEE Transactions on Automatic Control (TAC), 2020
A Hoeffding Inequality for Finite State Markov Chains and its Applications to Markovian Bandits
International Symposium on Information Theory (ISIT), 2020