From Dirichlet to Rubin: Optimistic Exploration in RL without Bonuses

16 May 2022

Denis Belomestny

Pierre Menard

Papers citing "From Dirichlet to Rubin: Optimistic Exploration in RL without Bonuses"

Title | Authors | Tags | Counts | Date
Smart Exploration in Reinforcement Learning using Bounded Uncertainty Models | J.S. van Hulst, W.P.M.H. Heemels, D.J. Antunes | OffRL | 16, 0, 0 | 08 Apr 2025
Provably and Practically Efficient Adversarial Imitation Learning with General Function Approximation | Tian Xu, Zhilong Zhang, Ruishuo Chen, Yihao Sun, Yang Yu | | 30, 1, 0 | 01 Nov 2024
Q-Star Meets Scalable Posterior Sampling: Bridging Theory and Practice via HyperAgent | Yingru Li, Jiawei Xu, Lei Han, Zhi-Quan Luo | BDL, OffRL | 26, 6, 0 | 05 Feb 2024
Model-Based Epistemic Variance of Values for Risk-Aware Policy Optimization | Carlos E. Luis, A. Bottero, Julia Vinogradska, Felix Berkenkamp, Jan Peters | OffRL | 31, 3, 0 | 07 Dec 2023
Provable and Practical: Efficient Exploration in Reinforcement Learning via Langevin Monte Carlo | Haque Ishfaq, Qingfeng Lan, Pan Xu, A. R. Mahmood, Doina Precup, Anima Anandkumar, Kamyar Azizzadenesheli | BDL, OffRL | 26, 20, 0 | 29 May 2023
Posterior Sampling for Deep Reinforcement Learning | Remo Sasso, Michelangelo Conserva, Paulo E. Rauber | OffRL, BDL | 35, 6, 0 | 30 Apr 2023
Sharp Deviations Bounds for Dirichlet Weighted Sums with Application to analysis of Bayesian algorithms | Denis Belomestny, Pierre Menard, A. Naumov, D. Tiapkin, Michal Valko | | 22, 2, 0 | 06 Apr 2023
A General Recipe for the Analysis of Randomized Multi-Armed Bandit Algorithms | Dorian Baudry, Kazuya Suzuki, Junya Honda | | 29, 4, 0 | 10 Mar 2023
Optimistic Posterior Sampling for Reinforcement Learning with Few Samples and Tight Guarantees | D. Tiapkin, Denis Belomestny, Daniele Calandriello, Eric Moulines, Rémi Munos, A. Naumov, Mark Rowland, Michal Valko, Pierre Menard | | 36, 8, 0 | 28 Sep 2022
UCB Momentum Q-learning: Correcting the bias without forgetting | Pierre Menard, O. D. Domingues, Xuedong Shang, Michal Valko | | 79, 40, 0 | 01 Mar 2021