1607.00215
Why is Posterior Sampling Better than Optimism for Reinforcement Learning?
1 July 2016
Ian Osband
Benjamin Van Roy
BDL
Papers citing "Why is Posterior Sampling Better than Optimism for Reinforcement Learning?"
50 / 61 papers shown
Title
Toward Efficient Exploration by Large Language Model Agents
Dilip Arumugam
Thomas L. Griffiths
LLMAG
92
1
0
29 Apr 2025
Contextual Similarity Distillation: Ensemble Uncertainties with a Single Model
Moritz A. Zanger
Pascal R. van der Vaart
Wendelin Böhmer
M. Spaan
UQCV
BDL
149
0
0
14 Mar 2025
Do ImageNet-trained models learn shortcuts? The impact of frequency shortcuts on generalization
Shunxin Wang
Raymond N. J. Veldhuis
N. Strisciuglio
VLM
71
0
0
05 Mar 2025
EVaDE: Event-Based Variational Thompson Sampling for Model-Based Reinforcement Learning
Siddharth Aravindan
Dixant Mittal
Wee Sun Lee
BDL
79
0
0
17 Jan 2025
Efficient Model-Based Reinforcement Learning Through Optimistic Thompson Sampling
Jasmine Bayrooti
Carl Henrik Ek
Amanda Prorok
42
0
0
07 Oct 2024
NeoRL: Efficient Exploration for Nonepisodic RL
Bhavya Sukhija
Lenart Treven
Florian Dörfler
Stelian Coros
Andreas Krause
OffRL
30
0
0
03 Jun 2024
Q-Star Meets Scalable Posterior Sampling: Bridging Theory and Practice via HyperAgent
Yingru Li
Jiawei Xu
Lei Han
Zhi-Quan Luo
BDL
OffRL
26
6
0
05 Feb 2024
Posterior Sampling-based Online Learning for Episodic POMDPs
Dengwang Tang
Dongze Ye
Rahul Jain
A. Nayyar
Pierluigi Nuzzo
OffRL
51
0
0
16 Oct 2023
Provably Efficient Exploration in Constrained Reinforcement Learning: Posterior Sampling Is All You Need
Danil Provodin
Pratik Gajane
Mykola Pechenizkiy
M. Kaptein
33
0
0
27 Sep 2023
Provable and Practical: Efficient Exploration in Reinforcement Learning via Langevin Monte Carlo
Haque Ishfaq
Qingfeng Lan
Pan Xu
A. R. Mahmood
Doina Precup
Anima Anandkumar
Kamyar Azizzadenesheli
BDL
OffRL
28
20
0
29 May 2023
Bayesian Reinforcement Learning with Limited Cognitive Load
Dilip Arumugam
Mark K. Ho
Noah D. Goodman
Benjamin Van Roy
OffRL
34
8
0
05 May 2023
Sharp Deviations Bounds for Dirichlet Weighted Sums with Application to analysis of Bayesian algorithms
Denis Belomestny
Pierre Menard
A. Naumov
D. Tiapkin
Michal Valko
22
2
0
06 Apr 2023
Model-Based Uncertainty in Value Functions
Carlos E. Luis
A. Bottero
Julia Vinogradska
Felix Berkenkamp
Jan Peters
36
13
0
24 Feb 2023
Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both Worlds in Stochastic and Deterministic Environments
Runlong Zhou
Zihan Zhang
S. Du
44
10
0
31 Jan 2023
STEERING: Stein Information Directed Exploration for Model-Based Reinforcement Learning
Souradip Chakraborty
Amrit Singh Bedi
Alec Koppel
Mengdi Wang
Furong Huang
Dinesh Manocha
24
7
0
28 Jan 2023
CIM: Constrained Intrinsic Motivation for Sparse-Reward Continuous Control
Xiang Zheng
Xingjun Ma
Cong Wang
28
1
0
28 Nov 2022
On Rate-Distortion Theory in Capacity-Limited Cognition & Reinforcement Learning
Dilip Arumugam
Mark K. Ho
Noah D. Goodman
Benjamin Van Roy
28
4
0
30 Oct 2022
Planning to the Information Horizon of BAMDPs via Epistemic State Abstraction
Dilip Arumugam
Satinder Singh
24
3
0
30 Oct 2022
Opportunistic Episodic Reinforcement Learning
Xiaoxiao Wang
Nader Bouacida
Xueying Guo
Xin Liu
14
0
0
24 Oct 2022
On the Power of Pre-training for Generalization in RL: Provable Benefits and Hardness
Haotian Ye
Xiaoyu Chen
Liwei Wang
S. Du
OffRL
24
6
0
19 Oct 2022
Square-root regret bounds for continuous-time episodic Markov decision processes
Xuefeng Gao
X. Zhou
43
6
0
03 Oct 2022
Ensemble Reinforcement Learning in Continuous Spaces -- A Hierarchical Multi-Step Approach for Policy Training
Gang Chen
Victoria Huang
OffRL
31
0
0
29 Sep 2022
POEM: Out-of-Distribution Detection with Posterior Sampling
Yifei Ming
Ying Fan
Yixuan Li
OODD
29
113
0
28 Jun 2022
Between Rate-Distortion Theory & Value Equivalence in Model-Based Reinforcement Learning
Dilip Arumugam
Benjamin Van Roy
OffRL
35
1
0
04 Jun 2022
Exploration in Deep Reinforcement Learning: A Survey
Pawel Ladosz
Lilian Weng
Minwoo Kim
H. Oh
OffRL
23
324
0
02 May 2022
Horizon-Free Reinforcement Learning in Polynomial Time: the Power of Stationary Policies
Zihan Zhang
Xiangyang Ji
S. Du
28
21
0
24 Mar 2022
Online Learning of Energy Consumption for Navigation of Electric Vehicles
Niklas Åkerblom
Yuxin Chen
M. Chehreghani
20
12
0
03 Nov 2021
Settling the Horizon-Dependence of Sample Complexity in Reinforcement Learning
Yuanzhi Li
Ruosong Wang
Lin F. Yang
17
20
0
01 Nov 2021
Reinforcement Learning in Reward-Mixing MDPs
Jeongyeol Kwon
Yonathan Efroni
C. Caramanis
Shie Mannor
30
15
0
07 Oct 2021
Feel-Good Thompson Sampling for Contextual Bandits and Reinforcement Learning
Tong Zhang
22
63
0
02 Oct 2021
Greedy UnMixing for Q-Learning in Multi-Agent Reinforcement Learning
Chapman Siu
Jason M. Traish
R. Xu
25
2
0
19 Sep 2021
A Survey of Exploration Methods in Reinforcement Learning
Susan Amin
Maziar Gomrokchi
Harsh Satija
H. V. Hoof
Doina Precup
OffRL
21
80
0
01 Sep 2021
Bayesian decision-making under misspecified priors with applications to meta-learning
Max Simchowitz
Christopher Tosh
A. Krishnamurthy
Daniel J. Hsu
Thodoris Lykouris
Miroslav Dudík
Robert Schapire
17
49
0
03 Jul 2021
On the Sample Complexity and Metastability of Heavy-tailed Policy Search in Continuous Control
Amrit Singh Bedi
Anjaly Parayil
Junyu Zhang
Mengdi Wang
Alec Koppel
30
15
0
15 Jun 2021
Reinforcement Learning, Bit by Bit
Xiuyuan Lu
Benjamin Van Roy
Vikranth Dwaracherla
M. Ibrahimi
Ian Osband
Zheng Wen
30
70
0
06 Mar 2021
Adaptive Transmission Scheduling in Wireless Networks for Asynchronous Federated Learning
Hyun-Suk Lee
Jang-Won Lee
81
53
0
02 Mar 2021
COMBO: Conservative Offline Model-Based Policy Optimization
Tianhe Yu
Aviral Kumar
Rafael Rafailov
Aravind Rajeswaran
Sergey Levine
Chelsea Finn
OffRL
219
413
0
16 Feb 2021
Model-based Reinforcement Learning for Continuous Control with Posterior Sampling
Ying Fan
Yifei Ming
25
17
0
20 Nov 2020
Restless-UCB, an Efficient and Low-complexity Algorithm for Online Restless Bandits
Siwei Wang
Longbo Huang
John C. S. Lui
OffRL
16
38
0
05 Nov 2020
Improved Worst-Case Regret Bounds for Randomized Least-Squares Value Iteration
Priyank Agrawal
Jinglin Chen
Nan Jiang
27
18
0
23 Oct 2020
Randomized Value Functions via Posterior State-Abstraction Sampling
Dilip Arumugam
Benjamin Van Roy
OffRL
28
7
0
05 Oct 2020
An Online Learning Framework for Energy-Efficient Navigation of Electric Vehicles
Niklas Åkerblom
Yuxin Chen
M. Chehreghani
16
15
0
03 Mar 2020
Concentration Inequalities for Multinoulli Random Variables
Jian Qian
Ronan Fruit
Matteo Pirotta
A. Lazaric
11
21
0
30 Jan 2020
Making Sense of Reinforcement Learning and Probabilistic Inference
Brendan O'Donoghue
Ian Osband
Catalin Ionescu
OffRL
22
47
0
03 Jan 2020
Convergence Rates of Posterior Distributions in Markov Decision Process
Zhen Li
E. Laber
13
0
0
22 Jul 2019
Regret Minimization for Reinforcement Learning by Evaluating the Optimal Bias Function
Zihan Zhang
Xiangyang Ji
11
71
0
12 Jun 2019
Tight Regret Bounds for Model-Based Reinforcement Learning with Greedy Policies
Yonathan Efroni
Nadav Merlis
Mohammad Ghavamzadeh
Shie Mannor
OffRL
22
67
0
27 May 2019
A Bayesian Approach to Robust Reinforcement Learning
E. Derman
D. Mankowitz
Timothy A. Mann
Shie Mannor
18
57
0
20 May 2019
Meta reinforcement learning as task inference
Jan Humplik
Alexandre Galashov
Leonard Hasenclever
Pedro A. Ortega
Yee Whye Teh
N. Heess
OffRL
29
127
0
15 May 2019
A Short Survey on Probabilistic Reinforcement Learning
R. Russel
13
2
0
21 Jan 2019