1807.09647
Cited By
v1
v2
v3
v4 (latest)
Variational Bayesian Reinforcement Learning with Regret Bounds
25 July 2018
Brendan O'Donoghue
Papers citing
"Variational Bayesian Reinforcement Learning with Regret Bounds"
20 / 20 papers shown
Title
IL-SOAR : Imitation Learning with Soft Optimistic Actor cRitic
Stefano Viel
Luca Viano
Volkan Cevher
203
1
0
27 Feb 2025
Incentivize without Bonus: Provably Efficient Model-based Online Multi-agent RL for Markov Games
Tong Yang
Bo Dai
Lin Xiao
Yuejie Chi
OffRL
138
2
0
13 Feb 2025
Risk-sensitive control as inference with Rényi divergence
Kaito Ito
Kenji Kashima
71
1
0
04 Nov 2024
Mimicking Human Intuition: Cognitive Belief-Driven Reinforcement Learning
Xingrui Gu
Guanren Qiao
Chuyi Jiang
OffRL
96
0
0
02 Oct 2024
Model-Based Uncertainty in Value Functions
Carlos E. Luis
A. Bottero
Julia Vinogradska
Felix Berkenkamp
Jan Peters
115
15
0
24 Feb 2023
On the Power of Pre-training for Generalization in RL: Provable Benefits and Hardness
Haotian Ye
Xiaoyu Chen
Liwei Wang
S. Du
OffRL
86
7
0
19 Oct 2022
Age of Semantics in Cooperative Communications: To Expedite Simulation Towards Real via Offline Reinforcement Learning
Xianfu Chen
Zhifeng Zhao
S. Mao
Celimuge Wu
Honggang Zhang
M. Bennis
OffRL
83
3
0
19 Sep 2022
q-Learning in Continuous Time
Yanwei Jia
X. Zhou
OffRL
158
78
0
02 Jul 2022
Dynamic mean field programming
G. Stamatescu
53
0
0
10 Jun 2022
Posterior Coreset Construction with Kernelized Stein Discrepancy for Model-Based Reinforcement Learning
Souradip Chakraborty
Amrit Singh Bedi
Alec Koppel
Brian M. Sadler
Furong Huang
Pratap Tokekar
Tianyi Zhou
79
9
0
02 Jun 2022
SEREN: Knowing When to Explore and When to Exploit
Changmin Yu
D. Mguni
Dong Li
Aivar Sootla
Jun Wang
Neil Burgess
48
1
0
30 May 2022
Variational Bayesian Optimistic Sampling
Brendan O'Donoghue
Tor Lattimore
52
6
0
29 Oct 2021
DROMO: Distributionally Robust Offline Model-based Policy Optimization
Ruizhen Liu
Dazhi Zhong
Zhi-Cong Chen
OffRL
60
3
0
15 Sep 2021
MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning
Kevin Wenliang Li
Abhishek Gupta
Ashwin Reddy
Vitchyr H. Pong
Aurick Zhou
Justin Yu
Sergey Levine
UQCV
71
31
0
15 Jul 2021
Reward is enough for convex MDPs
Tom Zahavy
Brendan O'Donoghue
Guillaume Desjardins
Satinder Singh
134
76
0
01 Jun 2021
Reinforcement Learning, Bit by Bit
Xiuyuan Lu
Benjamin Van Roy
Vikranth Dwaracherla
M. Ibrahimi
Ian Osband
Zheng Wen
126
70
0
06 Mar 2021
Conservative Q-Learning for Offline Reinforcement Learning
Aviral Kumar
Aurick Zhou
George Tucker
Sergey Levine
OffRL
OnRL
150
1,839
0
08 Jun 2020
Making Sense of Reinforcement Learning and Probabilistic Inference
Brendan O'Donoghue
Ian Osband
Catalin Ionescu
OffRL
111
49
0
03 Jan 2020
If MaxEnt RL is the Answer, What is the Question?
Benjamin Eysenbach
Sergey Levine
77
59
0
04 Oct 2019
Direct Policy Gradients: Direct Optimization of Policies in Discrete Action Spaces
Guy Lorberbom
Chris J. Maddison
N. Heess
Tamir Hazan
Daniel Tarlow
90
8
0
14 Jun 2019