Variational Bayesian Reinforcement Learning with Regret Bounds

25 July 2018
Brendan O'Donoghue
arXiv: 1807.09647 (latest version: v4)

Papers citing "Variational Bayesian Reinforcement Learning with Regret Bounds"

20 / 20 papers shown
IL-SOAR : Imitation Learning with Soft Optimistic Actor cRitic
Stefano Viel
Luca Viano
Volkan Cevher
203
1
0
27 Feb 2025
Incentivize without Bonus: Provably Efficient Model-based Online Multi-agent RL for Markov Games
Tong Yang
Bo Dai
Lin Xiao
Yuejie Chi
OffRL
138
2
0
13 Feb 2025
Risk-sensitive control as inference with Rényi divergence
Kaito Ito
Kenji Kashima
71
1
0
04 Nov 2024
Mimicking Human Intuition: Cognitive Belief-Driven Reinforcement Learning
Xingrui Gu
Guanren Qiao
Chuyi Jiang
OffRL
96
0
0
02 Oct 2024
Model-Based Uncertainty in Value Functions
Carlos E. Luis
A. Bottero
Julia Vinogradska
Felix Berkenkamp
Jan Peters
115
15
0
24 Feb 2023
On the Power of Pre-training for Generalization in RL: Provable Benefits and Hardness
Haotian Ye
Xiaoyu Chen
Liwei Wang
S. Du
OffRL
86
7
0
19 Oct 2022
Age of Semantics in Cooperative Communications: To Expedite Simulation Towards Real via Offline Reinforcement Learning
Xianfu Chen
Zhifeng Zhao
S. Mao
Celimuge Wu
Honggang Zhang
M. Bennis
OffRL
83
3
0
19 Sep 2022
q-Learning in Continuous Time
Yanwei Jia
X. Zhou
OffRL
158
78
0
02 Jul 2022
Dynamic mean field programming
G. Stamatescu
53
0
0
10 Jun 2022
Posterior Coreset Construction with Kernelized Stein Discrepancy for Model-Based Reinforcement Learning
Souradip Chakraborty
Amrit Singh Bedi
Alec Koppel
Brian M. Sadler
Furong Huang
Pratap Tokekar
Tianyi Zhou
79
9
0
02 Jun 2022
SEREN: Knowing When to Explore and When to Exploit
Changmin Yu
D. Mguni
Dong Li
Aivar Sootla
Jun Wang
Neil Burgess
48
1
0
30 May 2022
Variational Bayesian Optimistic Sampling
Brendan O'Donoghue
Tor Lattimore
52
6
0
29 Oct 2021
DROMO: Distributionally Robust Offline Model-based Policy Optimization
Ruizhen Liu
Dazhi Zhong
Zhi-Cong Chen
OffRL
60
3
0
15 Sep 2021
MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning
Kevin Wenliang Li
Abhishek Gupta
Ashwin Reddy
Vitchyr H. Pong
Aurick Zhou
Justin Yu
Sergey Levine
UQCV
71
31
0
15 Jul 2021
Reward is enough for convex MDPs
Tom Zahavy
Brendan O'Donoghue
Guillaume Desjardins
Satinder Singh
134
76
0
01 Jun 2021
Reinforcement Learning, Bit by Bit
Xiuyuan Lu
Benjamin Van Roy
Vikranth Dwaracherla
M. Ibrahimi
Ian Osband
Zheng Wen
126
70
0
06 Mar 2021
Conservative Q-Learning for Offline Reinforcement Learning
Aviral Kumar
Aurick Zhou
George Tucker
Sergey Levine
OffRL, OnRL
150
1,839
0
08 Jun 2020
Making Sense of Reinforcement Learning and Probabilistic Inference
Brendan O'Donoghue
Ian Osband
Catalin Ionescu
OffRL
111
49
0
03 Jan 2020
If MaxEnt RL is the Answer, What is the Question?
Benjamin Eysenbach
Sergey Levine
77
59
0
04 Oct 2019
Direct Policy Gradients: Direct Optimization of Policies in Discrete Action Spaces
Guy Lorberbom
Chris J. Maddison
N. Heess
Tamir Hazan
Daniel Tarlow
90
8
0
14 Jun 2019