Efficient Exploration through Bayesian Deep Q-Networks
arXiv: 1802.04412

13 February 2018
Kamyar Azizzadenesheli
Anima Anandkumar
OffRL, BDL

Papers citing "Efficient Exploration through Bayesian Deep Q-Networks"

50 / 101 papers shown
Enhancing Q-Value Updates in Deep Q-Learning via Successor-State Prediction
Lipeng Zu
Hansong Zhou
Xiaonan Zhang
121
0
0
05 Nov 2025
The Confusing Instance Principle for Online Linear Quadratic Control
Waris Radji
Odalric-Ambrym Maillard
OffRL
180
1
0
22 Oct 2025
Priors Matter: Addressing Misspecification in Bayesian Deep Q-Learning
Pascal R. van der Vaart
Neil Yorke-Smith
M. Spaan
BDL, UQCV
213
0
0
29 Aug 2025
Q-learning with Posterior Sampling
Priyank Agrawal
Shipra Agrawal
Azmat Azati
OffRL, GP
367
2
0
01 Jun 2025
Exploration-Driven Generative Interactive Environments
Computer Vision and Pattern Recognition (CVPR), 2025
N. Savov
Naser Kazemi
Mohammad Mahdi
Danda Pani Paudel
Xi Wang
Luc Van Gool
VGen, 3DV
311
7
0
03 Apr 2025
Look Before Leap: Look-Ahead Planning with Uncertainty in Reinforcement Learning
Yongshuai Liu
Xin Liu
413
3
0
26 Mar 2025
CAE: Repurposing the Critic as an Explorer in Deep Reinforcement Learning
Yexin Li
OffRL
479
2
0
23 Mar 2025
EVaDE : Event-Based Variational Thompson Sampling for Model-Based Reinforcement Learning
Asian Conference on Machine Learning (ACML), 2025
Siddharth Aravindan
Dixant Mittal
Wee Sun Lee
BDL
326
0
0
17 Jan 2025
Spatial-Aware Decision-Making with Ring Attractors in Reinforcement Learning Systems
Marcos Negre Saura
Richard Allmendinger
Theodore Papamarkou
Wei Pan
1.0K
0
0
04 Oct 2024
Model-Free Active Exploration in Reinforcement Learning
Alessio Russo
Alexandre Proutiere
OffRL
394
6
0
30 Jun 2024
Sparse Bayesian Networks: Efficient Uncertainty Quantification in Medical Image Analysis
Zeinab Abboud
Herve Lombaert
Samuel Kadoury
UQCV
281
6
0
11 Jun 2024
Constrained Ensemble Exploration for Unsupervised Skill Discovery
Chenjia Bai
Rushuai Yang
Qiaosheng Zhang
Kang Xu
Yi Chen
Ting Xiao
Xuelong Li
OffRL
491
9
0
25 May 2024
Pessimistic Value Iteration for Multi-Task Data Sharing in Offline Reinforcement Learning
Chenjia Bai
Lingxiao Wang
Jianye Hao
Zhuoran Yang
Bin Zhao
Zhen Wang
Xuelong Li
OffRL
319
12
0
30 Apr 2024
Variational Bayesian Last Layers
James Harrison
John Willes
Jasper Snoek
BDL, UQCV
440
72
0
17 Apr 2024
Utilizing Maximum Mean Discrepancy Barycenter for Propagating the Uncertainty of Value Functions in Reinforcement Learning
Srinjoy Roy
Swagatam Das
358
0
0
31 Mar 2024
A Bayesian Framework of Deep Reinforcement Learning for Joint O-RAN/MEC Orchestration
Fahri Wisnu Murti
Samad Ali
Matti Latva-aho
308
2
0
26 Dec 2023
Multi-Agent Probabilistic Ensembles with Trajectory Sampling for Connected Autonomous Vehicles
Ruoqi Wen
Jiahao Huang
Rongpeng Li
Guoru Ding
Zhifeng Zhao
329
1
0
21 Dec 2023
On the Convergence and Sample Complexity Analysis of Deep Q-Networks with ε-Greedy Exploration
Neural Information Processing Systems (NeurIPS), 2023
Shuai Zhang
Hongkang Li
Meng Wang
Miao Liu
Pin-Yu Chen
Songtao Lu
Sijia Liu
K. Murugesan
Subhajit Chaudhury
364
48
0
24 Oct 2023
Uncertainty-aware transfer across tasks using hybrid model-based successor feature reinforcement learning
Parvin Malekzadeh
Ming Hou
Konstantinos N. Plataniotis
357
3
0
16 Oct 2023
Uncertainty Quantification using Generative Approach
Yunsheng Zhang
UQCV, BDL
133
0
0
13 Oct 2023
ReLU to the Rescue: Improve Your On-Policy Actor-Critic with Positive Advantages
International Conference on Machine Learning (ICML), 2023
Andrew Jesson
Chris Xiaoxuan Lu
Gunshi Gupta
Angelos Filos
Jakob N. Foerster
Y. Gal
OffRL
449
10
0
02 Jun 2023
Provable and Practical: Efficient Exploration in Reinforcement Learning via Langevin Monte Carlo
International Conference on Learning Representations (ICLR), 2023
Haque Ishfaq
Qingfeng Lan
Pan Xu
A. R. Mahmood
Doina Precup
Anima Anandkumar
Kamyar Azizzadenesheli
BDL, OffRL
430
33
0
29 May 2023
Posterior Sampling for Deep Reinforcement Learning
International Conference on Machine Learning (ICML), 2023
Remo Sasso
Michelangelo Conserva
Paulo E. Rauber
OffRL, BDL
295
14
0
30 Apr 2023
Decision-Making Under Uncertainty: Beyond Probabilities
International Journal on Software Tools for Technology Transfer (STTT), 2023
Thom S. Badings
T. D. Simão
Marnix Suilen
N. Jansen
UDPER
316
18
0
10 Mar 2023
Exploration via Epistemic Value Estimation
AAAI Conference on Artificial Intelligence (AAAI), 2023
Simon Schmitt
John Shawe-Taylor
Hado van Hasselt
OffRL
197
4
0
07 Mar 2023
Learning How to Infer Partial MDPs for In-Context Adaptation and Exploration
Chentian Jiang
Nan Rosemary Ke
Hado van Hasselt
415
4
0
08 Feb 2023
The Role of Exploration for Task Transfer in Reinforcement Learning
Jonathan C. Balloch
Julia Kim
Jessica B. Langebrake Inman
Mark O. Riedl
OffRL
275
3
0
11 Oct 2022
POEM: Out-of-Distribution Detection with Posterior Sampling
International Conference on Machine Learning (ICML), 2022
Yifei Ming
Ying Fan
Shouqing Yang
OODD
353
148
0
28 Jun 2022
SFP: State-free Priors for Exploration in Off-Policy Reinforcement Learning
Marco Bagatella
Sammy Christen
Otmar Hilliges
OffRL
483
6
0
26 May 2022
From Dirichlet to Rubin: Optimistic Exploration in RL without Bonuses
International Conference on Machine Learning (ICML), 2022
D. Tiapkin
Denis Belomestny
Eric Moulines
A. Naumov
S. Samsonov
Yunhao Tang
Michal Valko
Pierre Menard
315
23
0
16 May 2022
Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning
International Conference on Learning Representations (ICLR), 2022
Chenjia Bai
Lingxiao Wang
Zhuoran Yang
Zhihong Deng
Animesh Garg
Peng Liu
Zhaoran Wang
OffRL
370
168
0
23 Feb 2022
BADDr: Bayes-Adaptive Deep Dropout RL for POMDPs
Adaptive Agents and Multi-Agent Systems (AAMAS), 2022
Sammie Katt
Hai V. Nguyen
F. Oliehoek
Chris Amato
BDL, OffRL
141
2
0
17 Feb 2022
Fast online inference for nonlinear contextual bandit based on Generative Adversarial Network
Yun-Da Tsai
Shou-De Lin
213
7
0
17 Feb 2022
Towards Interactive Reinforcement Learning with Intrinsic Feedback
Ben Poole
Minwoo Lee
OffRL
320
3
0
02 Dec 2021
Which Model to Trust: Assessing the Influence of Models on the Performance of Reinforcement Learning Algorithms for Continuous Control Tasks
Giacomo Arcieri
David Wölfle
Eleni Chatzi
OffRL
366
5
0
25 Oct 2021
Knowledge is reward: Learning optimal exploration by predictive reward cashing
Luca Ambrogioni
139
1
0
17 Sep 2021
DROMO: Distributionally Robust Offline Model-based Policy Optimization
Ruizhen Liu
Dazhi Zhong
Zhi-Cong Chen
OffRL
224
3
0
15 Sep 2021
Exploration in Deep Reinforcement Learning: From Single-Agent to Multiagent Domain
Jianye Hao
Zhenxing Ge
Hongyao Tang
Chenjia Bai
Jinyi Liu
Zhaopeng Meng
Peng Liu
Zhen Wang
OffRL
516
175
0
14 Sep 2021
A Survey of Exploration Methods in Reinforcement Learning
Susan Amin
Maziar Gomrokchi
Harsh Satija
H. V. Hoof
Doina Precup
OffRL
408
106
0
01 Sep 2021
Analytically Tractable Bayesian Deep Q-Learning
Luong Ha
L. Nguyen
J. Goulet
BDL, OffRL
175
2
0
21 Jun 2021
On the Sample Complexity and Metastability of Heavy-tailed Policy Search in Continuous Control
Amrit Singh Bedi
Anjaly Parayil
Junyu Zhang
Mengdi Wang
Alec Koppel
216
20
0
15 Jun 2021
Reinforced Few-Shot Acquisition Function Learning for Bayesian Optimization
Neural Information Processing Systems (NeurIPS), 2021
Bing-Jing Hsieh
Ping-Chun Hsieh
Xi Liu
314
26
0
08 Jun 2021
Mitigating Covariate Shift in Imitation Learning via Offline Data Without Great Coverage
Jonathan D. Chang
Masatoshi Uehara
Dhruv Sreenivas
Rahul Kidambi
Wen Sun
OffRL
382
37
0
06 Jun 2021
Multi-facet Contextual Bandits: A Neural Network Perspective
Knowledge Discovery and Data Mining (KDD), 2021
Yikun Ban
Jingrui He
C. Cook
453
31
0
06 Jun 2021
Sample-Efficient Reinforcement Learning for Linearly-Parameterized MDPs with a Generative Model
Neural Information Processing Systems (NeurIPS), 2021
Bingyan Wang
Yuling Yan
Jianqing Fan
507
24
0
28 May 2021
Principled Exploration via Optimistic Bootstrapping and Backward Induction
International Conference on Machine Learning (ICML), 2021
Chenjia Bai
Lingxiao Wang
Lei Han
Jianye Hao
Animesh Garg
Peng Liu
Zhaoran Wang
OffRL
242
46
0
13 May 2021
Meta-Learning-Based Robust Adaptive Flight Control Under Uncertain Wind Conditions
Michael O'Connell
Guanya Shi
Xichen Shi
Soon-Jo Chung
228
27
0
02 Mar 2021
MobILE: Model-Based Imitation Learning From Observation Alone
Neural Information Processing Systems (NeurIPS), 2021
Rahul Kidambi
Jonathan D. Chang
Wen Sun
308
46
0
22 Feb 2021
Output-Weighted Sampling for Multi-Armed Bandits with Extreme Payoffs
Proceedings of the Royal Society A (Proc. R. Soc. A), 2021
Jianlong Wu
A. Blanchard
T. Sapsis
P. Perdikaris
253
21
0
19 Feb 2021
COMBO: Conservative Offline Model-Based Policy Optimization
Neural Information Processing Systems (NeurIPS), 2021
Tianhe Yu
Aviral Kumar
Rafael Rafailov
Aravind Rajeswaran
Sergey Levine
Chelsea Finn
OffRL
768
510
0
16 Feb 2021