1703.07608
Cited By
Deep Exploration via Randomized Value Functions
22 March 2017
Ian Osband
Benjamin Van Roy
Daniel Russo
Zheng Wen
Papers citing
"Deep Exploration via Randomized Value Functions"
50 / 74 papers shown
Title
Toward Efficient Exploration by Large Language Model Agents
Dilip Arumugam
Thomas L. Griffiths
LLMAG
92
1
0
29 Apr 2025
Contextual Similarity Distillation: Ensemble Uncertainties with a Single Model
Moritz A. Zanger
Pascal R. van der Vaart
Wendelin Böhmer
M. Spaan
UQCV
BDL
149
0
0
14 Mar 2025
Bellman Unbiasedness: Toward Provably Efficient Distributional Reinforcement Learning with General Value Function Approximation
Taehyun Cho
Seung Han
Kyungjae Lee
Seokhun Ju
Dohyeong Kim
Jungwoo Lee
64
0
0
31 Jul 2024
Random Latent Exploration for Deep Reinforcement Learning
Srinath Mahankali
Zhang-Wei Hong
Ayush Sekhari
Alexander Rakhlin
Pulkit Agrawal
33
3
0
18 Jul 2024
Online Bandit Learning with Offline Preference Data for Improved RLHF
Akhil Agnihotri
Rahul Jain
Deepak Ramachandran
Zheng Wen
OffRL
37
2
0
13 Jun 2024
A social path to human-like artificial intelligence
Edgar A. Duéñez-Guzmán
Suzanne Sadedin
Jane X. Wang
Kevin R. McKee
Joel Z Leibo
GNN
31
28
0
22 May 2024
Goal Exploration via Adaptive Skill Distribution for Goal-Conditioned Reinforcement Learning
Lisheng Wu
Ke Chen
26
0
0
19 Apr 2024
Q-Star Meets Scalable Posterior Sampling: Bridging Theory and Practice via HyperAgent
Yingru Li
Jiawei Xu
Lei Han
Zhi-Quan Luo
BDL
OffRL
26
6
0
05 Feb 2024
Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization
Zhiyuan Zhao
Bin Wang
Linke Ouyang
Xiao-wen Dong
Jiaqi Wang
Conghui He
MLLM
VLM
32
106
0
28 Nov 2023
Ensemble sampling for linear bandits: small ensembles suffice
David Janz
A. Litvak
Csaba Szepesvári
30
2
0
14 Nov 2023
Bag of Policies for Distributional Deep Exploration
Asen Nachkov
Luchen Li
Giulia Luise
Filippo Valdettaro
Aldo A. Faisal
OffRL
37
0
0
03 Aug 2023
Diverse Projection Ensembles for Distributional Reinforcement Learning
Moritz A. Zanger
Wendelin Böhmer
M. Spaan
25
4
0
12 Jun 2023
A Cover Time Study of a non-Markovian Algorithm
Guanhua Fang
G. Samorodnitsky
Zhiqiang Xu
18
0
0
08 Jun 2023
Provable and Practical: Efficient Exploration in Reinforcement Learning via Langevin Monte Carlo
Haque Ishfaq
Qingfeng Lan
Pan Xu
A. R. Mahmood
Doina Precup
Anima Anandkumar
Kamyar Azizzadenesheli
BDL
OffRL
28
20
0
29 May 2023
Bayesian Reinforcement Learning with Limited Cognitive Load
Dilip Arumugam
Mark K. Ho
Noah D. Goodman
Benjamin Van Roy
OffRL
34
8
0
05 May 2023
VIPeR: Provably Efficient Algorithm for Offline RL with Neural Function Approximation
Thanh Nguyen-Tang
R. Arora
OffRL
46
5
0
24 Feb 2023
Model-Based Uncertainty in Value Functions
Carlos E. Luis
A. Bottero
Julia Vinogradska
Felix Berkenkamp
Jan Peters
33
13
0
24 Feb 2023
Multiplier Bootstrap-based Exploration
Runzhe Wan
Haoyu Wei
B. Kveton
R. Song
16
2
0
03 Feb 2023
Selective Uncertainty Propagation in Offline RL
Sanath Kumar Krishnamurthy
Shrey Modi
Tanmay Gangwani
S. Katariya
B. Kveton
A. Rangi
OffRL
61
0
0
01 Feb 2023
STEERING: Stein Information Directed Exploration for Model-Based Reinforcement Learning
Souradip Chakraborty
Amrit Singh Bedi
Alec Koppel
Mengdi Wang
Furong Huang
Dinesh Manocha
24
7
0
28 Jan 2023
On Rate-Distortion Theory in Capacity-Limited Cognition & Reinforcement Learning
Dilip Arumugam
Mark K. Ho
Noah D. Goodman
Benjamin Van Roy
28
4
0
30 Oct 2022
Planning to the Information Horizon of BAMDPs via Epistemic State Abstraction
Dilip Arumugam
Satinder Singh
24
3
0
30 Oct 2022
Hardness in Markov Decision Processes: Theory and Practice
Michelangelo Conserva
Paulo E. Rauber
24
3
0
24 Oct 2022
Learning General World Models in a Handful of Reward-Free Deployments
Yingchen Xu
Jack Parker-Holder
Aldo Pacchiano
Philip J. Ball
Oleh Rybkin
Stephen J. Roberts
Tim Rocktäschel
Edward Grefenstette
OffRL
55
8
0
23 Oct 2022
Bayesian Q-learning With Imperfect Expert Demonstrations
Fengdi Che
Xiru Zhu
Doina Precup
D. Meger
Gregory Dudek
19
2
0
01 Oct 2022
q-Learning in Continuous Time
Yanwei Jia
X. Zhou
OffRL
48
67
0
02 Jul 2022
From Dirichlet to Rubin: Optimistic Exploration in RL without Bonuses
D. Tiapkin
Denis Belomestny
Eric Moulines
A. Naumov
S. Samsonov
Yunhao Tang
Michal Valko
Pierre Menard
24
16
0
16 May 2022
Exploration in Deep Reinforcement Learning: A Survey
Pawel Ladosz
Lilian Weng
Minwoo Kim
H. Oh
OffRL
23
323
0
02 May 2022
An Analysis of Ensemble Sampling
Chao Qin
Zheng Wen
Xiuyuan Lu
Benjamin Van Roy
29
22
0
02 Mar 2022
Learning Robust Real-Time Cultural Transmission without Human Data
Cultural General Intelligence Team
Avishkar Bhoopchand
Bethanie Brownfield
Adrian Collister
Agustin Dal Lago
...
Alex Platonov
Evan Senter
Sukhdeep Singh
Alexander Zacherl
Lei M. Zhang
VLM
40
11
0
01 Mar 2022
Interesting Object, Curious Agent: Learning Task-Agnostic Exploration
Simone Parisi
Victoria Dean
Deepak Pathak
Abhinav Gupta
LM&Ro
38
50
0
25 Nov 2021
Learning to Be Cautious
Montaser Mohammedalamen
Dustin Morrill
Alexander Sieusahai
Yash Satsangi
Michael Bowling
18
3
0
29 Oct 2021
The Value of Information When Deciding What to Learn
Dilip Arumugam
Benjamin Van Roy
34
12
0
26 Oct 2021
Feel-Good Thompson Sampling for Contextual Bandits and Reinforcement Learning
Tong Zhang
22
63
0
02 Oct 2021
Dr Jekyll and Mr Hyde: the Strange Case of Off-Policy Policy Updates
Romain Laroche
Rémi Tachet des Combes
40
8
0
29 Sep 2021
Deep Exploration for Recommendation Systems
Zheqing Zhu
Benjamin Van Roy
32
11
0
26 Sep 2021
Exploration in Deep Reinforcement Learning: From Single-Agent to Multiagent Domain
Jianye Hao
Tianpei Yang
Hongyao Tang
Chenjia Bai
Jinyi Liu
Zhaopeng Meng
Peng Liu
Zhen Wang
OffRL
36
92
0
14 Sep 2021
Strategically Efficient Exploration in Competitive Multi-agent Reinforcement Learning
R. Loftin
Aadirupa Saha
Sam Devlin
Katja Hofmann
30
5
0
30 Jul 2021
Bayesian Controller Fusion: Leveraging Control Priors in Deep Reinforcement Learning for Robotics
Krishan Rana
Vibhavari Dasagi
Jesse Haviland
Ben Talbot
Michael Milford
Niko Sünderhauf
BDL
OffRL
19
31
0
21 Jul 2021
Applications of the Free Energy Principle to Machine Learning and Neuroscience
Beren Millidge
DRL
20
7
0
30 Jun 2021
Repulsive Deep Ensembles are Bayesian
Francesco D'Angelo
Vincent Fortuin
UQCV
BDL
51
93
0
22 Jun 2021
Randomized Exploration for Reinforcement Learning with General Value Function Approximation
Haque Ishfaq
Qiwen Cui
V. Nguyen
Alex Ayoub
Zhuoran Yang
Zhaoran Wang
Doina Precup
Lin F. Yang
24
43
0
15 Jun 2021
Bayesian Bellman Operators
M. Fellows
Kristian Hartikainen
Shimon Whiteson
OffRL
37
15
0
09 Jun 2021
Priors in Bayesian Deep Learning: A Review
Vincent Fortuin
UQCV
BDL
31
124
0
14 May 2021
An Information-Theoretic Perspective on Credit Assignment in Reinforcement Learning
Dilip Arumugam
Peter Henderson
Pierre-Luc Bacon
22
17
0
10 Mar 2021
Reinforcement Learning, Bit by Bit
Xiuyuan Lu
Benjamin Van Roy
Vikranth Dwaracherla
M. Ibrahimi
Ian Osband
Zheng Wen
27
70
0
06 Mar 2021
Discovery of Options via Meta-Learned Subgoals
Vivek Veeriah
Tom Zahavy
Matteo Hessel
Zhongwen Xu
Junhyuk Oh
Iurii Kemaev
H. V. Hasselt
David Silver
Satinder Singh
21
33
0
12 Feb 2021
Decoupled Exploration and Exploitation Policies for Sample-Efficient Reinforcement Learning
William F. Whitney
Michael Bloesch
Jost Tobias Springenberg
A. Abdolmaleki
Kyunghyun Cho
Martin Riedmiller
OffRL
19
13
0
23 Jan 2021
BeBold: Exploration Beyond the Boundary of Explored Regions
Tianjun Zhang
Huazhe Xu
Xiaolong Wang
Yi Wu
Kurt Keutzer
Joseph E. Gonzalez
Yuandong Tian
28
40
0
15 Dec 2020
Policy Optimization as Online Learning with Mediator Feedback
Alberto Maria Metelli
Matteo Papini
P. D'Oro
Marcello Restelli
OffRL
19
10
0
15 Dec 2020