Deep Exploration via Randomized Value Functions

22 March 2017

Papers citing "Deep Exploration via Randomized Value Functions"

50 / 76 papers shown

Title
Toward Efficient Exploration by Large Language Model Agents Dilip Arumugam Thomas L. Griffiths LLMAG 92 1 0 29 Apr 2025
Contextual Similarity Distillation: Ensemble Uncertainties with a Single Model Moritz A. Zanger Pascal R. van der Vaart Wendelin Bohmer M. Spaan UQCV BDL 149 0 0 14 Mar 2025
Bellman Unbiasedness: Toward Provably Efficient Distributional Reinforcement Learning with General Value Function Approximation Taehyun Cho Seung Han Kyungjae Lee Seokhun Ju Dohyeong Kim Jungwoo Lee 64 0 0 31 Jul 2024
Random Latent Exploration for Deep Reinforcement Learning Srinath Mahankali Zhang-Wei Hong Ayush Sekhari Alexander Rakhlin Pulkit Agrawal 33 3 0 18 Jul 2024
Online Bandit Learning with Offline Preference Data for Improved RLHF Akhil Agnihotri Rahul Jain Deepak Ramachandran Zheng Wen OffRL 37 2 0 13 Jun 2024
A social path to human-like artificial intelligence Edgar A. Duénez-Guzmán Suzanne Sadedin Jane X. Wang Kevin R. McKee Joel Z Leibo GNN 31 28 0 22 May 2024
Goal Exploration via Adaptive Skill Distribution for Goal-Conditioned Reinforcement Learning Lisheng Wu Ke Chen 26 0 0 19 Apr 2024
Q-Star Meets Scalable Posterior Sampling: Bridging Theory and Practice via HyperAgent Yingru Li Jiawei Xu Lei Han Zhi-Quan Luo BDL OffRL 26 6 0 05 Feb 2024
Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization Zhiyuan Zhao Bin Wang Linke Ouyang Xiao-wen Dong Jiaqi Wang Conghui He MLLM VLM 32 106 0 28 Nov 2023
Ensemble sampling for linear bandits: small ensembles suffice David Janz A. Litvak Csaba Szepesvári 30 2 0 14 Nov 2023
Bag of Policies for Distributional Deep Exploration Asen Nachkov Luchen Li Giulia Luise Filippo Valdettaro Aldo A. Faisal OffRL 40 0 0 03 Aug 2023
Diverse Projection Ensembles for Distributional Reinforcement Learning Moritz A. Zanger Wendelin Bohmer M. Spaan 25 4 0 12 Jun 2023
A Cover Time Study of a non-Markovian Algorithm Guanhua Fang G. Samorodnitsky Zhiqiang Xu 18 0 0 08 Jun 2023
Provable and Practical: Efficient Exploration in Reinforcement Learning via Langevin Monte Carlo Haque Ishfaq Qingfeng Lan Pan Xu A. R. Mahmood Doina Precup Anima Anandkumar Kamyar Azizzadenesheli BDL OffRL 28 20 0 29 May 2023
Bayesian Reinforcement Learning with Limited Cognitive Load Dilip Arumugam Mark K. Ho Noah D. Goodman Benjamin Van Roy OffRL 34 8 0 05 May 2023
VIPeR: Provably Efficient Algorithm for Offline RL with Neural Function Approximation Thanh Nguyen-Tang R. Arora OffRL 46 5 0 24 Feb 2023
Model-Based Uncertainty in Value Functions Carlos E. Luis A. Bottero Julia Vinogradska Felix Berkenkamp Jan Peters 33 13 0 24 Feb 2023
Learning How to Infer Partial MDPs for In-Context Adaptation and Exploration Chentian Jiang Nan Rosemary Ke Hado van Hasselt 16 3 0 08 Feb 2023
Multiplier Bootstrap-based Exploration Runzhe Wan Haoyu Wei B. Kveton R. Song 16 2 0 03 Feb 2023
Selective Uncertainty Propagation in Offline RL Sanath Kumar Krishnamurthy Shrey Modi Tanmay Gangwani S. Katariya B. Kveton A. Rangi OffRL 61 0 0 01 Feb 2023
STEERING: Stein Information Directed Exploration for Model-Based Reinforcement Learning Souradip Chakraborty Amrit Singh Bedi Alec Koppel Mengdi Wang Furong Huang Dinesh Manocha 24 7 0 28 Jan 2023
On Rate-Distortion Theory in Capacity-Limited Cognition & Reinforcement Learning Dilip Arumugam Mark K. Ho Noah D. Goodman Benjamin Van Roy 28 4 0 30 Oct 2022
Planning to the Information Horizon of BAMDPs via Epistemic State Abstraction Dilip Arumugam Satinder Singh 24 3 0 30 Oct 2022
Hardness in Markov Decision Processes: Theory and Practice Michelangelo Conserva Paulo E. Rauber 26 3 0 24 Oct 2022
Learning General World Models in a Handful of Reward-Free Deployments Yingchen Xu Jack Parker-Holder Aldo Pacchiano Philip J. Ball Oleh Rybkin Stephen J. Roberts Tim Rocktaschel Edward Grefenstette OffRL 55 8 0 23 Oct 2022
Bayesian Q-learning With Imperfect Expert Demonstrations Fengdi Che Xiru Zhu Doina Precup D. Meger Gregory Dudek 19 2 0 01 Oct 2022
q-Learning in Continuous Time Yanwei Jia X. Zhou OffRL 48 67 0 02 Jul 2022
From Dirichlet to Rubin: Optimistic Exploration in RL without Bonuses D. Tiapkin Denis Belomestny Eric Moulines A. Naumov S. Samsonov Yunhao Tang Michal Valko Pierre Menard 24 16 0 16 May 2022
Exploration in Deep Reinforcement Learning: A Survey Pawel Ladosz Lilian Weng Minwoo Kim H. Oh OffRL 23 324 0 02 May 2022
An Analysis of Ensemble Sampling Chao Qin Zheng Wen Xiuyuan Lu Benjamin Van Roy 29 22 0 02 Mar 2022
Learning Robust Real-Time Cultural Transmission without Human Data Cultural General Intelligence Team Avishkar Bhoopchand Bethanie Brownfield Adrian Collister Agustin Dal Lago ... Alex Platonov Evan Senter Sukhdeep Singh Alexander Zacherl Lei M. Zhang VLM 42 11 0 01 Mar 2022
Exponential Family Model-Based Reinforcement Learning via Score Matching Gen Li Junbo Li Anmol Kabra Nathan Srebro Zhaoran Wang Zhuoran Yang 22 4 0 28 Dec 2021
Interesting Object, Curious Agent: Learning Task-Agnostic Exploration Simone Parisi Victoria Dean Deepak Pathak Abhinav Gupta LM&Ro 38 50 0 25 Nov 2021
Learning to Be Cautious Montaser Mohammedalamen Dustin Morrill Alexander Sieusahai Yash Satsangi Michael Bowling 18 3 0 29 Oct 2021
The Value of Information When Deciding What to Learn Dilip Arumugam Benjamin Van Roy 34 12 0 26 Oct 2021
Feel-Good Thompson Sampling for Contextual Bandits and Reinforcement Learning Tong Zhang 22 63 0 02 Oct 2021
Dr Jekyll and Mr Hyde: the Strange Case of Off-Policy Policy Updates Romain Laroche Rémi Tachet des Combes 40 8 0 29 Sep 2021
Deep Exploration for Recommendation Systems Zheqing Zhu Benjamin Van Roy 32 11 0 26 Sep 2021
Exploration in Deep Reinforcement Learning: From Single-Agent to Multiagent Domain Jianye Hao Tianpei Yang Hongyao Tang Chenjia Bai Jinyi Liu Zhaopeng Meng Peng Liu Zhen Wang OffRL 36 92 0 14 Sep 2021
Strategically Efficient Exploration in Competitive Multi-agent Reinforcement Learning R. Loftin Aadirupa Saha Sam Devlin Katja Hofmann 30 5 0 30 Jul 2021
Bayesian Controller Fusion: Leveraging Control Priors in Deep Reinforcement Learning for Robotics Krishan Rana Vibhavari Dasagi Jesse Haviland Ben Talbot Michael Milford Niko Sünderhauf BDL OffRL 19 31 0 21 Jul 2021
Applications of the Free Energy Principle to Machine Learning and Neuroscience Beren Millidge DRL 20 7 0 30 Jun 2021
Repulsive Deep Ensembles are Bayesian Francesco DÁngelo Vincent Fortuin UQCV BDL 51 93 0 22 Jun 2021
Randomized Exploration for Reinforcement Learning with General Value Function Approximation Haque Ishfaq Qiwen Cui V. Nguyen Alex Ayoub Zhuoran Yang Zhaoran Wang Doina Precup Lin F. Yang 26 43 0 15 Jun 2021
Bayesian Bellman Operators M. Fellows Kristian Hartikainen Shimon Whiteson OffRL 37 15 0 09 Jun 2021
Priors in Bayesian Deep Learning: A Review Vincent Fortuin UQCV BDL 31 124 0 14 May 2021
An Information-Theoretic Perspective on Credit Assignment in Reinforcement Learning Dilip Arumugam Peter Henderson Pierre-Luc Bacon 22 17 0 10 Mar 2021
Reinforcement Learning, Bit by Bit Xiuyuan Lu Benjamin Van Roy Vikranth Dwaracherla M. Ibrahimi Ian Osband Zheng Wen 30 70 0 06 Mar 2021
Discovery of Options via Meta-Learned Subgoals Vivek Veeriah Tom Zahavy Matteo Hessel Zhongwen Xu Junhyuk Oh Iurii Kemaev H. V. Hasselt David Silver Satinder Singh 21 33 0 12 Feb 2021
Decoupled Exploration and Exploitation Policies for Sample-Efficient Reinforcement Learning William F. Whitney Michael Bloesch Jost Tobias Springenberg A. Abdolmaleki Kyunghyun Cho Martin Riedmiller OffRL 21 13 0 23 Jan 2021