Optimistic Exploration even with a Pessimistic Initialisation

International Conference on Learning Representations (ICLR), 2020
26 February 2020
Tabish Rashid
Bei Peng
Wendelin Bohmer
Shimon Whiteson
OffRL, OnRL

Papers citing "Optimistic Exploration even with a Pessimistic Initialisation"

27 / 27 papers shown
Count Counts: Motivating Exploration in LLM Reasoning with Count-based Intrinsic Rewards
Xuan Zhang
Ruixiao Li
Zhijian Zhou
Long Li
Yulei Qin
Ke Li
Xing Sun
Xiaoyu Tan
Chao Qu
Yuan Qi
LRM
203
0
0
18 Oct 2025
Universal Value-Function Uncertainties
Moritz A. Zanger
Max Weltevrede
Yaniv Oren
Pascal R. van der Vaart
Caroline Horsch
Wendelin Bohmer
M. Spaan
OffRL
289
0
0
27 May 2025
Exploration by Random Distribution Distillation
Zhirui Fang
Kai Yang
Jian Tao
Jiafei Lyu
Lusong Li
Li Shen
Xiu Li
328
1
0
16 May 2025
Synthesis of Model Predictive Control and Reinforcement Learning: Survey and Classification
Rudolf Reiter
Jasper Hoffmann
D. Reinhardt
Florian Messerer
Katrin Baumgärtner
Shamburaj Sawant
Joschka Boedecker
Moritz Diehl
S. Gros
323
20
0
04 Feb 2025
Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-Constraint
Wei Xiong
Hanze Dong
Chen Ye
Ziqi Wang
Han Zhong
Heng Ji
Nan Jiang
Tong Zhang
OffRL
374
295
0
18 Dec 2023
Offline Retraining for Online RL: Decoupled Policy Learning to Mitigate Exploration Bias
Max Sobol Mark
Archit Sharma
Fahim Tajwar
Rafael Rafailov
Sergey Levine
Chelsea Finn
OffRL, OnRL
307
4
0
12 Oct 2023
Flipping Coins to Estimate Pseudocounts for Exploration in Reinforcement Learning
International Conference on Machine Learning (ICML), 2023
Sam Lobel
Akhil Bagaria
George Konidaris
236
28
0
05 Jun 2023
Posterior Sampling for Deep Reinforcement Learning
International Conference on Machine Learning (ICML), 2023
Remo Sasso
Michelangelo Conserva
Paulo E. Rauber
OffRL, BDL
231
12
0
30 Apr 2023
Conditionally Optimistic Exploration for Cooperative Deep Multi-Agent Reinforcement Learning
Conference on Uncertainty in Artificial Intelligence (UAI), 2023
Xutong Zhao
Yangchen Pan
Chenjun Xiao
Sarath Chandar
Janarthanan Rajendran
277
9
0
16 Mar 2023
Pretraining in Deep Reinforcement Learning: A Survey
Zhihui Xie
Zichuan Lin
Junyou Li
Shuai Li
Deheng Ye
OffRL, OnRL, AI4CE
229
30
0
08 Nov 2022
Optimal Conservative Offline RL with General Function Approximation via Augmented Lagrangian
International Conference on Learning Representations (ICLR), 2022
Paria Rashidinejad
Hanlin Zhu
Kunhe Yang
Stuart J. Russell
Jiantao Jiao
OffRL
372
33
0
01 Nov 2022
Optimistic Curiosity Exploration and Conservative Exploitation with Linear Reward Shaping
Hao Sun
Lei Han
Rui Yang
Xiaoteng Ma
Jian Guo
Bolei Zhou
OffRL, OnRL
178
12
0
15 Sep 2022
Slowly Changing Adversarial Bandit Algorithms are Efficient for Discounted MDPs
International Conference on Algorithmic Learning Theory (ALT), 2022
Ian A. Kash
L. Reyzin
Zishun Yu
361
0
0
18 May 2022
Learning to Act with Affordance-Aware Multimodal Neural SLAM
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2022
Zhiwei Jia
Kaixiang Lin
Yizhou Zhao
Qiaozi Gao
Govind Thattai
Gaurav Sukhatme
LM&Ro
239
16
0
24 Jan 2022
Dealing with the Unknown: Pessimistic Offline Reinforcement Learning
Conference on Robot Learning (CoRL), 2021
Jinning Li
Chen Tang
Masayoshi Tomizuka
Wei Zhan
OffRL
269
24
0
09 Nov 2021
Dynamic Bottleneck for Robust Self-Supervised Exploration
Neural Information Processing Systems (NeurIPS), 2021
Chenjia Bai
Lingxiao Wang
Lei Han
Animesh Garg
Jianye Hao
Peng Liu
Zhaoran Wang
134
35
0
20 Oct 2021
Balancing Value Underestimation and Overestimation with Realistic Actor-Critic
Sicen Li
Qinyun Tang
G. Wang
Xinmeng Ma
Li-quan Wang
OffRL
337
4
0
19 Oct 2021
A Survey of Exploration Methods in Reinforcement Learning
Susan Amin
Maziar Gomrokchi
Harsh Satija
H. V. Hoof
Doina Precup
OffRL
301
99
0
01 Sep 2021
Cooperative Exploration for Multi-Agent Deep Reinforcement Learning
International Conference on Machine Learning (ICML), 2021
Iou-Jen Liu
Unnat Jain
Raymond A. Yeh
Alex Schwing
269
125
0
23 Jul 2021
Decoupled Reinforcement Learning to Stabilise Intrinsically-Motivated Exploration
Adaptive Agents and Multi-Agent Systems (AAMAS), 2021
Lukas Schafer
Filippos Christianos
Josiah P. Hanna
Stefano V. Albrecht
226
24
0
19 Jul 2021
Optimistic Reinforcement Learning by Forward Kullback-Leibler Divergence Optimization
Neural Networks (NN), 2021
Taisuke Kobayashi
196
18
0
27 May 2021
Principled Exploration via Optimistic Bootstrapping and Backward Induction
International Conference on Machine Learning (ICML), 2021
Chenjia Bai
Lingxiao Wang
Lei Han
Jianye Hao
Animesh Garg
Peng Liu
Zhaoran Wang
OffRL
197
45
0
13 May 2021
An Open-Source Multi-Goal Reinforcement Learning Environment for Robotic Manipulation with Pybullet
Towards Autonomous Robotic Systems (TAROS), 2021
Xintong Yang
Ze Ji
Jing Wu
Yu-kun Lai
120
21
0
12 May 2021
No-Regret Reinforcement Learning with Heavy-Tailed Rewards
International Conference on Artificial Intelligence and Statistics (AISTATS), 2021
Vincent Zhuang
Yanan Sui
838
12
0
25 Feb 2021
Decoupled Exploration and Exploitation Policies for Sample-Efficient Reinforcement Learning
William F. Whitney
Michael Bloesch
Jost Tobias Springenberg
A. Abdolmaleki
Dong Wang
Martin Riedmiller
OffRL
240
17
0
23 Jan 2021
Variational Dynamic for Self-Supervised Exploration in Deep Reinforcement Learning
IEEE Transactions on Neural Networks and Learning Systems (IEEE TNNLS), 2020
Chenjia Bai
Peng Liu
Kaiyu Liu
Zhaoran Wang
Yingnan Zhao
Lingxiao Wang
SSL
242
21
0
17 Oct 2020
Towards Tractable Optimism in Model-Based Reinforcement Learning
Aldo Pacchiano
Philip J. Ball
Jack Parker-Holder
K. Choromanski
Stephen J. Roberts
OffRL
149
12
0
21 Jun 2020