1802.04020
Cited By
Efficient Bias-Span-Constrained Exploration-Exploitation in Reinforcement Learning
12 February 2018
Ronan Fruit
Matteo Pirotta
A. Lazaric
R. Ortner
Papers citing
"Efficient Bias-Span-Constrained Exploration-Exploitation in Reinforcement Learning"
29 / 29 papers shown
Title
Gap-Dependent Bounds for Q-Learning using Reference-Advantage Decomposition
Zhong Zheng
Haochen Zhang
Lingzhou Xue
OffRL
75
2
0
10 Oct 2024
Optimistic Q-learning for average reward and episodic reinforcement learning
Priyank Agrawal
Shipra Agrawal
56
3
0
18 Jul 2024
Dealing with unbounded gradients in stochastic saddle-point optimization
Gergely Neu
Nneka Okolo
37
3
0
21 Feb 2024
Regret Analysis of Policy Gradient Algorithm for Infinite Horizon Average Reward Markov Decision Processes
Qinbo Bai
Washim Uddin Mondal
Vaneet Aggarwal
34
9
0
05 Sep 2023
Learning Optimal Admission Control in Partially Observable Queueing Networks
Jonatha Anselmi
B. Gaujal
Louis-Sébastien Rebuffi
26
1
0
04 Aug 2023
Settling the Sample Complexity of Online Reinforcement Learning
Zihan Zhang
Yuxin Chen
Jason D. Lee
S. Du
OffRL
98
21
0
25 Jul 2023
Reinforcement Learning in a Birth and Death Process: Breaking the Dependence on the State Space
Jonatha Anselmi
B. Gaujal
Louis-Sébastien Rebuffi
27
2
0
21 Feb 2023
Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both Worlds in Stochastic and Deterministic Environments
Runlong Zhou
Zihan Zhang
S. Du
44
10
0
31 Jan 2023
Improved Regret for Efficient Online Reinforcement Learning with Linear Function Approximation
Uri Sherman
Tomer Koren
Yishay Mansour
32
12
0
30 Jan 2023
Logarithmic regret bounds for continuous-time average-reward Markov decision processes
Xuefeng Gao
X. Zhou
36
8
0
23 May 2022
From Dirichlet to Rubin: Optimistic Exploration in RL without Bonuses
D. Tiapkin
Denis Belomestny
Eric Moulines
A. Naumov
S. Samsonov
Yunhao Tang
Michal Valko
Pierre Menard
28
16
0
16 May 2022
Learning Infinite-Horizon Average-Reward Markov Decision Processes with Constraints
Liyu Chen
R. Jain
Haipeng Luo
57
25
0
31 Jan 2022
Bad-Policy Density: A Measure of Reinforcement Learning Hardness
David Abel
Cameron Allen
Dilip Arumugam
D Ellis Hershkowitz
Michael L. Littman
Lawson L. S. Wong
26
2
0
07 Oct 2021
Understanding Domain Randomization for Sim-to-real Transfer
Xiaoyu Chen
Jiachen Hu
Chi Jin
Lihong Li
Liwei Wang
24
112
0
07 Oct 2021
Concave Utility Reinforcement Learning with Zero-Constraint Violations
Mridul Agarwal
Qinbo Bai
Vaneet Aggarwal
36
12
0
12 Sep 2021
A Survey of Exploration Methods in Reinforcement Learning
Susan Amin
Maziar Gomrokchi
Harsh Satija
H. V. Hoof
Doina Precup
OffRL
21
80
0
01 Sep 2021
UCB Momentum Q-learning: Correcting the bias without forgetting
Pierre Menard
O. D. Domingues
Xuedong Shang
Michal Valko
79
40
0
01 Mar 2021
Online Learning for Unknown Partially Observable MDPs
Mehdi Jafarnia-Jahromi
Rahul Jain
A. Nayyar
31
20
0
25 Feb 2021
Learning Adversarial Markov Decision Processes with Delayed Feedback
Tal Lancewicki
Aviv A. Rosenberg
Yishay Mansour
35
32
0
29 Dec 2020
Reinforcement Learning for Non-Stationary Markov Decision Processes: The Blessing of (More) Optimism
Wang Chi Cheung
D. Simchi-Levi
Ruihao Zhu
OffRL
6
93
0
24 Jun 2020
Tightening Exploration in Upper Confidence Reinforcement Learning
Hippolyte Bourel
Odalric-Ambrym Maillard
M. S. Talebi
22
31
0
20 Apr 2020
Adaptive Approximate Policy Iteration
Botao Hao
N. Lazić
Yasin Abbasi-Yadkori
Pooria Joulani
Csaba Szepesvári
18
14
0
08 Feb 2020
Model-free Reinforcement Learning in Infinite-horizon Average-reward Markov Decision Processes
Chen-Yu Wei
Mehdi Jafarnia-Jahromi
Haipeng Luo
Hiteshi Sharma
R. Jain
107
99
0
15 Oct 2019
Regret Minimization for Reinforcement Learning by Evaluating the Optimal Bias Function
Zihan Zhang
Xiangyang Ji
13
71
0
12 Jun 2019
Non-Stationary Reinforcement Learning: The Blessing of (More) Optimism
Wang Chi Cheung
D. Simchi-Levi
Ruihao Zhu
OffRL
23
7
0
07 Jun 2019
Tight Regret Bounds for Model-Based Reinforcement Learning with Greedy Policies
Yonathan Efroni
Nadav Merlis
Mohammad Ghavamzadeh
Shie Mannor
OffRL
22
68
0
27 May 2019
Tighter Problem-Dependent Regret Bounds in Reinforcement Learning without Domain Knowledge using Value Function Bounds
Andrea Zanette
Emma Brunskill
OffRL
21
272
0
01 Jan 2019
Exploration Bonus for Regret Minimization in Undiscounted Discrete and Continuous Markov Decision Processes
Jian Qian
Ronan Fruit
Matteo Pirotta
A. Lazaric
6
10
0
11 Dec 2018
Regret Bounds for Reinforcement Learning via Markov Chain Concentration
R. Ortner
22
46
0
06 Aug 2018