ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1802.04020
  4. Cited By
Efficient Bias-Span-Constrained Exploration-Exploitation in
  Reinforcement Learning

Efficient Bias-Span-Constrained Exploration-Exploitation in Reinforcement Learning

12 February 2018
Ronan Fruit
Matteo Pirotta
A. Lazaric
R. Ortner
ArXivPDFHTML

Papers citing "Efficient Bias-Span-Constrained Exploration-Exploitation in Reinforcement Learning"

32 / 32 papers shown
Title
Gap-Dependent Bounds for Q-Learning using Reference-Advantage Decomposition
Gap-Dependent Bounds for Q-Learning using Reference-Advantage Decomposition
Zhong Zheng
Haochen Zhang
Lingzhou Xue
OffRL
78
2
0
10 Oct 2024
Optimistic Q-learning for average reward and episodic reinforcement learning
Optimistic Q-learning for average reward and episodic reinforcement learning
Priyank Agrawal
Shipra Agrawal
56
4
0
18 Jul 2024
Dealing with unbounded gradients in stochastic saddle-point optimization
Dealing with unbounded gradients in stochastic saddle-point optimization
Gergely Neu
Nneka Okolo
37
3
0
21 Feb 2024
Regret Analysis of Policy Gradient Algorithm for Infinite Horizon
  Average Reward Markov Decision Processes
Regret Analysis of Policy Gradient Algorithm for Infinite Horizon Average Reward Markov Decision Processes
Qinbo Bai
Washim Uddin Mondal
Vaneet Aggarwal
34
9
0
05 Sep 2023
Learning Optimal Admission Control in Partially Observable Queueing
  Networks
Learning Optimal Admission Control in Partially Observable Queueing Networks
Jonatha Anselmi
B. Gaujal
Louis-Sébastien Rebuffi
34
1
0
04 Aug 2023
Settling the Sample Complexity of Online Reinforcement Learning
Settling the Sample Complexity of Online Reinforcement Learning
Zihan Zhang
Yuxin Chen
Jason D. Lee
S. Du
OffRL
98
22
0
25 Jul 2023
Reinforcement Learning in a Birth and Death Process: Breaking the
  Dependence on the State Space
Reinforcement Learning in a Birth and Death Process: Breaking the Dependence on the State Space
Jonatha Anselmi
B. Gaujal
Louis-Sébastien Rebuffi
27
2
0
21 Feb 2023
Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both
  Worlds in Stochastic and Deterministic Environments
Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both Worlds in Stochastic and Deterministic Environments
Runlong Zhou
Zihan Zhang
S. Du
44
10
0
31 Jan 2023
Improved Regret for Efficient Online Reinforcement Learning with Linear
  Function Approximation
Improved Regret for Efficient Online Reinforcement Learning with Linear Function Approximation
Uri Sherman
Tomer Koren
Yishay Mansour
32
12
0
30 Jan 2023
Logarithmic regret bounds for continuous-time average-reward Markov
  decision processes
Logarithmic regret bounds for continuous-time average-reward Markov decision processes
Xuefeng Gao
X. Zhou
39
8
0
23 May 2022
From Dirichlet to Rubin: Optimistic Exploration in RL without Bonuses
From Dirichlet to Rubin: Optimistic Exploration in RL without Bonuses
D. Tiapkin
Denis Belomestny
Eric Moulines
A. Naumov
S. Samsonov
Yunhao Tang
Michal Valko
Pierre Menard
31
17
0
16 May 2022
Learning Infinite-Horizon Average-Reward Markov Decision Processes with
  Constraints
Learning Infinite-Horizon Average-Reward Markov Decision Processes with Constraints
Liyu Chen
R. Jain
Haipeng Luo
57
25
0
31 Jan 2022
Bad-Policy Density: A Measure of Reinforcement Learning Hardness
Bad-Policy Density: A Measure of Reinforcement Learning Hardness
David Abel
Cameron Allen
Dilip Arumugam
D Ellis Hershkowitz
Michael L. Littman
Lawson L. S. Wong
26
2
0
07 Oct 2021
Understanding Domain Randomization for Sim-to-real Transfer
Understanding Domain Randomization for Sim-to-real Transfer
Xiaoyu Chen
Jiachen Hu
Chi Jin
Lihong Li
Liwei Wang
24
112
0
07 Oct 2021
Concave Utility Reinforcement Learning with Zero-Constraint Violations
Concave Utility Reinforcement Learning with Zero-Constraint Violations
Mridul Agarwal
Qinbo Bai
Vaneet Aggarwal
36
12
0
12 Sep 2021
A Survey of Exploration Methods in Reinforcement Learning
A Survey of Exploration Methods in Reinforcement Learning
Susan Amin
Maziar Gomrokchi
Harsh Satija
H. V. Hoof
Doina Precup
OffRL
32
80
0
01 Sep 2021
Policy Optimization in Adversarial MDPs: Improved Exploration via
  Dilated Bonuses
Policy Optimization in Adversarial MDPs: Improved Exploration via Dilated Bonuses
Haipeng Luo
Chen-Yu Wei
Chung-Wei Lee
38
44
0
18 Jul 2021
Stochastic Shortest Path: Minimax, Parameter-Free and Towards
  Horizon-Free Regret
Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret
Jean Tarbouriech
Runlong Zhou
S. Du
Matteo Pirotta
M. Valko
A. Lazaric
57
35
0
22 Apr 2021
UCB Momentum Q-learning: Correcting the bias without forgetting
UCB Momentum Q-learning: Correcting the bias without forgetting
Pierre Menard
O. D. Domingues
Xuedong Shang
Michal Valko
79
41
0
01 Mar 2021
Online Learning for Unknown Partially Observable MDPs
Online Learning for Unknown Partially Observable MDPs
Mehdi Jafarnia-Jahromi
Rahul Jain
A. Nayyar
34
20
0
25 Feb 2021
Learning Adversarial Markov Decision Processes with Delayed Feedback
Learning Adversarial Markov Decision Processes with Delayed Feedback
Tal Lancewicki
Aviv A. Rosenberg
Yishay Mansour
38
32
0
29 Dec 2020
A Provably Efficient Sample Collection Strategy for Reinforcement
  Learning
A Provably Efficient Sample Collection Strategy for Reinforcement Learning
Jean Tarbouriech
Matteo Pirotta
Michal Valko
A. Lazaric
OffRL
25
16
0
13 Jul 2020
Reinforcement Learning for Non-Stationary Markov Decision Processes: The
  Blessing of (More) Optimism
Reinforcement Learning for Non-Stationary Markov Decision Processes: The Blessing of (More) Optimism
Wang Chi Cheung
D. Simchi-Levi
Ruihao Zhu
OffRL
16
93
0
24 Jun 2020
Tightening Exploration in Upper Confidence Reinforcement Learning
Tightening Exploration in Upper Confidence Reinforcement Learning
Hippolyte Bourel
Odalric-Ambrym Maillard
M. S. Talebi
22
31
0
20 Apr 2020
Adaptive Approximate Policy Iteration
Adaptive Approximate Policy Iteration
Botao Hao
N. Lazić
Yasin Abbasi-Yadkori
Pooria Joulani
Csaba Szepesvári
18
14
0
08 Feb 2020
Model-free Reinforcement Learning in Infinite-horizon Average-reward
  Markov Decision Processes
Model-free Reinforcement Learning in Infinite-horizon Average-reward Markov Decision Processes
Chen-Yu Wei
Mehdi Jafarnia-Jahromi
Haipeng Luo
Hiteshi Sharma
R. Jain
107
100
0
15 Oct 2019
Regret Minimization for Reinforcement Learning by Evaluating the Optimal
  Bias Function
Regret Minimization for Reinforcement Learning by Evaluating the Optimal Bias Function
Zihan Zhang
Xiangyang Ji
13
71
0
12 Jun 2019
Non-Stationary Reinforcement Learning: The Blessing of (More) Optimism
Non-Stationary Reinforcement Learning: The Blessing of (More) Optimism
Wang Chi Cheung
D. Simchi-Levi
Ruihao Zhu
OffRL
23
7
0
07 Jun 2019
Tight Regret Bounds for Model-Based Reinforcement Learning with Greedy
  Policies
Tight Regret Bounds for Model-Based Reinforcement Learning with Greedy Policies
Yonathan Efroni
Nadav Merlis
Mohammad Ghavamzadeh
Shie Mannor
OffRL
22
68
0
27 May 2019
Tighter Problem-Dependent Regret Bounds in Reinforcement Learning
  without Domain Knowledge using Value Function Bounds
Tighter Problem-Dependent Regret Bounds in Reinforcement Learning without Domain Knowledge using Value Function Bounds
Andrea Zanette
Emma Brunskill
OffRL
30
273
0
01 Jan 2019
Exploration Bonus for Regret Minimization in Undiscounted Discrete and
  Continuous Markov Decision Processes
Exploration Bonus for Regret Minimization in Undiscounted Discrete and Continuous Markov Decision Processes
Jian Qian
Ronan Fruit
Matteo Pirotta
A. Lazaric
14
10
0
11 Dec 2018
Regret Bounds for Reinforcement Learning via Markov Chain Concentration
Regret Bounds for Reinforcement Learning via Markov Chain Concentration
R. Ortner
22
46
0
06 Aug 2018
1