ResearchTrend.AI
Tighter Problem-Dependent Regret Bounds in Reinforcement Learning without Domain Knowledge using Value Function Bounds

1 January 2019
Andrea Zanette
Emma Brunskill
    OffRL

Papers citing "Tighter Problem-Dependent Regret Bounds in Reinforcement Learning without Domain Knowledge using Value Function Bounds"

50 / 216 papers shown
Nearly Minimax Optimal Reinforcement Learning for Linear Mixture Markov Decision Processes
Dongruo Zhou
Quanquan Gu
Csaba Szepesvári
19
203
0
15 Dec 2020
Minimax Regret for Stochastic Shortest Path with Adversarial Costs and Known Transition
Liyu Chen
Haipeng Luo
Chen-Yu Wei
26
32
0
07 Dec 2020
Accommodating Picky Customers: Regret Bound and Exploration Complexity for Multi-Objective Reinforcement Learning
Jingfeng Wu
Vladimir Braverman
Lin F. Yang
17
10
0
25 Nov 2020
Logarithmic Regret for Reinforcement Learning with Linear Function Approximation
Jiafan He
Dongruo Zhou
Quanquan Gu
17
92
0
23 Nov 2020
Value Function Approximations via Kernel Embeddings for No-Regret Reinforcement Learning
Sayak Ray Chowdhury
Rafael Oliveira
OffRL
17
3
0
16 Nov 2020
Experimental Design for Regret Minimization in Linear Bandits
Andrew Wagenmaker
Julian Katz-Samuels
Kevin G. Jamieson
9
16
0
01 Nov 2020
Improved Worst-Case Regret Bounds for Randomized Least-Squares Value Iteration
Priyank Agrawal
Jinglin Chen
Nan Jiang
27
18
0
23 Oct 2020
Local Differential Privacy for Regret Minimization in Reinforcement Learning
Evrard Garcelon
Vianney Perchet
Ciara Pike-Burke
Matteo Pirotta
26
32
0
15 Oct 2020
Nearly Minimax Optimal Reward-free Reinforcement Learning
Zihan Zhang
S. Du
Xiangyang Ji
OffRL
17
31
0
12 Oct 2020
Episodic Reinforcement Learning in Finite MDPs: Minimax Lower Bounds Revisited
O. D. Domingues
Pierre Ménard
E. Kaufmann
Michal Valko
8
96
0
07 Oct 2020
Nearly Minimax Optimal Reinforcement Learning for Discounted MDPs
Jiafan He
Dongruo Zhou
Quanquan Gu
16
37
0
01 Oct 2020
Is Reinforcement Learning More Difficult Than Bandits? A Near-optimal Algorithm Escaping the Curse of Horizon
Zihan Zhang
Xiangyang Ji
S. Du
OffRL
12
103
0
28 Sep 2020
A Sample-Efficient Algorithm for Episodic Finite-Horizon MDP with Constraints
K. C. Kalagarla
Rahul Jain
Pierluigi Nuzzo
28
52
0
23 Sep 2020
Oracle-Efficient Regret Minimization in Factored MDPs with Unknown Structure
Aviv A. Rosenberg
Yishay Mansour
15
11
0
13 Sep 2020
Improved Exploration in Factored Average-Reward MDPs
M. S. Talebi
Anders Jonsson
Odalric-Ambrym Maillard
9
8
0
09 Sep 2020
Provably Efficient Reward-Agnostic Navigation with Linear Value Iteration
Andrea Zanette
A. Lazaric
Mykel J. Kochenderfer
Emma Brunskill
24
64
0
18 Aug 2020
Reinforcement Learning with Trajectory Feedback
Yonathan Efroni
Nadav Merlis
Shie Mannor
14
41
0
13 Aug 2020
Fast active learning for pure exploration in reinforcement learning
Pierre Ménard
O. D. Domingues
Anders Jonsson
E. Kaufmann
Edouard Leurent
Michal Valko
6
94
0
27 Jul 2020
A Provably Efficient Sample Collection Strategy for Reinforcement Learning
Jean Tarbouriech
Matteo Pirotta
Michal Valko
A. Lazaric
OffRL
25
16
0
13 Jul 2020
A Kernel-Based Approach to Non-Stationary Reinforcement Learning in Metric Spaces
O. D. Domingues
Pierre Ménard
Matteo Pirotta
E. Kaufmann
Michal Valko
25
40
0
09 Jul 2020
A Unifying View of Optimism in Episodic Reinforcement Learning
Gergely Neu
Ciara Pike-Burke
6
66
0
03 Jul 2020
Adaptive Discretization for Model-Based Reinforcement Learning
Sean R. Sinclair
Tianyu Wang
Gauri Jain
Siddhartha Banerjee
Chao Yu
OffRL
19
21
0
01 Jul 2020
Model-based Reinforcement Learning: A Survey
Thomas M. Moerland
Joost Broekens
Aske Plaat
Catholijn M. Jonker
OffRL
33
47
0
30 Jun 2020
Lookahead-Bounded Q-Learning
Ibrahim El Shar
Daniel R. Jiang
VLM
16
8
0
28 Jun 2020
Towards Minimax Optimal Reinforcement Learning in Factored Markov Decision Processes
Yi Tian
Jian Qian
S. Sra
8
25
0
24 Jun 2020
Stochastic Shortest Path with Adversarially Changing Costs
Aviv A. Rosenberg
Yishay Mansour
AAML
24
33
0
20 Jun 2020
Provably adaptive reinforcement learning in metric spaces
Tongyi Cao
A. Krishnamurthy
17
7
0
18 Jun 2020
Task-agnostic Exploration in Reinforcement Learning
Xuezhou Zhang
Yuzhe Ma
Adish Singla
OffRL
20
49
0
16 Jun 2020
$Q$-learning with Logarithmic Regret
Kunhe Yang
Lin F. Yang
S. Du
43
59
0
16 Jun 2020
Preference-based Reinforcement Learning with Finite-Time Guarantees
Yichong Xu
Ruosong Wang
Lin F. Yang
Aarti Singh
A. Dubrawski
23
53
0
16 Jun 2020
Efficient Model-Based Reinforcement Learning through Optimistic Policy Search and Planning
Sebastian Curi
Felix Berkenkamp
Andreas Krause
33
82
0
15 Jun 2020
Adaptive Reward-Free Exploration
E. Kaufmann
Pierre Ménard
O. D. Domingues
Anders Jonsson
Edouard Leurent
Michal Valko
30
80
0
11 Jun 2020
Planning in Markov Decision Processes with Gap-Dependent Sample Complexity
Anders Jonsson
E. Kaufmann
Pierre Ménard
O. D. Domingues
Edouard Leurent
Michal Valko
9
31
0
10 Jun 2020
A Model-free Learning Algorithm for Infinite-horizon Average-reward MDPs with Near-optimal Regret
Mehdi Jafarnia-Jahromi
Chen-Yu Wei
Rahul Jain
Haipeng Luo
20
7
0
08 Jun 2020
MOPO: Model-based Offline Policy Optimization
Tianhe Yu
G. Thomas
Lantao Yu
Stefano Ermon
James Zou
Sergey Levine
Chelsea Finn
Tengyu Ma
OffRL
25
754
0
27 May 2020
Reinforcement Learning with General Value Function Approximation: Provably Efficient Approach via Bounded Eluder Dimension
Ruosong Wang
Ruslan Salakhutdinov
Lin F. Yang
23
55
0
21 May 2020
Reinforcement Learning with Feedback Graphs
Christoph Dann
Yishay Mansour
M. Mohri
Ayush Sekhari
Karthik Sridharan
14
9
0
07 May 2020
Is Long Horizon Reinforcement Learning More Difficult Than Short Horizon Reinforcement Learning?
Ruosong Wang
S. Du
Lin F. Yang
Sham Kakade
OffRL
10
52
0
01 May 2020
Almost Optimal Model-Free Reinforcement Learning via Reference-Advantage Decomposition
Zihan Zhang
Yuanshuo Zhou
Xiangyang Ji
OffRL
8
155
0
21 Apr 2020
Tightening Exploration in Upper Confidence Reinforcement Learning
Hippolyte Bourel
Odalric-Ambrym Maillard
M. S. Talebi
22
31
0
20 Apr 2020
Kernel-Based Reinforcement Learning: A Finite-Time Analysis
O. D. Domingues
Pierre Ménard
Matteo Pirotta
E. Kaufmann
Michal Valko
9
18
0
12 Apr 2020
Is Temporal Difference Learning Optimal? An Instance-Dependent Analysis
K. Khamaru
A. Pananjady
Feng Ruan
Martin J. Wainwright
Michael I. Jordan
OffRL
11
47
0
16 Mar 2020
Provably Efficient Exploration for Reinforcement Learning Using Unsupervised Learning
Fei Feng
Ruosong Wang
W. Yin
S. Du
Lin F. Yang
OffRL
SSL
30
7
0
15 Mar 2020
Exploration-Exploitation in Constrained MDPs
Yonathan Efroni
Shie Mannor
Matteo Pirotta
33
169
0
04 Mar 2020
Learning Near Optimal Policies with Low Inherent Bellman Error
Andrea Zanette
A. Lazaric
Mykel Kochenderfer
Emma Brunskill
OffRL
17
221
0
29 Feb 2020
Near-optimal Regret Bounds for Stochastic Shortest Path
Alon Cohen
Haim Kaplan
Yishay Mansour
Aviv A. Rosenberg
12
53
0
23 Feb 2020
Optimistic Policy Optimization with Bandit Feedback
Yonathan Efroni
Lior Shani
Aviv A. Rosenberg
Shie Mannor
13
90
0
19 Feb 2020
Learning Zero-Sum Simultaneous-Move Markov Games Using Function Approximation and Correlated Equilibrium
Qiaomin Xie
Yudong Chen
Zhaoran Wang
Zhuoran Yang
39
124
0
17 Feb 2020
Regret Bounds for Discounted MDPs
Shuang Liu
H. Su
OffRL
8
19
0
12 Feb 2020
Provable Self-Play Algorithms for Competitive Reinforcement Learning
Yu Bai
Chi Jin
SSL
22
148
0
10 Feb 2020