ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1905.03814
  4. Cited By
Non-Asymptotic Gap-Dependent Regret Bounds for Tabular MDPs

Non-Asymptotic Gap-Dependent Regret Bounds for Tabular MDPs

9 May 2019
Max Simchowitz
Kevin G. Jamieson
ArXivPDFHTML

Papers citing "Non-Asymptotic Gap-Dependent Regret Bounds for Tabular MDPs"

46 / 46 papers shown
Title
Automatic Reward Shaping from Confounded Offline Data
Automatic Reward Shaping from Confounded Offline Data
Mingxuan Li
Junzhe Zhang
Elias Bareinboim
OffRL
OnRL
33
0
0
16 May 2025
Gap-Dependent Bounds for Q-Learning using Reference-Advantage Decomposition
Gap-Dependent Bounds for Q-Learning using Reference-Advantage Decomposition
Zhong Zheng
Haochen Zhang
Lingzhou Xue
OffRL
78
2
0
10 Oct 2024
When Do Skills Help Reinforcement Learning? A Theoretical Analysis of
  Temporal Abstractions
When Do Skills Help Reinforcement Learning? A Theoretical Analysis of Temporal Abstractions
Zhening Li
Gabriel Poesia
Armando Solar-Lezama
OffRL
42
1
0
12 Jun 2024
Reinforcement Learning from Human Feedback without Reward Inference: Model-Free Algorithm and Instance-Dependent Analysis
Reinforcement Learning from Human Feedback without Reward Inference: Model-Free Algorithm and Instance-Dependent Analysis
Qining Zhang
Honghao Wei
Lei Ying
OffRL
67
1
0
11 Jun 2024
Combinatorial Multivariant Multi-Armed Bandits with Applications to Episodic Reinforcement Learning and Beyond
Combinatorial Multivariant Multi-Armed Bandits with Applications to Episodic Reinforcement Learning and Beyond
Xutong Liu
Siwei Wang
Jinhang Zuo
Han Zhong
Xuchuang Wang
Zhiyong Wang
Shuai Li
Mohammad Hajiesmaili
J. C. Lui
Wei Chen
85
1
0
03 Jun 2024
Multiple-policy Evaluation via Density Estimation
Multiple-policy Evaluation via Density Estimation
Yilei Chen
Aldo Pacchiano
I. Paschalidis
OffRL
32
0
0
29 Mar 2024
Reinforcement Learning from Human Feedback with Active Queries
Reinforcement Learning from Human Feedback with Active Queries
Kaixuan Ji
Jiafan He
Quanquan Gu
29
17
0
14 Feb 2024
Settling the Sample Complexity of Online Reinforcement Learning
Settling the Sample Complexity of Online Reinforcement Learning
Zihan Zhang
Yuxin Chen
Jason D. Lee
S. Du
OffRL
98
22
0
25 Jul 2023
Improved Regret Bounds for Linear Adversarial MDPs via Linear
  Optimization
Improved Regret Bounds for Linear Adversarial MDPs via Linear Optimization
Fang-yuan Kong
Xiangcheng Zhang
Baoxiang Wang
Shuai Li
31
12
0
14 Feb 2023
Robust Knowledge Transfer in Tiered Reinforcement Learning
Robust Knowledge Transfer in Tiered Reinforcement Learning
Jiawei Huang
Niao He
OffRL
32
1
0
10 Feb 2023
Offline Minimax Soft-Q-learning Under Realizability and Partial Coverage
Offline Minimax Soft-Q-learning Under Realizability and Partial Coverage
Masatoshi Uehara
Nathan Kallus
Jason D. Lee
Wen Sun
OffRL
52
5
0
05 Feb 2023
Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both
  Worlds in Stochastic and Deterministic Environments
Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both Worlds in Stochastic and Deterministic Environments
Runlong Zhou
Zihan Zhang
S. Du
44
10
0
31 Jan 2023
Nearly Minimax Optimal Reinforcement Learning for Linear Markov Decision
  Processes
Nearly Minimax Optimal Reinforcement Learning for Linear Markov Decision Processes
Jiafan He
Heyang Zhao
Dongruo Zhou
Quanquan Gu
OffRL
53
55
0
12 Dec 2022
On Instance-Dependent Bounds for Offline Reinforcement Learning with
  Linear Function Approximation
On Instance-Dependent Bounds for Offline Reinforcement Learning with Linear Function Approximation
Thanh Nguyen-Tang
Ming Yin
Sunil R. Gupta
Svetha Venkatesh
R. Arora
OffRL
58
16
0
23 Nov 2022
Hardness in Markov Decision Processes: Theory and Practice
Hardness in Markov Decision Processes: Theory and Practice
Michelangelo Conserva
Paulo E. Rauber
39
3
0
24 Oct 2022
Computationally Efficient PAC RL in POMDPs with Latent Determinism and
  Conditional Embeddings
Computationally Efficient PAC RL in POMDPs with Latent Determinism and Conditional Embeddings
Masatoshi Uehara
Ayush Sekhari
Jason D. Lee
Nathan Kallus
Wen Sun
60
6
0
24 Jun 2022
Stabilizing Q-learning with Linear Architectures for Provably Efficient
  Learning
Stabilizing Q-learning with Linear Architectures for Provably Efficient Learning
Andrea Zanette
Martin J. Wainwright
OOD
45
5
0
01 Jun 2022
Logarithmic regret bounds for continuous-time average-reward Markov
  decision processes
Logarithmic regret bounds for continuous-time average-reward Markov decision processes
Xuefeng Gao
X. Zhou
39
8
0
23 May 2022
Provably Efficient Kernelized Q-Learning
Provably Efficient Kernelized Q-Learning
Shuang Liu
H. Su
MLT
29
4
0
21 Apr 2022
Offline Reinforcement Learning Under Value and Density-Ratio
  Realizability: The Power of Gaps
Offline Reinforcement Learning Under Value and Density-Ratio Realizability: The Power of Gaps
Jinglin Chen
Nan Jiang
OffRL
23
34
0
25 Mar 2022
Horizon-Free Reinforcement Learning in Polynomial Time: the Power of
  Stationary Policies
Horizon-Free Reinforcement Learning in Polynomial Time: the Power of Stationary Policies
Zihan Zhang
Xiangyang Ji
S. Du
30
21
0
24 Mar 2022
Instance-Dependent Confidence and Early Stopping for Reinforcement
  Learning
Instance-Dependent Confidence and Early Stopping for Reinforcement Learning
K. Khamaru
Eric Xia
Martin J. Wainwright
Michael I. Jordan
37
5
0
21 Jan 2022
Settling the Horizon-Dependence of Sample Complexity in Reinforcement
  Learning
Settling the Horizon-Dependence of Sample Complexity in Reinforcement Learning
Yuanzhi Li
Ruosong Wang
Lin F. Yang
27
20
0
01 Nov 2021
Adaptive Discretization in Online Reinforcement Learning
Adaptive Discretization in Online Reinforcement Learning
Sean R. Sinclair
Siddhartha Banerjee
Chao Yu
OffRL
45
15
0
29 Oct 2021
Reinforcement Learning in Linear MDPs: Constant Regret and
  Representation Selection
Reinforcement Learning in Linear MDPs: Constant Regret and Representation Selection
Matteo Papini
Andrea Tirinzoni
Aldo Pacchiano
Marcello Restelli
A. Lazaric
Matteo Pirotta
19
18
0
27 Oct 2021
Reinforcement Learning in Reward-Mixing MDPs
Reinforcement Learning in Reward-Mixing MDPs
Jeongyeol Kwon
Yonathan Efroni
Constantine Caramanis
Shie Mannor
32
15
0
07 Oct 2021
Gap-Dependent Unsupervised Exploration for Reinforcement Learning
Gap-Dependent Unsupervised Exploration for Reinforcement Learning
Jingfeng Wu
Vladimir Braverman
Lin F. Yang
33
12
0
11 Aug 2021
Provably Efficient Representation Selection in Low-rank Markov Decision
  Processes: From Online to Offline RL
Provably Efficient Representation Selection in Low-rank Markov Decision Processes: From Online to Offline RL
Weitong Zhang
Jiafan He
Dongruo Zhou
Amy Zhang
Quanquan Gu
OffRL
24
11
0
22 Jun 2021
The best of both worlds: stochastic and adversarial episodic MDPs with
  unknown transition
The best of both worlds: stochastic and adversarial episodic MDPs with unknown transition
Tiancheng Jin
Longbo Huang
Haipeng Luo
27
40
0
08 Jun 2021
Sample-Efficient Reinforcement Learning Is Feasible for Linearly
  Realizable MDPs with Limited Revisiting
Sample-Efficient Reinforcement Learning Is Feasible for Linearly Realizable MDPs with Limited Revisiting
Gen Li
Yuxin Chen
Yuejie Chi
Yuantao Gu
Yuting Wei
OffRL
26
28
0
17 May 2021
Stochastic Shortest Path: Minimax, Parameter-Free and Towards
  Horizon-Free Regret
Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret
Jean Tarbouriech
Runlong Zhou
S. Du
Matteo Pirotta
M. Valko
A. Lazaric
65
35
0
22 Apr 2021
An Exponential Lower Bound for Linearly-Realizable MDPs with Constant
  Suboptimality Gap
An Exponential Lower Bound for Linearly-Realizable MDPs with Constant Suboptimality Gap
Yuanhao Wang
Ruosong Wang
Sham Kakade
OffRL
41
43
0
23 Mar 2021
Improved Corruption Robust Algorithms for Episodic Reinforcement
  Learning
Improved Corruption Robust Algorithms for Episodic Reinforcement Learning
Yifang Chen
S. Du
Kevin G. Jamieson
24
22
0
13 Feb 2021
Confidence-Budget Matching for Sequential Budgeted Learning
Confidence-Budget Matching for Sequential Budgeted Learning
Yonathan Efroni
Nadav Merlis
Aadirupa Saha
Shie Mannor
34
10
0
05 Feb 2021
Fast Rates for the Regret of Offline Reinforcement Learning
Fast Rates for the Regret of Offline Reinforcement Learning
Yichun Hu
Nathan Kallus
Masatoshi Uehara
OffRL
24
30
0
31 Jan 2021
Learning Adversarial Markov Decision Processes with Delayed Feedback
Learning Adversarial Markov Decision Processes with Delayed Feedback
Tal Lancewicki
Aviv A. Rosenberg
Yishay Mansour
43
32
0
29 Dec 2020
Nearly Minimax Optimal Reinforcement Learning for Discounted MDPs
Nearly Minimax Optimal Reinforcement Learning for Discounted MDPs
Jiafan He
Dongruo Zhou
Quanquan Gu
23
37
0
01 Oct 2020
Is Reinforcement Learning More Difficult Than Bandits? A Near-optimal
  Algorithm Escaping the Curse of Horizon
Is Reinforcement Learning More Difficult Than Bandits? A Near-optimal Algorithm Escaping the Curse of Horizon
Zihan Zhang
Xiangyang Ji
S. Du
OffRL
34
104
0
28 Sep 2020
Adaptive Discretization for Model-Based Reinforcement Learning
Adaptive Discretization for Model-Based Reinforcement Learning
Sean R. Sinclair
Tianyu Wang
Gauri Jain
Siddhartha Banerjee
Chao Yu
OffRL
19
21
0
01 Jul 2020
$Q$-learning with Logarithmic Regret
QQQ-learning with Logarithmic Regret
Kunhe Yang
Lin F. Yang
S. Du
43
59
0
16 Jun 2020
Towards Understanding Cooperative Multi-Agent Q-Learning with Value
  Factorization
Towards Understanding Cooperative Multi-Agent Q-Learning with Value Factorization
Jianhao Wang
Zhizhou Ren
Beining Han
Jianing Ye
Chongjie Zhang
OffRL
31
32
0
31 May 2020
Learning Near Optimal Policies with Low Inherent Bellman Error
Learning Near Optimal Policies with Low Inherent Bellman Error
Andrea Zanette
A. Lazaric
Mykel Kochenderfer
Emma Brunskill
OffRL
21
221
0
29 Feb 2020
Learning Zero-Sum Simultaneous-Move Markov Games Using Function
  Approximation and Correlated Equilibrium
Learning Zero-Sum Simultaneous-Move Markov Games Using Function Approximation and Correlated Equilibrium
Qiaomin Xie
Yudong Chen
Zhaoran Wang
Zhuoran Yang
41
124
0
17 Feb 2020
Reward-Free Exploration for Reinforcement Learning
Reward-Free Exploration for Reinforcement Learning
Chi Jin
A. Krishnamurthy
Max Simchowitz
Tiancheng Yu
OffRL
112
194
0
07 Feb 2020
Optimism in Reinforcement Learning with Generalized Linear Function
  Approximation
Optimism in Reinforcement Learning with Generalized Linear Function Approximation
Yining Wang
Ruosong Wang
S. Du
A. Krishnamurthy
137
135
0
09 Dec 2019
Is a Good Representation Sufficient for Sample Efficient Reinforcement
  Learning?
Is a Good Representation Sufficient for Sample Efficient Reinforcement Learning?
S. Du
Sham Kakade
Ruosong Wang
Lin F. Yang
47
192
0
07 Oct 2019
1