1906.04697
Cited By
Variance-reduced $Q$-learning is minimax optimal
11 June 2019
Martin J. Wainwright
OffRL
Papers citing
"Variance-reduced $Q$-learning is minimax optimal"
29 / 29 papers shown
Title
Efficiently Solving Discounted MDPs with Predictions on Transition Matrices
Lixing Lyu
Jiashuo Jiang
Wang Chi Cheung
44
1
0
24 Feb 2025
Gap-Dependent Bounds for Q-Learning using Reference-Advantage Decomposition
Zhong Zheng
Haochen Zhang
Lingzhou Xue
OffRL
78
2
0
10 Oct 2024
Stochastic Halpern iteration in normed spaces and applications to reinforcement learning
Mario Bravo
Juan Pablo Contreras
48
3
0
19 Mar 2024
Span-Based Optimal Sample Complexity for Average Reward MDPs
M. Zurek
Yudong Chen
41
7
0
22 Nov 2023
Settling the Sample Complexity of Online Reinforcement Learning
Zihan Zhang
Yuxin Chen
Jason D. Lee
S. Du
OffRL
98
22
0
25 Jul 2023
Regret-Optimal Model-Free Reinforcement Learning for Discounted MDPs with Short Burn-In Time
Xiang Ji
Gen Li
OffRL
37
7
0
24 May 2023
A Finite Sample Complexity Bound for Distributionally Robust Q-learning
Shengbo Wang
Nian Si
Jose H. Blanchet
Zhengyuan Zhou
OOD
OffRL
48
24
0
26 Feb 2023
Kernel-Based Distributed Q-Learning: A Scalable Reinforcement Learning Approach for Dynamic Treatment Regimes
Di Wang
Yao Wang
Shaojie Tang
OffRL
26
1
0
21 Feb 2023
Near Sample-Optimal Reduction-based Policy Learning for Average Reward MDP
Jinghan Wang
Meng-Xian Wang
Lin F. Yang
37
16
0
01 Dec 2022
Minimax-Optimal Multi-Agent RL in Markov Games With a Generative Model
Gen Li
Yuejie Chi
Yuting Wei
Yuxin Chen
37
18
0
22 Aug 2022
Algorithm for Constrained Markov Decision Process with Linear Convergence
E. Gladin
Maksim Lavrik-Karmazin
K. Zainullina
Varvara Rudenko
Alexander V. Gasnikov
Martin Takáč
38
6
0
03 Jun 2022
Stabilizing Q-learning with Linear Architectures for Provably Efficient Learning
Andrea Zanette
Martin J. Wainwright
OOD
45
5
0
01 Jun 2022
The Efficacy of Pessimism in Asynchronous Q-Learning
Yuling Yan
Gen Li
Yuxin Chen
Jianqing Fan
OffRL
78
40
0
14 Mar 2022
Target Network and Truncation Overcome The Deadly Triad in $Q$-Learning
Zaiwei Chen
John-Paul Clarke
S. T. Maguluri
28
19
0
05 Mar 2022
Instance-Dependent Confidence and Early Stopping for Reinforcement Learning
K. Khamaru
Eric Xia
Martin J. Wainwright
Michael I. Jordan
37
5
0
21 Jan 2022
A Statistical Analysis of Polyak-Ruppert Averaged Q-learning
Xiang Li
Wenhao Yang
Jiadong Liang
Zhihua Zhang
Michael I. Jordan
48
15
0
29 Dec 2021
Accelerated and instance-optimal policy evaluation with linear function approximation
Tianjiao Li
Guanghui Lan
A. Pananjady
OffRL
44
13
0
24 Dec 2021
Quantum Algorithms for Reinforcement Learning with a Generative Model
Daochen Wang
Aarthi Sundaram
Robin Kothari
Ashish Kapoor
M. Rötteler
37
27
0
15 Dec 2021
Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free Reinforcement Learning
Gen Li
Laixi Shi
Yuxin Chen
Yuejie Chi
OffRL
49
51
0
09 Oct 2021
Online Robust Reinforcement Learning with Model Uncertainty
Yue Wang
Shaofeng Zou
OOD
OffRL
76
97
0
29 Sep 2021
Concentration of Contractive Stochastic Approximation and Reinforcement Learning
Siddharth Chandak
Vivek Borkar
Parth Dodhia
48
17
0
27 Jun 2021
Greedy-GQ with Variance Reduction: Finite-time Analysis and Improved Complexity
Shaocong Ma
Ziyi Chen
Yi Zhou
Shaofeng Zou
17
11
0
30 Mar 2021
Is Q-Learning Minimax Optimal? A Tight Sample Complexity Analysis
Gen Li
Changxiao Cai
Yuxin Chen
Yuting Wei
Yuejie Chi
OffRL
55
75
0
12 Feb 2021
Optimal oracle inequalities for solving projected fixed-point equations
Wenlong Mou
A. Pananjady
Martin J. Wainwright
29
14
0
09 Dec 2020
Nearly Minimax Optimal Reinforcement Learning for Discounted MDPs
Jiafan He
Dongruo Zhou
Quanquan Gu
23
37
0
01 Oct 2020
Breaking the Sample Size Barrier in Model-Based Reinforcement Learning with a Generative Model
Gen Li
Yuting Wei
Yuejie Chi
Yuxin Chen
39
125
0
26 May 2020
On Linear Stochastic Approximation: Fine-grained Polyak-Ruppert and Non-Asymptotic Concentration
Wenlong Mou
C. J. Li
Martin J. Wainwright
Peter L. Bartlett
Michael I. Jordan
33
75
0
09 Apr 2020
Is Temporal Difference Learning Optimal? An Instance-Dependent Analysis
K. Khamaru
A. Pananjady
Feng Ruan
Martin J. Wainwright
Michael I. Jordan
OffRL
27
47
0
16 Mar 2020
Provably Efficient Reinforcement Learning with Linear Function Approximation
Chi Jin
Zhuoran Yang
Zhaoran Wang
Michael I. Jordan
52
541
0
11 Jul 2019