Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2004.10019
Cited By
Almost Optimal Model-Free Reinforcement Learning via Reference-Advantage Decomposition
21 April 2020
Zihan Zhang
Yuanshuo Zhou
Xiangyang Ji
OffRL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Almost Optimal Model-Free Reinforcement Learning via Reference-Advantage Decomposition"
50 / 52 papers shown
Title
Gap-Dependent Bounds for Q-Learning using Reference-Advantage Decomposition
Zhong Zheng
Haochen Zhang
Lingzhou Xue
OffRL
78
2
0
10 Oct 2024
Distributionally Robust Reinforcement Learning with Interactive Data Collection: Fundamental Hardness and Near-Optimal Algorithm
Miao Lu
Han Zhong
Tong Zhang
Jose H. Blanchet
OffRL
OOD
79
6
0
04 Apr 2024
Horizon-Free Regret for Linear Markov Decision Processes
Zihan Zhang
Jason D. Lee
Yuxin Chen
Simon S. Du
33
3
0
15 Mar 2024
Batched Nonparametric Contextual Bandits
Rong Jiang
Cong Ma
OffRL
39
1
0
27 Feb 2024
Multi-Agent Probabilistic Ensembles with Trajectory Sampling for Connected Autonomous Vehicles
Ruoqi Wen
Jiahao Huang
Rongpeng Li
Guoru Ding
Zhifeng Zhao
37
1
0
21 Dec 2023
Settling the Sample Complexity of Online Reinforcement Learning
Zihan Zhang
Yuxin Chen
Jason D. Lee
S. Du
OffRL
98
22
0
25 Jul 2023
Policy Finetuning in Reinforcement Learning via Design of Experiments using Offline Data
Ruiqi Zhang
Andrea Zanette
OffRL
OnRL
42
7
0
10 Jul 2023
The RL Perceptron: Generalisation Dynamics of Policy Learning in High Dimensions
Nishil Patel
Sebastian Lee
Stefano Sarao Mannelli
Sebastian Goldt
Adrew Saxe
OffRL
38
3
0
17 Jun 2023
Regret-Optimal Model-Free Reinforcement Learning for Discounted MDPs with Short Burn-In Time
Xiang Ji
Gen Li
OffRL
32
7
0
24 May 2023
Fast Rates for Maximum Entropy Exploration
D. Tiapkin
Denis Belomestny
Daniele Calandriello
Eric Moulines
Rémi Munos
A. Naumov
Pierre Perrault
Yunhao Tang
Michal Valko
Pierre Menard
46
18
0
14 Mar 2023
Provably Efficient Reinforcement Learning via Surprise Bound
Hanlin Zhu
Ruosong Wang
Jason D. Lee
OffRL
28
5
0
22 Feb 2023
Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both Worlds in Stochastic and Deterministic Environments
Runlong Zhou
Zihan Zhang
S. Du
44
10
0
31 Jan 2023
Understanding the Complexity Gains of Single-Task RL with a Curriculum
Qiyang Li
Yuexiang Zhai
Yi Ma
Sergey Levine
37
14
0
24 Dec 2022
Nearly Minimax Optimal Reinforcement Learning for Linear Markov Decision Processes
Jiafan He
Heyang Zhao
Dongruo Zhou
Quanquan Gu
OffRL
51
53
0
12 Dec 2022
A Unified Algorithm for Stochastic Path Problems
Christoph Dann
Chen-Yu Wei
Julian Zimmert
35
0
0
17 Oct 2022
Near-Optimal Regret Bounds for Multi-batch Reinforcement Learning
Zihan Zhang
Yuhang Jiang
Yuanshuo Zhou
Xiangyang Ji
OffRL
26
9
0
15 Oct 2022
Multi-armed Bandit Learning on a Graph
Tianpeng Zhang
Kasper Johansson
Na Li
33
6
0
20 Sep 2022
Stabilizing Q-learning with Linear Architectures for Provably Efficient Learning
Andrea Zanette
Martin J. Wainwright
OOD
40
5
0
01 Jun 2022
Byzantine-Robust Online and Offline Distributed Reinforcement Learning
Yiding Chen
Xuezhou Zhang
Kaipeng Zhang
Mengdi Wang
Xiaojin Zhu
OffRL
26
16
0
01 Jun 2022
From Dirichlet to Rubin: Optimistic Exploration in RL without Bonuses
D. Tiapkin
Denis Belomestny
Eric Moulines
A. Naumov
S. Samsonov
Yunhao Tang
Michal Valko
Pierre Menard
31
17
0
16 May 2022
Provably Efficient Kernelized Q-Learning
Shuang Liu
H. Su
MLT
27
4
0
21 Apr 2022
Jump-Start Reinforcement Learning
Ikechukwu Uchendu
Ted Xiao
Yao Lu
Banghua Zhu
Mengyuan Yan
...
Chuyuan Fu
Cong Ma
Jiantao Jiao
Sergey Levine
Karol Hausman
OffRL
OnRL
44
109
0
05 Apr 2022
Horizon-Free Reinforcement Learning in Polynomial Time: the Power of Stationary Policies
Zihan Zhang
Xiangyang Ji
S. Du
30
21
0
24 Mar 2022
The Efficacy of Pessimism in Asynchronous Q-Learning
Yuling Yan
Gen Li
Yuxin Chen
Jianqing Fan
OffRL
78
40
0
14 Mar 2022
Branching Reinforcement Learning
Yihan Du
Wei Chen
27
0
0
16 Feb 2022
Sample-Efficient Reinforcement Learning with loglog(T) Switching Cost
Dan Qiao
Ming Yin
Ming Min
Yu Wang
43
28
0
13 Feb 2022
Nearly Optimal Policy Optimization with Stable at Any Time Guarantee
Tianhao Wu
Yunchang Yang
Han Zhong
Liwei Wang
S. Du
Jiantao Jiao
55
14
0
21 Dec 2021
Decentralized Cooperative Reinforcement Learning with Hierarchical Information Structure
Hsu Kao
Chen-Yu Wei
V. Subramanian
23
12
0
01 Nov 2021
Settling the Horizon-Dependence of Sample Complexity in Reinforcement Learning
Yuanzhi Li
Ruosong Wang
Lin F. Yang
27
20
0
01 Nov 2021
Learning Stochastic Shortest Path with Linear Function Approximation
Steffen Czolbe
Jiafan He
Adrian Dalca
Quanquan Gu
41
30
0
25 Oct 2021
On Improving Model-Free Algorithms for Decentralized Multi-Agent Reinforcement Learning
Weichao Mao
Lin F. Yang
Kaipeng Zhang
Tamer Bacsar
43
57
0
12 Oct 2021
Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free Reinforcement Learning
Gen Li
Laixi Shi
Yuxin Chen
Yuejie Chi
OffRL
47
51
0
09 Oct 2021
On the Complexity of Computing Markov Perfect Equilibrium in General-Sum Stochastic Games
Xiaotie Deng
Ningyuan Li
D. Mguni
Jun Wang
Yaodong Yang
23
46
0
04 Sep 2021
The Power of Exploiter: Provable Multi-Agent RL in Large State Spaces
Chi Jin
Qinghua Liu
Tiancheng Yu
26
50
0
07 Jun 2021
Sublinear Least-Squares Value Iteration via Locality Sensitive Hashing
Anshumali Shrivastava
Zhao Song
Zhaozhuo Xu
19
22
0
18 May 2021
Sample-Efficient Reinforcement Learning Is Feasible for Linearly Realizable MDPs with Limited Revisiting
Gen Li
Yuxin Chen
Yuejie Chi
Yuantao Gu
Yuting Wei
OffRL
26
28
0
17 May 2021
Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret
Jean Tarbouriech
Runlong Zhou
S. Du
Matteo Pirotta
M. Valko
A. Lazaric
62
35
0
22 Apr 2021
An Exponential Lower Bound for Linearly-Realizable MDPs with Constant Suboptimality Gap
Yuanhao Wang
Ruosong Wang
Sham Kakade
OffRL
39
43
0
23 Mar 2021
UCB Momentum Q-learning: Correcting the bias without forgetting
Pierre Menard
O. D. Domingues
Xuedong Shang
Michal Valko
79
41
0
01 Mar 2021
Is Q-Learning Minimax Optimal? A Tight Sample Complexity Analysis
Gen Li
Changxiao Cai
Ee
Yuting Wei
Yuejie Chi
OffRL
52
75
0
12 Feb 2021
Bellman Eluder Dimension: New Rich Classes of RL Problems, and Sample-Efficient Algorithms
Chi Jin
Qinghua Liu
Sobhan Miryoosefi
OffRL
35
215
0
01 Feb 2021
Improved Variance-Aware Confidence Sets for Linear Bandits and Linear Mixture MDP
Zihan Zhang
Jiaqi Yang
Xiangyang Ji
S. Du
71
38
0
29 Jan 2021
Online Learning in Unknown Markov Games
Yi Tian
Yuanhao Wang
Tiancheng Yu
S. Sra
OffRL
17
13
0
28 Oct 2020
A Sharp Analysis of Model-based Reinforcement Learning with Self-Play
Qinghua Liu
Tiancheng Yu
Yu Bai
Chi Jin
32
121
0
04 Oct 2020
Nearly Minimax Optimal Reinforcement Learning for Discounted MDPs
Jiafan He
Dongruo Zhou
Quanquan Gu
21
37
0
01 Oct 2020
Is Reinforcement Learning More Difficult Than Bandits? A Near-optimal Algorithm Escaping the Curse of Horizon
Zihan Zhang
Xiangyang Ji
S. Du
OffRL
17
104
0
28 Sep 2020
A Provably Efficient Sample Collection Strategy for Reinforcement Learning
Jean Tarbouriech
Matteo Pirotta
Michal Valko
A. Lazaric
OffRL
25
16
0
13 Jul 2020
Linear Bandits with Limited Adaptivity and Learning Distributional Optimal Design
Yufei Ruan
Jiaqi Yang
Yuanshuo Zhou
OffRL
100
51
0
04 Jul 2020
Task-agnostic Exploration in Reinforcement Learning
Xuezhou Zhang
Yuzhe Ma
Adish Singla
OffRL
28
49
0
16 Jun 2020
Q
Q
Q
-learning with Logarithmic Regret
Kunhe Yang
Lin F. Yang
S. Du
43
59
0
16 Jun 2020
1
2
Next