2009.13503
Is Reinforcement Learning More Difficult Than Bandits? A Near-optimal Algorithm Escaping the Curse of Horizon
28 September 2020
Zihan Zhang
Xiangyang Ji
S. Du
OffRL
Papers citing
"Is Reinforcement Learning More Difficult Than Bandits? A Near-optimal Algorithm Escaping the Curse of Horizon"
50 / 82 papers shown
Sample and Computationally Efficient Continuous-Time Reinforcement Learning with General Function Approximation
Runze Zhao
Yue Yu
Adams Yiyue Zhu
Chen Yang
Dongruo Zhou
7
0
0
20 May 2025
When a Reinforcement Learning Agent Encounters Unknown Unknowns
Juntian Zhu
Miguel de Carvalho
Zhouwang Yang
Fengxiang He
7
0
0
19 May 2025
Minimax Optimal Reinforcement Learning with Quasi-Optimism
Harin Lee
Min-hwan Oh
OffRL
64
1
0
02 Mar 2025
Hybrid Transfer Reinforcement Learning: Provable Sample Efficiency from Shifted-Dynamics Data
Chengrui Qu
Laixi Shi
Kishan Panaganti
Pengcheng You
Adam Wierman
OffRL
OnRL
38
0
0
06 Nov 2024
Local Linearity: the Key for No-regret Reinforcement Learning in Continuous MDPs
Davide Maran
Alberto Maria Metelli
Matteo Papini
Marcello Restelli
39
0
0
31 Oct 2024
Federated UCBVI: Communication-Efficient Federated Regret Minimization with Heterogeneous Agents
Safwan Labbi
D. Tiapkin
Lorenzo Mancini
Paul Mangold
Eric Moulines
FedML
73
0
0
30 Oct 2024
Gap-Dependent Bounds for Q-Learning using Reference-Advantage Decomposition
Zhong Zheng
Haochen Zhang
Lingzhou Xue
OffRL
78
2
0
10 Oct 2024
State-free Reinforcement Learning
Mingyu Chen
Aldo Pacchiano
Xuezhou Zhang
68
0
0
27 Sep 2024
Efficient Reinforcement Learning in Probabilistic Reward Machines
Xiaofeng Lin
Xuezhou Zhang
56
0
0
19 Aug 2024
Narrowing the Gap between Adversarial and Stochastic MDPs via Policy Optimization
D. Tiapkin
Evgenii Chzhen
Gilles Stoltz
74
1
0
08 Jul 2024
Combinatorial Multivariant Multi-Armed Bandits with Applications to Episodic Reinforcement Learning and Beyond
Xutong Liu
Siwei Wang
Jinhang Zuo
Han Zhong
Xuchuang Wang
Zhiyong Wang
Shuai Li
Mohammad Hajiesmaili
J. C. Lui
Wei Chen
85
1
0
03 Jun 2024
On Bits and Bandits: Quantifying the Regret-Information Trade-off
Itai Shufaro
Nadav Merlis
Nir Weinberger
Shie Mannor
38
0
0
26 May 2024
Distributionally Robust Reinforcement Learning with Interactive Data Collection: Fundamental Hardness and Near-Optimal Algorithm
Miao Lu
Han Zhong
Tong Zhang
Jose H. Blanchet
OffRL
OOD
79
6
0
04 Apr 2024
Utilizing Maximum Mean Discrepancy Barycenter for Propagating the Uncertainty of Value Functions in Reinforcement Learning
Srinjoy Roy
Swagatam Das
32
0
0
31 Mar 2024
The Value of Reward Lookahead in Reinforcement Learning
Nadav Merlis
Dorian Baudry
Vianney Perchet
34
0
0
18 Mar 2024
Horizon-Free Regret for Linear Markov Decision Processes
Zihan Zhang
Jason D. Lee
Yuxin Chen
Simon S. Du
33
3
0
15 Mar 2024
Sample Efficient Myopic Exploration Through Multitask Reinforcement Learning with Diverse Tasks
Ziping Xu
Zifan Xu
Runxuan Jiang
Peter Stone
Ambuj Tewari
48
1
0
03 Mar 2024
Rethinking Model-based, Policy-based, and Value-based Reinforcement Learning via the Lens of Representation Complexity
Guhao Feng
Han Zhong
OffRL
76
2
0
28 Dec 2023
Horizon-Free and Instance-Dependent Regret Bounds for Reinforcement Learning with General Function Approximation
Jiayi Huang
Han Zhong
Liwei Wang
Lin F. Yang
39
2
0
07 Dec 2023
When is Agnostic Reinforcement Learning Statistically Tractable?
Zeyu Jia
Gene Li
Alexander Rakhlin
Ayush Sekhari
Nathan Srebro
OffRL
32
5
0
09 Oct 2023
Pessimistic Nonlinear Least-Squares Value Iteration for Offline Reinforcement Learning
Qiwei Di
Heyang Zhao
Jiafan He
Quanquan Gu
OffRL
61
5
0
02 Oct 2023
Settling the Sample Complexity of Online Reinforcement Learning
Zihan Zhang
Yuxin Chen
Jason D. Lee
S. Du
OffRL
98
22
0
25 Jul 2023
Efficient Action Robust Reinforcement Learning with Probabilistic Policy Execution Uncertainty
Guanlin Liu
Zhihan Zhou
Han Liu
Lifeng Lai
36
1
0
15 Jul 2023
Active Coverage for PAC Reinforcement Learning
Aymen Al Marjani
Andrea Tirinzoni
E. Kaufmann
OffRL
21
4
0
23 Jun 2023
Horizon-free Reinforcement Learning in Adversarial Linear Mixture MDPs
Kaixuan Ji
Qingyue Zhao
Jiafan He
Weitong Zhang
Q. Gu
55
4
0
15 May 2023
Optimal Horizon-Free Reward-Free Exploration for Linear Mixture MDPs
Junkai Zhang
Weitong Zhang
Quanquan Gu
33
3
0
17 Mar 2023
Provably Efficient Exploration in Quantum Reinforcement Learning with Logarithmic Worst-Case Regret
Han Zhong
Jiachen Hu
Yecheng Xue
Tongyang Li
Liwei Wang
26
5
0
21 Feb 2023
Variance-Dependent Regret Bounds for Linear Bandits and Reinforcement Learning: Adaptivity and Computational Efficiency
Heyang Zhao
Jiafan He
Dongruo Zhou
Tong Zhang
Quanquan Gu
42
27
0
21 Feb 2023
A Reduction-based Framework for Sequential Decision Making with Delayed Feedback
Yunchang Yang
Han Zhong
Tianhao Wu
B. Liu
Liwei Wang
S. Du
OffRL
27
8
0
03 Feb 2023
Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both Worlds in Stochastic and Deterministic Environments
Runlong Zhou
Zihan Zhang
S. Du
44
10
0
31 Jan 2023
Nearly Minimax Optimal Reinforcement Learning for Linear Markov Decision Processes
Jiafan He
Heyang Zhao
Dongruo Zhou
Quanquan Gu
OffRL
51
53
0
12 Dec 2022
Beyond the Best: Estimating Distribution Functionals in Infinite-Armed Bandits
Yifei Wang
Tavor Z. Baharav
Yanjun Han
Jiantao Jiao
David Tse
11
1
0
01 Nov 2022
Horizon-Free and Variance-Dependent Reinforcement Learning for Latent Markov Decision Processes
Runlong Zhou
Ruosong Wang
S. Du
31
3
0
20 Oct 2022
On the Power of Pre-training for Generalization in RL: Provable Benefits and Hardness
Haotian Ye
Xiaoyu Chen
Liwei Wang
S. Du
OffRL
37
6
0
19 Oct 2022
A Unified Algorithm for Stochastic Path Problems
Christoph Dann
Chen-Yu Wei
Julian Zimmert
35
0
0
17 Oct 2022
Near-Optimal Regret Bounds for Multi-batch Reinforcement Learning
Zihan Zhang
Yuhang Jiang
Yuanshuo Zhou
Xiangyang Ji
OffRL
26
9
0
15 Oct 2022
Reaching Goals is Hard: Settling the Sample Complexity of the Stochastic Shortest Path
Liyu Chen
Andrea Tirinzoni
Matteo Pirotta
A. Lazaric
31
3
0
10 Oct 2022
The Role of Coverage in Online Reinforcement Learning
Tengyang Xie
Dylan J. Foster
Yu Bai
Nan Jiang
Sham Kakade
OffRL
38
57
0
09 Oct 2022
Sampling Through the Lens of Sequential Decision Making
J. Dou
Alvin Pan
Runxue Bao
Haiyi Mao
Lei Luo
Zhi-Hong Mao
26
19
0
17 Aug 2022
Instance-Dependent Near-Optimal Policy Identification in Linear MDPs via Online Experiment Design
Andrew Wagenmaker
Kevin G. Jamieson
OffRL
32
25
0
06 Jul 2022
Near-Optimal Goal-Oriented Reinforcement Learning in Non-Stationary Environments
Liyu Chen
Haipeng Luo
35
8
0
25 May 2022
Computationally Efficient Horizon-Free Reinforcement Learning for Linear Mixture MDPs
Dongruo Zhou
Quanquan Gu
81
44
0
23 May 2022
Slowly Changing Adversarial Bandit Algorithms are Efficient for Discounted MDPs
Ian A. Kash
L. Reyzin
Zishun Yu
31
0
0
18 May 2022
Provably Efficient Kernelized Q-Learning
Shuang Liu
H. Su
MLT
27
4
0
21 Apr 2022
When Should We Prefer Offline Reinforcement Learning Over Behavioral Cloning?
Aviral Kumar
Joey Hong
Anika Singh
Sergey Levine
OffRL
45
77
0
12 Apr 2022
Horizon-Free Reinforcement Learning in Polynomial Time: the Power of Stationary Policies
Zihan Zhang
Xiangyang Ji
S. Du
30
21
0
24 Mar 2022
Pessimistic Q-Learning for Offline Reinforcement Learning: Towards Optimal Sample Complexity
Laixi Shi
Gen Li
Yuting Wei
Yuxin Chen
Yuejie Chi
OffRL
38
90
0
28 Feb 2022
Reward-Free RL is No Harder Than Reward-Aware RL in Linear Markov Decision Processes
Andrew Wagenmaker
Yifang Chen
Max Simchowitz
S. Du
Kevin G. Jamieson
19
48
0
26 Jan 2022
Nearly Optimal Policy Optimization with Stable at Any Time Guarantee
Tianhao Wu
Yunchang Yang
Han Zhong
Liwei Wang
S. Du
Jiantao Jiao
55
14
0
21 Dec 2021
Improved No-Regret Algorithms for Stochastic Shortest Path with Linear MDP
Liyu Chen
Rahul Jain
Haipeng Luo
43
14
0
18 Dec 2021