2009.13503
Cited By
Is Reinforcement Learning More Difficult Than Bandits? A Near-optimal Algorithm Escaping the Curse of Horizon
28 September 2020
Zihan Zhang
Xiangyang Ji
S. Du
OffRL
Papers citing
"Is Reinforcement Learning More Difficult Than Bandits? A Near-optimal Algorithm Escaping the Curse of Horizon"
32 / 82 papers shown
Title
A Benchmark for Low-Switching-Cost Reinforcement Learning
Shusheng Xu
Yancheng Liang
Yunfei Li
S. Du
Yi Wu
OffRL
22
0
0
13 Dec 2021
First-Order Regret in Reinforcement Learning with Linear Function Approximation: A Robust Estimation Approach
Andrew Wagenmaker
Yifang Chen
Max Simchowitz
S. Du
Kevin G. Jamieson
73
37
0
07 Dec 2021
A Free Lunch from the Noise: Provable and Practical Exploration for Representation Learning
Tongzheng Ren
Tianjun Zhang
Csaba Szepesvári
Bo Dai
27
19
0
22 Nov 2021
Improved Regret Analysis for Variance-Adaptive Linear Bandits and Horizon-Free Linear Mixture MDPs
Yeoneung Kim
Insoon Yang
Kwang-Sung Jun
20
36
0
05 Nov 2021
Settling the Horizon-Dependence of Sample Complexity in Reinforcement Learning
Yuanzhi Li
Ruosong Wang
Lin F. Yang
27
20
0
01 Nov 2021
Towards Instance-Optimal Offline Reinforcement Learning with Pessimism
Ming Yin
Yu Wang
OffRL
29
82
0
17 Oct 2021
Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free Reinforcement Learning
Gen Li
Laixi Shi
Yuxin Chen
Yuejie Chi
OffRL
47
51
0
09 Oct 2021
Gap-Dependent Unsupervised Exploration for Reinforcement Learning
Jingfeng Wu
Vladimir Braverman
Lin F. Yang
33
12
0
11 Aug 2021
Beyond No Regret: Instance-Dependent PAC Reinforcement Learning
Andrew Wagenmaker
Max Simchowitz
Kevin G. Jamieson
12
34
0
05 Aug 2021
A Reduction-Based Framework for Conservative Bandits and Reinforcement Learning
Yunchang Yang
Tianhao Wu
Han Zhong
Evrard Garcelon
Matteo Pirotta
A. Lazaric
Liwei Wang
S. Du
OffRL
35
9
0
22 Jun 2021
MADE: Exploration via Maximizing Deviation from Explored Regions
Tianjun Zhang
Paria Rashidinejad
Jiantao Jiao
Yuandong Tian
Joseph E. Gonzalez
Stuart J. Russell
OffRL
34
42
0
18 Jun 2021
Implicit Finite-Horizon Approximation and Efficient Optimal Algorithms for Stochastic Shortest Path
Liyu Chen
Mehdi Jafarnia-Jahromi
R. Jain
Haipeng Luo
24
25
0
15 Jun 2021
Online Sub-Sampling for Reinforcement Learning with General Function Approximation
Dingwen Kong
Ruslan Salakhutdinov
Ruosong Wang
Lin F. Yang
OffRL
38
1
0
14 Jun 2021
Optimal Uniform OPE and Model-based Offline Reinforcement Learning in Time-Homogeneous, Reward-Free and Task-Agnostic Settings
Ming Yin
Yu Wang
OffRL
32
19
0
13 May 2021
Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret
Jean Tarbouriech
Runlong Zhou
S. Du
Matteo Pirotta
M. Valko
A. Lazaric
62
35
0
22 Apr 2021
Nearly Horizon-Free Offline Reinforcement Learning
Tongzheng Ren
Jialian Li
Bo Dai
S. Du
Sujay Sanghavi
OffRL
32
49
0
25 Mar 2021
Minimax Regret for Stochastic Shortest Path
Alon Cohen
Yonathan Efroni
Yishay Mansour
Aviv A. Rosenberg
31
28
0
24 Mar 2021
Cautiously Optimistic Policy Optimization and Exploration with Linear Function Approximation
Andrea Zanette
Ching-An Cheng
Alekh Agarwal
32
53
0
24 Mar 2021
UCB Momentum Q-learning: Correcting the bias without forgetting
Pierre Menard
O. D. Domingues
Xuedong Shang
Michal Valko
79
41
0
01 Mar 2021
Learning to Stop with Surprisingly Few Samples
Daniel Russo
A. Zeevi
Tianyi Zhang
15
1
0
19 Feb 2021
Near-Optimal Randomized Exploration for Tabular Markov Decision Processes
Zhihan Xiong
Ruoqi Shen
Qiwen Cui
Maryam Fazel
S. Du
21
7
0
19 Feb 2021
Causal Markov Decision Processes: Learning Good Interventions Efficiently
Yangyi Lu
A. Meisami
Ambuj Tewari
23
10
0
15 Feb 2021
Improved Corruption Robust Algorithms for Episodic Reinforcement Learning
Yifang Chen
S. Du
Kevin G. Jamieson
24
22
0
13 Feb 2021
Simple Agent, Complex Environment: Efficient Reinforcement Learning with Agent States
Shi Dong
Benjamin Van Roy
Zhengyuan Zhou
32
29
0
10 Feb 2021
Fine-Grained Gap-Dependent Bounds for Tabular MDPs via Adaptive Multi-Step Bootstrap
Haike Xu
Tengyu Ma
S. Du
11
42
0
09 Feb 2021
Near-optimal Representation Learning for Linear Bandits and Linear RL
Jiachen Hu
Xiaoyu Chen
Chi Jin
Lihong Li
Liwei Wang
OffRL
20
51
0
08 Feb 2021
Confidence-Budget Matching for Sequential Budgeted Learning
Yonathan Efroni
Nadav Merlis
Aadirupa Saha
Shie Mannor
19
10
0
05 Feb 2021
Improved Variance-Aware Confidence Sets for Linear Bandits and Linear Mixture MDP
Zihan Zhang
Jiaqi Yang
Xiangyang Ji
S. Du
71
38
0
29 Jan 2021
Provably Efficient Reinforcement Learning with Linear Function Approximation Under Adaptivity Constraints
Chi Jin
Zhuoran Yang
Zhaoran Wang
OffRL
122
167
0
06 Jan 2021
A Provably Efficient Algorithm for Linear Markov Decision Process with Low Switching Cost
Minbo Gao
Tianle Xie
S. Du
Lin F. Yang
36
46
0
02 Jan 2021
Nearly Minimax Optimal Reinforcement Learning for Linear Mixture Markov Decision Processes
Dongruo Zhou
Quanquan Gu
Csaba Szepesvári
27
204
0
15 Dec 2020
Nearly Minimax Optimal Reward-free Reinforcement Learning
Zihan Zhang
S. Du
Xiangyang Ji
OffRL
25
31
0
12 Oct 2020