2006.09118
Cited By
Q-learning with Logarithmic Regret
16 June 2020
Kunhe Yang
Lin F. Yang
S. Du
Papers citing "$Q$-learning with Logarithmic Regret"
13 / 13 papers shown
Gap-Dependent Bounds for Q-Learning using Reference-Advantage Decomposition
Zhong Zheng
Haochen Zhang
Lingzhou Xue
OffRL
68
2
0
10 Oct 2024
Reinforcement Learning from Human Feedback without Reward Inference: Model-Free Algorithm and Instance-Dependent Analysis
Qining Zhang
Honghao Wei
Lei Ying
OffRL
45
1
0
11 Jun 2024
Reinforcement Learning from Human Feedback with Active Queries
Kaixuan Ji
Jiafan He
Quanquan Gu
8
17
0
14 Feb 2024
Settling the Sample Complexity of Online Reinforcement Learning
Zihan Zhang
Yuxin Chen
Jason D. Lee
S. Du
OffRL
90
21
0
25 Jul 2023
Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both Worlds in Stochastic and Deterministic Environments
Runlong Zhou
Zihan Zhang
S. Du
23
10
0
31 Jan 2023
Stabilizing Q-learning with Linear Architectures for Provably Efficient Learning
Andrea Zanette
Martin J. Wainwright
OOD
19
5
0
01 Jun 2022
No-regret Learning in Repeated First-Price Auctions with Budget Constraints
Rui Ai
Chang Wang
Chenchen Li
Jinshan Zhang
Wenhan Huang
Xiaotie Deng
15
10
0
29 May 2022
Logarithmic regret bounds for continuous-time average-reward Markov decision processes
Xuefeng Gao
X. Zhou
19
8
0
23 May 2022
Offline Reinforcement Learning Under Value and Density-Ratio Realizability: The Power of Gaps
Jinglin Chen
Nan Jiang
OffRL
14
33
0
25 Mar 2022
Horizon-Free Reinforcement Learning in Polynomial Time: the Power of Stationary Policies
Zihan Zhang
Xiangyang Ji
S. Du
26
21
0
24 Mar 2022
Provably Efficient Black-Box Action Poisoning Attacks Against Reinforcement Learning
Guanlin Liu
Lifeng Lai
AAML
25
34
0
09 Oct 2021
Gap-Dependent Unsupervised Exploration for Reinforcement Learning
Jingfeng Wu
Vladimir Braverman
Lin F. Yang
13
12
0
11 Aug 2021
An Exponential Lower Bound for Linearly-Realizable MDPs with Constant Suboptimality Gap
Yuanhao Wang
Ruosong Wang
Sham Kakade
OffRL
30
43
0
23 Mar 2021