1902.02234
Finite-Sample Analysis for SARSA with Linear Function Approximation
6 February 2019
Shaofeng Zou
Tengyu Xu
Yingbin Liang
Papers citing
"Finite-Sample Analysis for SARSA with Linear Function Approximation"
50 / 101 papers shown
Title
Recent Advances in Reinforcement Learning in Finance
B. Hambly
Renyuan Xu
Huining Yang
OffRL
457
229
0
08 Dec 2021
Finite-Time Error Bounds for Distributed Linear Stochastic Approximation
Yixuan Lin
V. Gupta
Ji Liu
276
3
0
24 Nov 2021
Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch
Journal of machine learning research (JMLR), 2021
Shangtong Zhang
Rémi Tachet des Combes
Romain Laroche
324
16
0
04 Nov 2021
Convergence and Optimality of Policy Gradient Methods in Weakly Smooth Settings
AAAI Conference on Artificial Intelligence (AAAI), 2021
Matthew Shunshi Zhang
Murat A. Erdogdu
Animesh Garg
299
5
0
30 Oct 2021
Finite-Time Complexity of Online Primal-Dual Natural Actor-Critic Algorithm for Constrained Markov Decision Processes
IEEE Conference on Decision and Control (CDC), 2021
Sihan Zeng
Thinh T. Doan
Justin Romberg
298
21
0
21 Oct 2021
Actor-critic is implicitly biased towards high entropy optimal policies
International Conference on Learning Representations (ICLR), 2021
Yuzheng Hu
Ziwei Ji
Matus Telgarsky
175
12
0
21 Oct 2021
Faster Algorithm and Sharper Analysis for Constrained Markov Decision Process
Operations Research Letters (ORL), 2021
Tianjiao Li
Ziwei Guan
Shaofeng Zou
Tengyu Xu
Yingbin Liang
Guanghui Lan
158
36
0
20 Oct 2021
Variance Reduction based Experience Replay for Policy Optimization
Hua Zheng
Wei Xie
M. Feng
OffRL
417
3
0
17 Oct 2021
Online Target Q-learning with Reverse Experience Replay: Efficiently finding the Optimal Policy for Linear MDPs
Naman Agarwal
Syomantak Chaudhuri
Prateek Jain
Dheeraj M. Nagaraj
Praneeth Netrapalli
OffRL
283
22
0
16 Oct 2021
Sim and Real: Better Together
Shirli Di-Castro Shashua
Dotan DiCastro
Shie Mannor
209
11
0
01 Oct 2021
A Two-Time-Scale Stochastic Optimization Framework with Applications in Control and Reinforcement Learning
Sihan Zeng
Thinh T. Doan
Justin Romberg
397
30
0
29 Sep 2021
Online Robust Reinforcement Learning with Model Uncertainty
Yue Wang
Shaofeng Zou
OOD
OffRL
338
133
0
29 Sep 2021
Truncated Emphatic Temporal Difference Methods for Prediction and Control
Journal of machine learning research (JMLR), 2021
Shangtong Zhang
Shimon Whiteson
OffRL
235
15
0
11 Aug 2021
Analysis of a Target-Based Actor-Critic Algorithm with Linear Function Approximation
International Conference on Artificial Intelligence and Statistics (AISTATS), 2021
Anas Barakat
Pascal Bianchi
Julien Lehmann
181
13
0
14 Jun 2021
Successive Convex Approximation Based Off-Policy Optimization for Constrained Reinforcement Learning
IEEE Transactions on Signal Processing (IEEE TSP), 2021
Chang Tian
An Liu
Guang-Li Huang
Wu Luo
97
15
0
26 May 2021
Deeply-Debiased Off-Policy Interval Estimation
International Conference on Machine Learning (ICML), 2021
C. Shi
Runzhe Wan
Victor Chernozhukov
R. Song
OffRL
193
43
0
10 May 2021
Predictor-Corrector(PC) Temporal Difference(TD) Learning (PCTD)
C. Bowyer
67
1
0
15 Apr 2021
Non-Asymptotic Analysis for Two Time-scale TDC with General Smooth Function Approximation
Neural Information Processing Systems (NeurIPS), 2021
Yue Wang
Shaofeng Zou
Yi Zhou
375
11
0
07 Apr 2021
Greedy-GQ with Variance Reduction: Finite-time Analysis and Improved Complexity
International Conference on Learning Representations (ICLR), 2021
Shaocong Ma
Ziyi Chen
Yi Zhou
Shaofeng Zou
160
12
0
30 Mar 2021
Multi-Agent Off-Policy TD Learning: Finite-Time Analysis with Near-Optimal Sample Complexity and Communication Complexity
Ziyi Chen
Yi Zhou
Rongrong Chen
OffRL
171
7
0
24 Mar 2021
Breaking the Deadly Triad with a Target Network
International Conference on Machine Learning (ICML), 2021
Shangtong Zhang
Hengshuai Yao
Shimon Whiteson
AAML
568
55
0
21 Jan 2021
Towards Understanding Asynchronous Advantage Actor-critic: Convergence and Linear Speedup
IEEE Transactions on Signal Processing (TSP), 2020
Han Shen
Kaiqing Zhang
Mingyi Hong
Tianyi Chen
308
43
0
31 Dec 2020
On Convergence of Gradient Expected Sarsa(λ)
AAAI Conference on Artificial Intelligence (AAAI), 2020
Long Yang
Gang Zheng
Yu Zhang
Qian Zheng
Pengfei Li
Gang Pan
215
4
0
14 Dec 2020
Reinforcement Learning Control of Constrained Dynamic Systems with Uniformly Ultimate Boundedness Stability Guarantee
Minghao Han
Yuan Tian
Lixian Zhang
Jun Wang
Wei Pan
124
56
0
13 Nov 2020
Sample Complexity Bounds for Two Timescale Value-based Reinforcement Learning Algorithms
Tengyu Xu
Yingbin Liang
236
27
0
10 Nov 2020
Variance-Reduced Off-Policy TDC Learning: Non-Asymptotic Convergence Analysis
Neural Information Processing Systems (NeurIPS), 2020
Shaocong Ma
Yi Zhou
Shaofeng Zou
OffRL
270
17
0
26 Oct 2020
Finite-Time Analysis for Double Q-learning
Huaqing Xiong
Linna Zhao
Yingbin Liang
Wei Zhang
170
32
0
29 Sep 2020
Deep Q-Learning: Theoretical Insights from an Asymptotic Analysis
Arunselvan Ramaswamy
Eyke Hüllermeier
86
4
0
25 Aug 2020
Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy
International Conference on Learning Representations (ICLR), 2020
Zuyue Fu
Zhuoran Yang
Zhaoran Wang
236
48
0
02 Aug 2020
Momentum Q-learning with Finite-Sample Convergence Guarantee
Bowen Weng
Huaqing Xiong
Linna Zhao
Yingbin Liang
Wei Zhang
174
8
0
30 Jul 2020
Analysis of Q-learning with Adaptation and Momentum Restart for Gradient Descent
International Joint Conference on Artificial Intelligence (IJCAI), 2020
Bowen Weng
Huaqing Xiong
Yingbin Liang
Wei Zhang
112
24
0
15 Jul 2020
Provably Efficient Reinforcement Learning for Discounted MDPs with Feature Mapping
International Conference on Machine Learning (ICML), 2020
Dongruo Zhou
Jiafan He
Quanquan Gu
301
141
0
23 Jun 2020
Sample Efficient Reinforcement Learning via Low-Rank Matrix Estimation
Neural Information Processing Systems (NeurIPS), 2020
Devavrat Shah
Dogyoon Song
Zhi Xu
Yuzhe Yang
275
33
0
11 Jun 2020
Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory
Yufeng Zhang
Qi Cai
Zhuoran Yang
Yongxin Chen
Zhaoran Wang
OOD
MLT
663
11
0
08 Jun 2020
Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and Variance Reduction
Gen Li
Yuting Wei
Yuejie Chi
Yuantao Gu
Yuxin Chen
OffRL
380
126
0
04 Jun 2020
Finite-sample Analysis of Greedy-GQ with Linear Function Approximation under Markovian Noise
Yue Wang
Shaofeng Zou
142
22
0
20 May 2020
Non-asymptotic Convergence Analysis of Two Time-scale (Natural) Actor-Critic Algorithms
Tengyu Xu
Zhe Wang
Yingbin Liang
244
63
0
07 May 2020
A Finite Time Analysis of Two Time-Scale Actor Critic Methods
Neural Information Processing Systems (NeurIPS), 2020
Yue Wu
Weitong Zhang
Pan Xu
Quanquan Gu
334
167
0
04 May 2020
Actor-Critic Reinforcement Learning for Control with Stability Guarantee
IEEE Robotics and Automation Letters (RA-L), 2020
Minghao Han
Lixian Zhang
Jun Wang
Wei Pan
261
136
0
29 Apr 2020
Improving Sample Complexity Bounds for (Natural) Actor-Critic Algorithms
Tengyu Xu
Zhe Wang
Yingbin Liang
311
25
0
27 Apr 2020
Agnostic Q-learning with Function Approximation in Deterministic Systems: Tight Bounds on Approximation Error and Sample Complexity
S. Du
Jason D. Lee
G. Mahajan
Ruosong Wang
110
39
0
17 Feb 2020
Non-asymptotic Convergence of Adam-type Reinforcement Learning Algorithms under Markovian Sampling
AAAI Conference on Artificial Intelligence (AAAI), 2020
Huaqing Xiong
Tengyu Xu
Yingbin Liang
Wei Zhang
171
35
0
15 Feb 2020
Does the Markov Decision Process Fit the Data: Testing for the Markov Property in Sequential Decision Making
International Conference on Machine Learning (ICML), 2020
C. Shi
Runzhe Wan
R. Song
Wenbin Lu
Ling Leng
162
41
0
05 Feb 2020
Dynamic Causal Effects Evaluation in A/B Testing with a Reinforcement Learning Framework
Journal of the American Statistical Association (JASA), 2020
C. Shi
Xiaoyu Wang
Shuang Luo
Hongtu Zhu
Jieping Ye
R. Song
CML
OffRL
442
48
0
05 Feb 2020
Reanalysis of Variance Reduced Temporal Difference Learning
International Conference on Learning Representations (ICLR), 2020
Tengyu Xu
Zhe Wang
Yi Zhou
Yingbin Liang
OffRL
321
44
0
07 Jan 2020
Scalable Reinforcement Learning for Multi-Agent Networked Systems
Operational Research (OR), 2019
Guannan Qu
Adam Wierman
Na Li
244
43
0
05 Dec 2019
A Unified Switching System Perspective and O.D.E. Analysis of Q-Learning Algorithms
Dong-hwan Lee
Niao He
332
28
0
04 Dec 2019
Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms
Kaiqing Zhang
Zhuoran Yang
Tamer Basar
527
1,440
0
24 Nov 2019
Generalized Speedy Q-learning
IEEE Control Systems Letters (L-CSS), 2019
I. John
Chandramouli Kamanchi
S. Bhatnagar
119
20
0
01 Nov 2019
On the Sample Complexity of Actor-Critic Method for Reinforcement Learning with Function Approximation
Machine-mediated learning (ML), 2019
Harshat Kumar
Alec Koppel
Alejandro Ribeiro
301
95
0
18 Oct 2019