Finite-Sample Analysis for SARSA with Linear Function Approximation

6 February 2019
Shaofeng Zou
Tengyu Xu
Yingbin Liang

Papers citing "Finite-Sample Analysis for SARSA with Linear Function Approximation"

50 / 101 papers shown
Recent Advances in Reinforcement Learning in Finance
B. Hambly
Renyuan Xu
Huining Yang
OffRL
457
229
0
08 Dec 2021
Finite-Time Error Bounds for Distributed Linear Stochastic Approximation
Yixuan Lin
V. Gupta
Ji Liu
276
3
0
24 Nov 2021
Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch
Journal of Machine Learning Research (JMLR), 2021
Shangtong Zhang
Rémi Tachet des Combes
Romain Laroche
324
16
0
04 Nov 2021
Convergence and Optimality of Policy Gradient Methods in Weakly Smooth Settings
AAAI Conference on Artificial Intelligence (AAAI), 2021
Matthew Shunshi Zhang
Murat A. Erdogdu
Animesh Garg
299
5
0
30 Oct 2021
Finite-Time Complexity of Online Primal-Dual Natural Actor-Critic Algorithm for Constrained Markov Decision Processes
IEEE Conference on Decision and Control (CDC), 2021
Sihan Zeng
Thinh T. Doan
Justin Romberg
298
21
0
21 Oct 2021
Actor-critic is implicitly biased towards high entropy optimal policies
International Conference on Learning Representations (ICLR), 2021
Yuzheng Hu
Ziwei Ji
Matus Telgarsky
175
12
0
21 Oct 2021
Faster Algorithm and Sharper Analysis for Constrained Markov Decision Process
Operations Research Letters (ORL), 2021
Tianjiao Li
Ziwei Guan
Shaofeng Zou
Tengyu Xu
Yingbin Liang
Guanghui Lan
158
36
0
20 Oct 2021
Variance Reduction based Experience Replay for Policy Optimization
Hua Zheng
Wei Xie
M. Feng
OffRL
417
3
0
17 Oct 2021
Online Target Q-learning with Reverse Experience Replay: Efficiently finding the Optimal Policy for Linear MDPs
Naman Agarwal
Syomantak Chaudhuri
Prateek Jain
Dheeraj M. Nagaraj
Praneeth Netrapalli
OffRL
283
22
0
16 Oct 2021
Sim and Real: Better Together
Shirli Di-Castro Shashua
Dotan DiCastro
Shie Mannor
209
11
0
01 Oct 2021
A Two-Time-Scale Stochastic Optimization Framework with Applications in Control and Reinforcement Learning
Sihan Zeng
Thinh T. Doan
Justin Romberg
397
30
0
29 Sep 2021
Online Robust Reinforcement Learning with Model Uncertainty
Yue Wang
Shaofeng Zou
OOD, OffRL
338
133
0
29 Sep 2021
Truncated Emphatic Temporal Difference Methods for Prediction and Control
Journal of Machine Learning Research (JMLR), 2021
Shangtong Zhang
Shimon Whiteson
OffRL
235
15
0
11 Aug 2021
Analysis of a Target-Based Actor-Critic Algorithm with Linear Function Approximation
International Conference on Artificial Intelligence and Statistics (AISTATS), 2021
Anas Barakat
Pascal Bianchi
Julien Lehmann
181
13
0
14 Jun 2021
Successive Convex Approximation Based Off-Policy Optimization for Constrained Reinforcement Learning
IEEE Transactions on Signal Processing (IEEE TSP), 2021
Chang Tian
An Liu
Guang-Li Huang
Wu Luo
97
15
0
26 May 2021
Deeply-Debiased Off-Policy Interval Estimation
International Conference on Machine Learning (ICML), 2021
C. Shi
Runzhe Wan
Victor Chernozhukov
R. Song
OffRL
193
43
0
10 May 2021
Predictor-Corrector(PC) Temporal Difference(TD) Learning (PCTD)
C. Bowyer
67
1
0
15 Apr 2021
Non-Asymptotic Analysis for Two Time-scale TDC with General Smooth Function Approximation
Neural Information Processing Systems (NeurIPS), 2021
Yue Wang
Shaofeng Zou
Yi Zhou
375
11
0
07 Apr 2021
Greedy-GQ with Variance Reduction: Finite-time Analysis and Improved Complexity
International Conference on Learning Representations (ICLR), 2021
Shaocong Ma
Ziyi Chen
Yi Zhou
Shaofeng Zou
160
12
0
30 Mar 2021
Multi-Agent Off-Policy TD Learning: Finite-Time Analysis with Near-Optimal Sample Complexity and Communication Complexity
Ziyi Chen
Yi Zhou
Rongrong Chen
OffRL
171
7
0
24 Mar 2021
Breaking the Deadly Triad with a Target Network
International Conference on Machine Learning (ICML), 2021
Shangtong Zhang
Hengshuai Yao
Shimon Whiteson
AAML
568
55
0
21 Jan 2021
Towards Understanding Asynchronous Advantage Actor-critic: Convergence and Linear Speedup
IEEE Transactions on Signal Processing (TSP), 2020
Han Shen
Kaiqing Zhang
Mingyi Hong
Tianyi Chen
308
43
0
31 Dec 2020
On Convergence of Gradient Expected Sarsa($λ$)
AAAI Conference on Artificial Intelligence (AAAI), 2020
Long Yang
Gang Zheng
Yu Zhang
Qian Zheng
Pengfei Li
Gang Pan
215
4
0
14 Dec 2020
Reinforcement Learning Control of Constrained Dynamic Systems with Uniformly Ultimate Boundedness Stability Guarantee
Minghao Han
Yuan Tian
Lixian Zhang
Jun Wang
Wei Pan
124
56
0
13 Nov 2020
Sample Complexity Bounds for Two Timescale Value-based Reinforcement Learning Algorithms
Tengyu Xu
Yingbin Liang
236
27
0
10 Nov 2020
Variance-Reduced Off-Policy TDC Learning: Non-Asymptotic Convergence Analysis
Neural Information Processing Systems (NeurIPS), 2020
Shaocong Ma
Yi Zhou
Shaofeng Zou
OffRL
270
17
0
26 Oct 2020
Finite-Time Analysis for Double Q-learning
Huaqing Xiong
Linna Zhao
Yingbin Liang
Wei Zhang
170
32
0
29 Sep 2020
Deep Q-Learning: Theoretical Insights from an Asymptotic Analysis
Arunselvan Ramaswamy
Eyke Hüllermeier
86
4
0
25 Aug 2020
Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy
International Conference on Learning Representations (ICLR), 2020
Zuyue Fu
Zhuoran Yang
Zhaoran Wang
236
48
0
02 Aug 2020
Momentum Q-learning with Finite-Sample Convergence Guarantee
Bowen Weng
Huaqing Xiong
Linna Zhao
Yingbin Liang
Wei Zhang
174
8
0
30 Jul 2020
Analysis of Q-learning with Adaptation and Momentum Restart for Gradient Descent
International Joint Conference on Artificial Intelligence (IJCAI), 2020
Bowen Weng
Huaqing Xiong
Yingbin Liang
Wei Zhang
112
24
0
15 Jul 2020
Provably Efficient Reinforcement Learning for Discounted MDPs with Feature Mapping
International Conference on Machine Learning (ICML), 2020
Dongruo Zhou
Jiafan He
Quanquan Gu
301
141
0
23 Jun 2020
Sample Efficient Reinforcement Learning via Low-Rank Matrix Estimation
Neural Information Processing Systems (NeurIPS), 2020
Devavrat Shah
Dogyoon Song
Zhi Xu
Yuzhe Yang
275
33
0
11 Jun 2020
Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory
Yufeng Zhang
Qi Cai
Zhuoran Yang
Yongxin Chen
Zhaoran Wang
OOD, MLT
663
11
0
08 Jun 2020
Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and Variance Reduction
Gen Li
Yuting Wei
Yuejie Chi
Yuantao Gu
Yuxin Chen
OffRL
380
126
0
04 Jun 2020
Finite-sample Analysis of Greedy-GQ with Linear Function Approximation under Markovian Noise
Yue Wang
Shaofeng Zou
142
22
0
20 May 2020
Non-asymptotic Convergence Analysis of Two Time-scale (Natural) Actor-Critic Algorithms
Tengyu Xu
Zhe Wang
Yingbin Liang
244
63
0
07 May 2020
A Finite Time Analysis of Two Time-Scale Actor Critic Methods
Neural Information Processing Systems (NeurIPS), 2020
Yue Wu
Weitong Zhang
Pan Xu
Quanquan Gu
334
167
0
04 May 2020
Actor-Critic Reinforcement Learning for Control with Stability Guarantee
IEEE Robotics and Automation Letters (RA-L), 2020
Minghao Han
Lixian Zhang
Jun Wang
Wei Pan
261
136
0
29 Apr 2020
Improving Sample Complexity Bounds for (Natural) Actor-Critic Algorithms
Tengyu Xu
Zhe Wang
Yingbin Liang
311
25
0
27 Apr 2020
Agnostic Q-learning with Function Approximation in Deterministic Systems: Tight Bounds on Approximation Error and Sample Complexity
S. Du
Jason D. Lee
G. Mahajan
Ruosong Wang
110
39
0
17 Feb 2020
Non-asymptotic Convergence of Adam-type Reinforcement Learning Algorithms under Markovian Sampling
AAAI Conference on Artificial Intelligence (AAAI), 2020
Huaqing Xiong
Tengyu Xu
Yingbin Liang
Wei Zhang
171
35
0
15 Feb 2020
Does the Markov Decision Process Fit the Data: Testing for the Markov Property in Sequential Decision Making
International Conference on Machine Learning (ICML), 2020
C. Shi
Runzhe Wan
R. Song
Wenbin Lu
Ling Leng
162
41
0
05 Feb 2020
Dynamic Causal Effects Evaluation in A/B Testing with a Reinforcement Learning Framework
Journal of the American Statistical Association (JASA), 2020
C. Shi
Xiaoyu Wang
Shuang Luo
Hongtu Zhu
Jieping Ye
R. Song
CML, OffRL
442
48
0
05 Feb 2020
Reanalysis of Variance Reduced Temporal Difference Learning
International Conference on Learning Representations (ICLR), 2020
Tengyu Xu
Zhe Wang
Yi Zhou
Yingbin Liang
OffRL
321
44
0
07 Jan 2020
Scalable Reinforcement Learning for Multi-Agent Networked Systems
Operational Research (OR), 2019
Guannan Qu
Adam Wierman
Na Li
244
43
0
05 Dec 2019
A Unified Switching System Perspective and O.D.E. Analysis of Q-Learning Algorithms
Dong-hwan Lee
Niao He
332
28
0
04 Dec 2019
Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms
Kaiqing Zhang
Zhuoran Yang
Tamer Basar
527
1,440
0
24 Nov 2019
Generalized Speedy Q-learning
IEEE Control Systems Letters (L-CSS), 2019
I. John
Chandramouli Kamanchi
S. Bhatnagar
119
20
0
01 Nov 2019
On the Sample Complexity of Actor-Critic Method for Reinforcement Learning with Function Approximation
Machine-mediated learning (ML), 2019
Harshat Kumar
Alec Koppel
Alejandro Ribeiro
301
95
0
18 Oct 2019