1902.02234
Cited By
Finite-Sample Analysis for SARSA with Linear Function Approximation
6 February 2019
Shaofeng Zou
Tengyu Xu
Yingbin Liang
Papers citing "Finite-Sample Analysis for SARSA with Linear Function Approximation"
50 / 95 papers shown
Title
Natural Policy Gradient for Average Reward Non-Stationary RL
Neharika Jali
Eshika Pathak
Pranay Sharma
Guannan Qu
Gauri Joshi
32
0
0
23 Apr 2025
A Hybrid Reinforcement Learning Framework for Hard Latency Constrained Resource Scheduling
Luyuan Zhang
An Liu
Kexuan Wang
34
0
0
30 Mar 2025
Understanding Inverse Reinforcement Learning under Overparameterization: Non-Asymptotic Analysis and Global Optimality
Ruijia Zhang
Siliang Zeng
Chenliang Li
Alfredo García
Mingyi Hong
67
0
0
22 Mar 2025
Near-Optimal Sample Complexity for Iterated CVaR Reinforcement Learning with a Generative Model
Zilong Deng
Simon Khan
Shaofeng Zou
59
0
0
11 Mar 2025
Stochastic Semi-Gradient Descent for Learning Mean Field Games with Population-Aware Function Approximation
Chenyu Zhang
Xu Chen
Xuan Di
87
4
0
17 Feb 2025
Heavy-Ball Momentum Accelerated Actor-Critic With Function Approximation
Yanjie Dong
Haijun Zhang
Gang Wang
Shisheng Cui
Xiping Hu
53
1
0
13 Aug 2024
Finite-Time Analysis of Simultaneous Double Q-learning
Hyunjun Na
Donghwan Lee
29
0
0
14 Jun 2024
SF-DQN: Provable Knowledge Transfer using Successor Feature for Deep Reinforcement Learning
Shuai Zhang
Heshan Devaka Fernando
Miao Liu
K. Murugesan
Songtao Lu
Pin-Yu Chen
Tianyi Chen
Meng Wang
54
1
0
24 May 2024
Fast Two-Time-Scale Stochastic Gradient Method with Applications in Reinforcement Learning
Sihan Zeng
Thinh T. Doan
54
5
0
15 May 2024
Graphon Mean Field Games with a Representative Player: Analysis and Learning Algorithm
Fuzhong Zhou
Chenyu Zhang
Xu Chen
Xuan Di
33
2
0
08 May 2024
An Improved Finite-time Analysis of Temporal Difference Learning with Deep Neural Networks
Zhifa Ke
Zaiwen Wen
Junyu Zhang
37
0
0
07 May 2024
A Single Online Agent Can Efficiently Learn Mean Field Games
Chenyu Zhang
Xu Chen
Xuan Di
OffRL
47
2
0
05 May 2024
Enhancing Classification Performance via Reinforcement Learning for Feature Selection
Younes Ghazagh Jahed
Seyyed Ali Sadat Tavana
37
2
0
09 Mar 2024
Finite-Time Analysis of On-Policy Heterogeneous Federated Reinforcement Learning
Chenyu Zhang
Han Wang
Aritra Mitra
James Anderson
34
18
0
27 Jan 2024
Neural Network Approximation for Pessimistic Offline Reinforcement Learning
Di Wu
Yuling Jiao
Li Shen
Haizhao Yang
Xiliang Lu
OffRL
29
1
0
19 Dec 2023
Lifting the Veil: Unlocking the Power of Depth in Q-learning
Shao-Bo Lin
Tao Li
Shaojie Tang
Yao Wang
Ding-Xuan Zhou
OffRL
OOD
17
0
0
27 Oct 2023
On the Convergence and Sample Complexity Analysis of Deep Q-Networks with ε-Greedy Exploration
Shuai Zhang
Hongkang Li
Meng Wang
Miao Liu
Pin-Yu Chen
Songtao Lu
Sijia Liu
K. Murugesan
Subhajit Chaudhury
40
19
0
24 Oct 2023
Suppressing Overestimation in Q-Learning through Adversarial Behaviors
HyeAnn Lee
Donghwan Lee
15
0
0
10 Oct 2023
Finite-Time Analysis of Whittle Index based Q-Learning for Restless Multi-Armed Bandits with Neural Network Function Approximation
Guojun Xiong
Jian Li
30
13
0
03 Oct 2023
TD Convergence: An Optimization Perspective
Kavosh Asadi
Shoham Sabach
Yao Liu
Omer Gottesman
Rasool Fakoor
MU
17
8
0
30 Jun 2023
Warm-Start Actor-Critic: From Approximation Error to Sub-optimality Gap
Hang Wang
Sen Lin
Junshan Zhang
OffRL
OnRL
33
3
0
20 Jun 2023
A Single-Loop Deep Actor-Critic Algorithm for Constrained Reinforcement Learning with Provable Convergence
Kexuan Wang
An Liu
Baishuo Liu
20
1
0
10 Jun 2023
A Finite-Sample Analysis of Payoff-Based Independent Learning in Zero-Sum Stochastic Games
Zaiwei Chen
Kaipeng Zhang
Eric Mazumdar
Asuman Ozdaglar
Adam Wierman
54
6
0
03 Mar 2023
Gauss-Newton Temporal Difference Learning with Nonlinear Function Approximation
Zhifa Ke
Junyu Zhang
Zaiwen Wen
24
0
0
25 Feb 2023
Beyond Exponentially Fast Mixing in Average-Reward Reinforcement Learning via Multi-Level Monte Carlo Actor-Critic
Wesley A Suttle
Amrit Singh Bedi
Bhrij Patel
Brian M Sadler
Alec Koppel
Dinesh Manocha
29
14
0
28 Jan 2023
A Policy Optimization Method Towards Optimal-time Stability
Shengjie Wang
Lan Fengb
Xiang Zheng
Yu-wen Cao
Oluwatosin Oseni
Haotian Xu
Tao Zhang
Yang Gao
39
1
0
02 Jan 2023
Offline Reinforcement Learning with Closed-Form Policy Improvement Operators
Jiachen Li
Edwin Zhang
Ming Yin
Qinxun Bai
Yu-Xiang Wang
William Yang Wang
OffRL
39
15
0
29 Nov 2022
Finite-time analysis of single-timescale actor-critic
Xu-yang Chen
Lin Zhao
OffRL
29
21
0
18 Oct 2022
Maximum-Likelihood Inverse Reinforcement Learning with Finite-Time Guarantees
Siliang Zeng
Chenliang Li
Alfredo García
Mingyi Hong
34
42
0
04 Oct 2022
Structural Estimation of Markov Decision Processes in High-Dimensional State Space with Finite-Time Guarantees
Siliang Zeng
Mingyi Hong
Alfredo García
OffRL
33
12
0
04 Oct 2022
Finite-Time Error Bounds for Greedy-GQ
Yue Wang
Yi Zhou
Shaofeng Zou
31
1
0
06 Sep 2022
Robust Knowledge Adaptation for Dynamic Graph Neural Networks
Han Li
Changsheng Li
Kaituo Feng
Ye Yuan
Guoren Wang
H. Zha
34
13
0
22 Jul 2022
q-Learning in Continuous Time
Yanwei Jia
X. Zhou
OffRL
51
68
0
02 Jul 2022
Analysis of Stochastic Processes through Replay Buffers
Shirli Di-Castro Shashua
Shie Mannor
Dotan Di-Castro
36
6
0
26 Jun 2022
A Single-Timescale Analysis For Stochastic Approximation With Multiple Coupled Sequences
Han Shen
Tianyi Chen
45
15
0
21 Jun 2022
Algorithm for Constrained Markov Decision Process with Linear Convergence
E. Gladin
Maksim Lavrik-Karmazin
K. Zainullina
Varvara Rudenko
Alexander V. Gasnikov
Martin Takáč
33
6
0
03 Jun 2022
Finite-Time Analysis of Temporal Difference Learning: Discrete-Time Linear System Perspective
Donghwan Lee
Do Wan Kim
OffRL
16
0
0
22 Apr 2022
Data Sampling Affects the Complexity of Online SGD over Dependent Data
Shaocong Ma
Ziyi Chen
Yi Zhou
Kaiyi Ji
Yingbin Liang
11
5
0
31 Mar 2022
Target Network and Truncation Overcome The Deadly Triad in Q-Learning
Zaiwei Chen
John-Paul Clarke
S. T. Maguluri
18
19
0
05 Mar 2022
Statistically Efficient Advantage Learning for Offline Reinforcement Learning in Infinite Horizons
C. Shi
Shuang Luo
Yuan Le
Hongtu Zhu
R. Song
OffRL
OnRL
32
10
0
26 Feb 2022
A Multi-Agent Reinforcement Learning Framework for Off-Policy Evaluation in Two-sided Markets
C. Shi
Runzhe Wan
Ge Song
Shuang Luo
R. Song
Hongtu Zhu
OffRL
41
6
0
21 Feb 2022
Stochastic linear optimization never overfits with quadratically-bounded losses on general data
Matus Telgarsky
19
11
0
14 Feb 2022
On the Convergence of SARSA with Linear Function Approximation
Shangtong Zhang
Rémi Tachet des Combes
Romain Laroche
11
10
0
14 Feb 2022
Recent Advances in Reinforcement Learning in Finance
B. Hambly
Renyuan Xu
Huining Yang
OffRL
29
167
0
08 Dec 2021
Finite-Time Error Bounds for Distributed Linear Stochastic Approximation
Yixuan Lin
V. Gupta
Ji Liu
32
3
0
24 Nov 2021
Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch
Shangtong Zhang
Rémi Tachet des Combes
Romain Laroche
30
10
0
04 Nov 2021
Convergence and Optimality of Policy Gradient Methods in Weakly Smooth Settings
Matthew Shunshi Zhang
Murat A. Erdogdu
Animesh Garg
16
5
0
30 Oct 2021
Finite-Time Complexity of Online Primal-Dual Natural Actor-Critic Algorithm for Constrained Markov Decision Processes
Sihan Zeng
Thinh T. Doan
Justin Romberg
102
17
0
21 Oct 2021
Actor-critic is implicitly biased towards high entropy optimal policies
Yuzheng Hu
Ziwei Ji
Matus Telgarsky
60
11
0
21 Oct 2021
Faster Algorithm and Sharper Analysis for Constrained Markov Decision Process
Tianjiao Li
Ziwei Guan
Shaofeng Zou
Tengyu Xu
Yingbin Liang
Guanghui Lan
29
26
0
20 Oct 2021