1806.02450
Cited By
A Finite Time Analysis of Temporal Difference Learning With Linear Function Approximation
6 June 2018
Jalaj Bhandari
Daniel Russo
Raghav Singal
Papers citing
"A Finite Time Analysis of Temporal Difference Learning With Linear Function Approximation"
50 / 223 papers shown
Title
Finite-Time Analysis of Temporal Difference Learning: Discrete-Time Linear System Perspective
Dong-hwan Lee
Do Wan Kim
OffRL
30
0
0
22 Apr 2022
Exact Formulas for Finite-Time Estimation Errors of Decentralized Temporal Difference Learning with Linear Function Approximation
Xing-ming Guo
Bin Hu
13
2
0
20 Apr 2022
Convergence of First-Order Methods for Constrained Nonconvex Optimization with Dependent Data
Ahmet Alacaoglu
Hanbaek Lyu
19
4
0
29 Mar 2022
A Complete Characterization of Linear Estimators for Offline Policy Evaluation
Juan C. Perdomo
A. Krishnamurthy
Peter L. Bartlett
Sham Kakade
OffRL
27
3
0
08 Mar 2022
A Small Gain Analysis of Single Timescale Actor Critic
Alexander Olshevsky
Bahman Gharesifard
33
20
0
04 Mar 2022
A Multi-Agent Reinforcement Learning Framework for Off-Policy Evaluation in Two-sided Markets
C. Shi
Runzhe Wan
Ge Song
Shuang Luo
R. Song
Hongtu Zhu
OffRL
41
6
0
21 Feb 2022
Finite-Time Analysis of Natural Actor-Critic for POMDPs
Semih Cayci
Niao He
R. Srikant
33
1
0
20 Feb 2022
Convex Programs and Lyapunov Functions for Reinforcement Learning: A Unified Perspective on the Analysis of Value-Based Methods
Xing-ming Guo
Bin Hu
OffRL
30
3
0
14 Feb 2022
Stochastic linear optimization never overfits with quadratically-bounded losses on general data
Matus Telgarsky
19
11
0
14 Feb 2022
On the Convergence of SARSA with Linear Function Approximation
Shangtong Zhang
Rémi Tachet des Combes
Romain Laroche
18
10
0
14 Feb 2022
Settling the Communication Complexity for Distributed Offline Reinforcement Learning
Juliusz Krysztof Ziomek
Jun Wang
Yaodong Yang
OffRL
6
4
0
10 Feb 2022
Adapting to Mixing Time in Stochastic Optimization with Markovian Data
Ron Dorfman
Kfir Y. Levy
37
28
0
09 Feb 2022
Stochastic Gradient Descent with Dependent Data for Offline Reinforcement Learning
Jing-rong Dong
Xin T. Tong
OffRL
35
2
0
06 Feb 2022
Differentially Private Temporal Difference Learning with Stochastic Nonconvex-Strongly-Concave Optimization
Canzhe Zhao
Yanjie Ze
Jing Dong
Baoxiang Wang
Shuai Li
52
4
0
25 Jan 2022
Instance-Dependent Confidence and Early Stopping for Reinforcement Learning
K. Khamaru
Eric Xia
Martin J. Wainwright
Michael I. Jordan
37
5
0
21 Jan 2022
A Statistical Analysis of Polyak-Ruppert Averaged Q-learning
Xiang Li
Wenhao Yang
Jiadong Liang
Zhihua Zhang
Michael I. Jordan
40
15
0
29 Dec 2021
Control Theoretic Analysis of Temporal Difference Learning
Dong-hwan Lee
Do Wan Kim
24
1
0
29 Dec 2021
Accelerated and instance-optimal policy evaluation with linear function approximation
Tianjiao Li
Guanghui Lan
A. Pananjady
OffRL
37
13
0
24 Dec 2021
Recent Advances in Reinforcement Learning in Finance
B. Hambly
Renyuan Xu
Huining Yang
OffRL
29
167
0
08 Dec 2021
Robust and Adaptive Temporal-Difference Learning Using An Ensemble of Gaussian Processes
Qin Lu
G. Giannakis
GP
OffRL
11
4
0
01 Dec 2021
Finite-Time Error Bounds for Distributed Linear Stochastic Approximation
Yixuan Lin
V. Gupta
Ji Liu
32
3
0
24 Nov 2021
Stationary Behavior of Constant Stepsize SGD Type Algorithms: An Asymptotic Characterization
Zaiwei Chen
Shancong Mou
S. T. Maguluri
17
13
0
11 Nov 2021
A Concentration Bound for LSPE(λ)
Siddharth Chandak
Vivek Borkar
H. Dolhare
35
0
0
04 Nov 2021
Actor-critic is implicitly biased towards high entropy optimal policies
Yuzheng Hu
Ziwei Ji
Matus Telgarsky
62
11
0
21 Oct 2021
Online Target Q-learning with Reverse Experience Replay: Efficiently finding the Optimal Policy for Linear MDPs
Naman Agarwal
Syomantak Chaudhuri
Prateek Jain
Dheeraj M. Nagaraj
Praneeth Netrapalli
OffRL
40
21
0
16 Oct 2021
PER-ETD: A Polynomially Efficient Emphatic Temporal Difference Learning Method
Ziwei Guan
Tengyu Xu
Yingbin Liang
13
4
0
13 Oct 2021
Learning to Coordinate in Multi-Agent Systems: A Coordinated Actor-Critic Algorithm and Finite-Time Guarantees
Siliang Zeng
Tianyi Chen
Alfredo García
Mingyi Hong
42
11
0
11 Oct 2021
Online Robust Reinforcement Learning with Model Uncertainty
Yue Wang
Shaofeng Zou
OOD
OffRL
76
97
0
29 Sep 2021
Sample and Communication-Efficient Decentralized Actor-Critic Algorithms with Finite-Time Analysis
Ziyi Chen
Yi Zhou
Rongrong Chen
Shaofeng Zou
19
24
0
08 Sep 2021
Truncated Emphatic Temporal Difference Methods for Prediction and Control
Shangtong Zhang
Shimon Whiteson
OffRL
17
11
0
11 Aug 2021
Online Bootstrap Inference For Policy Evaluation in Reinforcement Learning
Pratik Ramprasad
Yuantong Li
Zhuoran Yang
Zhaoran Wang
W. Sun
Guang Cheng
OffRL
50
27
0
08 Aug 2021
Mean-Field Multi-Agent Reinforcement Learning: A Decentralized Network Approach
Haotian Gu
Xin Guo
Xiaoli Wei
Renyuan Xu
OOD
42
36
0
05 Aug 2021
A Unified Off-Policy Evaluation Approach for General Value Function
Tengyu Xu
Zhuoran Yang
Zhaoran Wang
Yingbin Liang
OffRL
18
2
0
06 Jul 2021
Instance-optimality in optimal value estimation: Adaptivity via variance-reduced Q-learning
K. Khamaru
Eric Xia
Martin J. Wainwright
Michael I. Jordan
OffRL
20
19
0
28 Jun 2021
Concentration of Contractive Stochastic Approximation and Reinforcement Learning
Siddharth Chandak
Vivek Borkar
Parth Dodhia
43
17
0
27 Jun 2021
Tighter Analysis of Alternating Stochastic Gradient Method for Stochastic Nested Problems
Tianyi Chen
Yuejiao Sun
W. Yin
26
33
0
25 Jun 2021
Finite-Sample Analysis of Off-Policy TD-Learning via Generalized Bellman Operators
Zaiwei Chen
S. T. Maguluri
Sanjay Shakkottai
Karthikeyan Shanmugam
OffRL
31
10
0
24 Jun 2021
Analysis of a Target-Based Actor-Critic Algorithm with Linear Function Approximation
Anas Barakat
Pascal Bianchi
Julien Lehmann
32
9
0
14 Jun 2021
Linear Convergence of Entropy-Regularized Natural Policy Gradient with Linear Function Approximation
Semih Cayci
Niao He
R. Srikant
35
35
0
08 Jun 2021
Tight High Probability Bounds for Linear Stochastic Approximation with Fixed Stepsize
Alain Durmus
Eric Moulines
A. Naumov
S. Samsonov
Kevin Scaman
Hoi-To Wai
27
20
0
02 Jun 2021
Finite-Sample Analysis of Off-Policy Natural Actor-Critic with Linear Function Approximation
Zaiwei Chen
S. Khodadadian
S. T. Maguluri
OffRL
63
29
0
26 May 2021
Deeply-Debiased Off-Policy Interval Estimation
C. Shi
Runzhe Wan
Victor Chernozhukov
R. Song
OffRL
25
36
0
10 May 2021
On the Linear convergence of Natural Policy Gradient Algorithm
S. Khodadadian
P. Jhunjhunwala
Sushil Mahavir Varma
S. T. Maguluri
40
56
0
04 May 2021
Distributed TD(0) with Almost No Communication
R. Liu
Alexander Olshevsky
FedML
28
15
0
16 Apr 2021
Predictor-Corrector(PC) Temporal Difference(TD) Learning (PCTD)
C. Bowyer
24
1
0
15 Apr 2021
Non-Asymptotic Analysis for Two Time-scale TDC with General Smooth Function Approximation
Yue Wang
Shaofeng Zou
Yi Zhou
14
11
0
07 Apr 2021
Multi-Agent Off-Policy TD Learning: Finite-Time Analysis with Near-Optimal Sample Complexity and Communication Complexity
Ziyi Chen
Yi Zhou
Rongrong Chen
OffRL
24
7
0
24 Mar 2021
Sample Complexity and Overparameterization Bounds for Temporal Difference Learning with Neural Network Approximation
Semih Cayci
Siddhartha Satpathi
Niao He
R. Srikant
29
9
0
02 Mar 2021
Finite-Sample Analysis of Off-Policy Natural Actor-Critic Algorithm
S. Khodadadian
Zaiwei Chen
S. T. Maguluri
CML
OffRL
71
26
0
18 Feb 2021
Is Q-Learning Minimax Optimal? A Tight Sample Complexity Analysis
Gen Li
Changxiao Cai
Ee
Yuting Wei
Yuejie Chi
OffRL
48
75
0
12 Feb 2021