ResearchTrend.AI
A Finite Time Analysis of Temporal Difference Learning With Linear Function Approximation

6 June 2018
Jalaj Bhandari
Daniel Russo
Raghav Singal
arXiv:1806.02450

Papers citing "A Finite Time Analysis of Temporal Difference Learning With Linear Function Approximation"

50 / 223 papers shown
Finite-Time Analysis of Temporal Difference Learning: Discrete-Time Linear System Perspective
Dong-hwan Lee
Do Wan Kim
OffRL
30
0
0
22 Apr 2022
Exact Formulas for Finite-Time Estimation Errors of Decentralized Temporal Difference Learning with Linear Function Approximation
Xing-ming Guo
Bin Hu
13
2
0
20 Apr 2022
Convergence of First-Order Methods for Constrained Nonconvex Optimization with Dependent Data
Ahmet Alacaoglu
Hanbaek Lyu
19
4
0
29 Mar 2022
A Complete Characterization of Linear Estimators for Offline Policy Evaluation
Juan C. Perdomo
A. Krishnamurthy
Peter L. Bartlett
Sham Kakade
OffRL
27
3
0
08 Mar 2022
A Small Gain Analysis of Single Timescale Actor Critic
Alexander Olshevsky
Bahman Gharesifard
33
20
0
04 Mar 2022
A Multi-Agent Reinforcement Learning Framework for Off-Policy Evaluation in Two-sided Markets
C. Shi
Runzhe Wan
Ge Song
Shuang Luo
R. Song
Hongtu Zhu
OffRL
41
6
0
21 Feb 2022
Finite-Time Analysis of Natural Actor-Critic for POMDPs
Semih Cayci
Niao He
R. Srikant
33
1
0
20 Feb 2022
Convex Programs and Lyapunov Functions for Reinforcement Learning: A Unified Perspective on the Analysis of Value-Based Methods
Xing-ming Guo
Bin Hu
OffRL
30
3
0
14 Feb 2022
Stochastic linear optimization never overfits with quadratically-bounded losses on general data
Matus Telgarsky
19
11
0
14 Feb 2022
On the Convergence of SARSA with Linear Function Approximation
Shangtong Zhang
Rémi Tachet des Combes
Romain Laroche
18
10
0
14 Feb 2022
Settling the Communication Complexity for Distributed Offline Reinforcement Learning
Juliusz Krysztof Ziomek
Jun Wang
Yaodong Yang
OffRL
6
4
0
10 Feb 2022
Adapting to Mixing Time in Stochastic Optimization with Markovian Data
Ron Dorfman
Kfir Y. Levy
37
28
0
09 Feb 2022
Stochastic Gradient Descent with Dependent Data for Offline Reinforcement Learning
Jing-rong Dong
Xin T. Tong
OffRL
35
2
0
06 Feb 2022
Differentially Private Temporal Difference Learning with Stochastic Nonconvex-Strongly-Concave Optimization
Canzhe Zhao
Yanjie Ze
Jing Dong
Baoxiang Wang
Shuai Li
52
4
0
25 Jan 2022
Instance-Dependent Confidence and Early Stopping for Reinforcement Learning
K. Khamaru
Eric Xia
Martin J. Wainwright
Michael I. Jordan
37
5
0
21 Jan 2022
A Statistical Analysis of Polyak-Ruppert Averaged Q-learning
Xiang Li
Wenhao Yang
Jiadong Liang
Zhihua Zhang
Michael I. Jordan
40
15
0
29 Dec 2021
Control Theoretic Analysis of Temporal Difference Learning
Dong-hwan Lee
Do Wan Kim
24
1
0
29 Dec 2021
Accelerated and instance-optimal policy evaluation with linear function approximation
Tianjiao Li
Guanghui Lan
A. Pananjady
OffRL
37
13
0
24 Dec 2021
Recent Advances in Reinforcement Learning in Finance
B. Hambly
Renyuan Xu
Huining Yang
OffRL
29
167
0
08 Dec 2021
Robust and Adaptive Temporal-Difference Learning Using An Ensemble of Gaussian Processes
Qin Lu
G. Giannakis
GP
OffRL
11
4
0
01 Dec 2021
Finite-Time Error Bounds for Distributed Linear Stochastic Approximation
Yixuan Lin
V. Gupta
Ji Liu
32
3
0
24 Nov 2021
Stationary Behavior of Constant Stepsize SGD Type Algorithms: An Asymptotic Characterization
Zaiwei Chen
Shancong Mou
S. T. Maguluri
17
13
0
11 Nov 2021
A Concentration Bound for LSPE($λ$)
Siddharth Chandak
Vivek Borkar
H. Dolhare
35
0
0
04 Nov 2021
Actor-critic is implicitly biased towards high entropy optimal policies
Yuzheng Hu
Ziwei Ji
Matus Telgarsky
62
11
0
21 Oct 2021
Online Target Q-learning with Reverse Experience Replay: Efficiently finding the Optimal Policy for Linear MDPs
Naman Agarwal
Syomantak Chaudhuri
Prateek Jain
Dheeraj M. Nagaraj
Praneeth Netrapalli
OffRL
40
21
0
16 Oct 2021
PER-ETD: A Polynomially Efficient Emphatic Temporal Difference Learning Method
Ziwei Guan
Tengyu Xu
Yingbin Liang
13
4
0
13 Oct 2021
Learning to Coordinate in Multi-Agent Systems: A Coordinated Actor-Critic Algorithm and Finite-Time Guarantees
Siliang Zeng
Tianyi Chen
Alfredo García
Mingyi Hong
42
11
0
11 Oct 2021
Online Robust Reinforcement Learning with Model Uncertainty
Yue Wang
Shaofeng Zou
OOD
OffRL
76
97
0
29 Sep 2021
Sample and Communication-Efficient Decentralized Actor-Critic Algorithms with Finite-Time Analysis
Ziyi Chen
Yi Zhou
Rongrong Chen
Shaofeng Zou
19
24
0
08 Sep 2021
Truncated Emphatic Temporal Difference Methods for Prediction and Control
Shangtong Zhang
Shimon Whiteson
OffRL
17
11
0
11 Aug 2021
Online Bootstrap Inference For Policy Evaluation in Reinforcement Learning
Pratik Ramprasad
Yuantong Li
Zhuoran Yang
Zhaoran Wang
W. Sun
Guang Cheng
OffRL
50
27
0
08 Aug 2021
Mean-Field Multi-Agent Reinforcement Learning: A Decentralized Network Approach
Haotian Gu
Xin Guo
Xiaoli Wei
Renyuan Xu
OOD
42
36
0
05 Aug 2021
A Unified Off-Policy Evaluation Approach for General Value Function
Tengyu Xu
Zhuoran Yang
Zhaoran Wang
Yingbin Liang
OffRL
18
2
0
06 Jul 2021
Instance-optimality in optimal value estimation: Adaptivity via variance-reduced Q-learning
K. Khamaru
Eric Xia
Martin J. Wainwright
Michael I. Jordan
OffRL
20
19
0
28 Jun 2021
Concentration of Contractive Stochastic Approximation and Reinforcement Learning
Siddharth Chandak
Vivek Borkar
Parth Dodhia
43
17
0
27 Jun 2021
Tighter Analysis of Alternating Stochastic Gradient Method for Stochastic Nested Problems
Tianyi Chen
Yuejiao Sun
W. Yin
26
33
0
25 Jun 2021
Finite-Sample Analysis of Off-Policy TD-Learning via Generalized Bellman Operators
Zaiwei Chen
S. T. Maguluri
Sanjay Shakkottai
Karthikeyan Shanmugam
OffRL
31
10
0
24 Jun 2021
Analysis of a Target-Based Actor-Critic Algorithm with Linear Function Approximation
Anas Barakat
Pascal Bianchi
Julien Lehmann
32
9
0
14 Jun 2021
Linear Convergence of Entropy-Regularized Natural Policy Gradient with Linear Function Approximation
Semih Cayci
Niao He
R. Srikant
35
35
0
08 Jun 2021
Tight High Probability Bounds for Linear Stochastic Approximation with Fixed Stepsize
Alain Durmus
Eric Moulines
A. Naumov
S. Samsonov
Kevin Scaman
Hoi-To Wai
27
20
0
02 Jun 2021
Finite-Sample Analysis of Off-Policy Natural Actor-Critic with Linear Function Approximation
Zaiwei Chen
S. Khodadadian
S. T. Maguluri
OffRL
63
29
0
26 May 2021
Deeply-Debiased Off-Policy Interval Estimation
C. Shi
Runzhe Wan
Victor Chernozhukov
R. Song
OffRL
25
36
0
10 May 2021
On the Linear convergence of Natural Policy Gradient Algorithm
S. Khodadadian
P. Jhunjhunwala
Sushil Mahavir Varma
S. T. Maguluri
40
56
0
04 May 2021
Distributed TD(0) with Almost No Communication
R. Liu
Alexander Olshevsky
FedML
28
15
0
16 Apr 2021
Predictor-Corrector(PC) Temporal Difference(TD) Learning (PCTD)
C. Bowyer
24
1
0
15 Apr 2021
Non-Asymptotic Analysis for Two Time-scale TDC with General Smooth Function Approximation
Yue Wang
Shaofeng Zou
Yi Zhou
14
11
0
07 Apr 2021
Multi-Agent Off-Policy TD Learning: Finite-Time Analysis with Near-Optimal Sample Complexity and Communication Complexity
Ziyi Chen
Yi Zhou
Rongrong Chen
OffRL
24
7
0
24 Mar 2021
Sample Complexity and Overparameterization Bounds for Temporal Difference Learning with Neural Network Approximation
Semih Cayci
Siddhartha Satpathi
Niao He
R. Srikant
29
9
0
02 Mar 2021
Finite-Sample Analysis of Off-Policy Natural Actor-Critic Algorithm
S. Khodadadian
Zaiwei Chen
S. T. Maguluri
CML
OffRL
71
26
0
18 Feb 2021
Is Q-Learning Minimax Optimal? A Tight Sample Complexity Analysis
Gen Li
Changxiao Cai
Yuxin Chen
Yuting Wei
Yuejie Chi
OffRL
48
75
0
12 Feb 2021