ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2001.01898
  4. Cited By
Reanalysis of Variance Reduced Temporal Difference Learning
v1v2 (latest)

Reanalysis of Variance Reduced Temporal Difference Learning

7 January 2020
Tengyu Xu
Zhe Wang
Yi Zhou
Yingbin Liang
    OffRL
ArXiv (abs)PDFHTML

Papers citing "Reanalysis of Variance Reduced Temporal Difference Learning"

19 / 19 papers shown
Title
KETCHUP: K-Step Return Estimation for Sequential Knowledge Distillation
KETCHUP: K-Step Return Estimation for Sequential Knowledge Distillation
Jiabin Fan
Guoqing Luo
Michael Bowling
Lili Mou
OffRL
137
0
0
26 Apr 2025
Gap-Dependent Bounds for Q-Learning using Reference-Advantage Decomposition
Gap-Dependent Bounds for Q-Learning using Reference-Advantage Decomposition
Zhong Zheng
Haochen Zhang
Lingzhou Xue
OffRL
136
4
0
10 Oct 2024
The Efficacy of Pessimism in Asynchronous Q-Learning
The Efficacy of Pessimism in Asynchronous Q-Learning
Yuling Yan
Gen Li
Yuxin Chen
Jianqing Fan
OffRL
144
41
0
14 Mar 2022
Pessimistic Q-Learning for Offline Reinforcement Learning: Towards
  Optimal Sample Complexity
Pessimistic Q-Learning for Offline Reinforcement Learning: Towards Optimal Sample Complexity
Laixi Shi
Gen Li
Yuting Wei
Yuxin Chen
Yuejie Chi
OffRL
99
96
0
28 Feb 2022
Accelerated and instance-optimal policy evaluation with linear function
  approximation
Accelerated and instance-optimal policy evaluation with linear function approximation
Tianjiao Li
Guanghui Lan
A. Pananjady
OffRL
84
13
0
24 Dec 2021
Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free
  Reinforcement Learning
Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free Reinforcement Learning
Gen Li
Laixi Shi
Yuxin Chen
Yuejie Chi
OffRL
94
54
0
09 Oct 2021
Online Bootstrap Inference For Policy Evaluation in Reinforcement
  Learning
Online Bootstrap Inference For Policy Evaluation in Reinforcement Learning
Pratik Ramprasad
Yuantong Li
Zhuoran Yang
Zhaoran Wang
W. Sun
Guang Cheng
OffRL
134
28
0
08 Aug 2021
Tighter Analysis of Alternating Stochastic Gradient Method for
  Stochastic Nested Problems
Tighter Analysis of Alternating Stochastic Gradient Method for Stochastic Nested Problems
Tianyi Chen
Yuejiao Sun
W. Yin
80
33
0
25 Jun 2021
Multi-Agent Off-Policy TD Learning: Finite-Time Analysis with
  Near-Optimal Sample Complexity and Communication Complexity
Multi-Agent Off-Policy TD Learning: Finite-Time Analysis with Near-Optimal Sample Complexity and Communication Complexity
Ziyi Chen
Yi Zhou
Rongrong Chen
OffRL
78
7
0
24 Mar 2021
Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality
Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality
Tengyu Xu
Zhuoran Yang
Zhaoran Wang
Yingbin Liang
OffRL
104
25
0
23 Feb 2021
Is Q-Learning Minimax Optimal? A Tight Sample Complexity Analysis
Is Q-Learning Minimax Optimal? A Tight Sample Complexity Analysis
Gen Li
Changxiao Cai
Ee
Yuting Wei
Yuejie Chi
OffRL
98
76
0
12 Feb 2021
Temporal Difference Learning as Gradient Splitting
Temporal Difference Learning as Gradient Splitting
Rui Liu
Alexander Olshevsky
54
16
0
27 Oct 2020
Variance-Reduced Off-Policy TDC Learning: Non-Asymptotic Convergence
  Analysis
Variance-Reduced Off-Policy TDC Learning: Non-Asymptotic Convergence Analysis
Shaocong Ma
Yi Zhou
Shaofeng Zou
OffRL
72
15
0
26 Oct 2020
Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth
  Nonlinear TD Learning
Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth Nonlinear TD Learning
Shuang Qiu
Zhuoran Yang
Xiaohan Wei
Jieping Ye
Zhaoran Wang
103
38
0
23 Aug 2020
Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and
  Variance Reduction
Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and Variance Reduction
Gen Li
Yuting Wei
Yuejie Chi
Yuantao Gu
Yuxin Chen
OffRL
147
119
0
04 Jun 2020
Non-asymptotic Convergence Analysis of Two Time-scale (Natural)
  Actor-Critic Algorithms
Non-asymptotic Convergence Analysis of Two Time-scale (Natural) Actor-Critic Algorithms
Tengyu Xu
Zhe Wang
Yingbin Liang
97
58
0
07 May 2020
Improving Sample Complexity Bounds for (Natural) Actor-Critic Algorithms
Improving Sample Complexity Bounds for (Natural) Actor-Critic Algorithms
Tengyu Xu
Zhe Wang
Yingbin Liang
73
25
0
27 Apr 2020
Is Temporal Difference Learning Optimal? An Instance-Dependent Analysis
Is Temporal Difference Learning Optimal? An Instance-Dependent Analysis
K. Khamaru
A. Pananjady
Feng Ruan
Martin J. Wainwright
Michael I. Jordan
OffRL
77
49
0
16 Mar 2020
Finite-Sample Analysis of Decentralized Temporal-Difference Learning
  with Linear Function Approximation
Finite-Sample Analysis of Decentralized Temporal-Difference Learning with Linear Function Approximation
Jun Sun
Gang Wang
G. Giannakis
Qinmin Yang
Zaiyue Yang
OffRL
81
20
0
03 Nov 2019
1