Reanalysis of Variance Reduced Temporal Difference Learning
International Conference on Learning Representations (ICLR), 2020
7 January 2020
Tengyu Xu
Zhe Wang
Yi Zhou
Yingbin Liang
    OffRL

Papers citing "Reanalysis of Variance Reduced Temporal Difference Learning"

28 / 28 papers shown
FIRM: Federated In-client Regularized Multi-objective Alignment for Large Language Models
Fatemeh Nourzad
Amirhossein Roknilamouki
Eylem Ekici
Ness B. Shroff
FedML
186
0
0
21 Nov 2025
Q-Learning with Fine-Grained Gap-Dependent Regret
Haochen Zhang
Zhong Zheng
Lingzhou Xue
68
0
0
08 Oct 2025
KETCHUP: K-Step Return Estimation for Sequential Knowledge Distillation
Jiabin Fan
Guoqing Luo
Michael Bowling
Lili Mou
OffRL
351
0
0
26 Apr 2025
Gap-Dependent Bounds for Q-Learning using Reference-Advantage Decomposition
International Conference on Learning Representations (ICLR), 2024
Zhong Zheng
Haochen Zhang
Lingzhou Xue
OffRL
330
8
0
10 Oct 2024
Regret-Optimal Model-Free Reinforcement Learning for Discounted MDPs with Short Burn-In Time
Neural Information Processing Systems (NeurIPS), 2023
Xiang Ji
Gen Li
OffRL
263
7
0
24 May 2023
n-Step Temporal Difference Learning with Optimal n
Lakshmi Mandal
S. Bhatnagar
219
2
0
13 Mar 2023
Closing the gap between SVRG and TD-SVRG with Gradient Splitting
Arsenii Mustafin
Alexander Olshevsky
I. Paschalidis
125
2
0
29 Nov 2022
The Efficacy of Pessimism in Asynchronous Q-Learning
IEEE Transactions on Information Theory (IEEE Trans. Inf. Theory), 2022
Yuling Yan
Gen Li
Yuxin Chen
Jianqing Fan
OffRL
234
43
0
14 Mar 2022
Pessimistic Q-Learning for Offline Reinforcement Learning: Towards Optimal Sample Complexity
International Conference on Machine Learning (ICML), 2022
Laixi Shi
Gen Li
Yuting Wei
Yuxin Chen
Yuejie Chi
OffRL
242
101
0
28 Feb 2022
Instance-Dependent Confidence and Early Stopping for Reinforcement Learning
Journal of Machine Learning Research (JMLR), 2022
K. Khamaru
Eric Xia
Martin J. Wainwright
Sai Li
152
6
0
21 Jan 2022
Accelerated and instance-optimal policy evaluation with linear function approximation
SIAM Journal on Mathematics of Data Science (SIMODS), 2021
Tianjiao Li
Guanghui Lan
A. Pananjady
OffRL
183
15
0
24 Dec 2021
Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2021
Gen Li
Laixi Shi
Yuxin Chen
Yuejie Chi
OffRL
288
62
0
09 Oct 2021
Online Bootstrap Inference For Policy Evaluation in Reinforcement Learning
Pratik Ramprasad
Yuantong Li
Zhuoran Yang
Zhaoran Wang
W. Sun
Guang Cheng
OffRL
252
38
0
08 Aug 2021
Tighter Analysis of Alternating Stochastic Gradient Method for Stochastic Nested Problems
Tianyi Chen
Yuejiao Sun
W. Yin
152
36
0
25 Jun 2021
Greedy-GQ with Variance Reduction: Finite-time Analysis and Improved Complexity
International Conference on Learning Representations (ICLR), 2021
Shaocong Ma
Ziyi Chen
Yi Zhou
Shaofeng Zou
156
12
0
30 Mar 2021
Multi-Agent Off-Policy TD Learning: Finite-Time Analysis with Near-Optimal Sample Complexity and Communication Complexity
Ziyi Chen
Yi Zhou
Rongrong Chen
OffRL
159
7
0
24 Mar 2021
Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality
International Conference on Machine Learning (ICML), 2021
Tengyu Xu
Zhuoran Yang
Zhaoran Wang
Yingbin Liang
OffRL
255
28
0
23 Feb 2021
Is Q-Learning Minimax Optimal? A Tight Sample Complexity Analysis
Operational Research (OR), 2021
Gen Li
Changxiao Cai
Ee
Yuting Wei
Yuejie Chi
OffRL
258
85
0
12 Feb 2021
Sample Complexity Bounds for Two Timescale Value-based Reinforcement Learning Algorithms
Tengyu Xu
Yingbin Liang
216
27
0
10 Nov 2020
Temporal Difference Learning as Gradient Splitting
International Conference on Machine Learning (ICML), 2020
Rui Liu
Alexander Olshevsky
140
16
0
27 Oct 2020
Variance-Reduced Off-Policy TDC Learning: Non-Asymptotic Convergence Analysis
Neural Information Processing Systems (NeurIPS), 2020
Shaocong Ma
Yi Zhou
Shaofeng Zou
OffRL
266
17
0
26 Oct 2020
Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth Nonlinear TD Learning
Delin Qu
Zhuoran Yang
Xiaohan Wei
Jieping Ye
Zhaoran Wang
380
38
0
23 Aug 2020
When Will Generative Adversarial Imitation Learning Algorithms Attain Global Convergence
International Conference on Artificial Intelligence and Statistics (AISTATS), 2020
Ziwei Guan
Tengyu Xu
Yingbin Liang
193
17
0
24 Jun 2020
Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and Variance Reduction
Gen Li
Yuting Wei
Yuejie Chi
Yuantao Gu
Yuxin Chen
OffRL
372
125
0
04 Jun 2020
Non-asymptotic Convergence Analysis of Two Time-scale (Natural) Actor-Critic Algorithms
Tengyu Xu
Zhe Wang
Yingbin Liang
224
63
0
07 May 2020
Improving Sample Complexity Bounds for (Natural) Actor-Critic Algorithms
Tengyu Xu
Zhe Wang
Yingbin Liang
239
25
0
27 Apr 2020
Is Temporal Difference Learning Optimal? An Instance-Dependent Analysis
SIAM Journal on Mathematics of Data Science (SIMODS), 2020
K. Khamaru
A. Pananjady
Feng Ruan
Martin J. Wainwright
Sai Li
OffRL
155
51
0
16 Mar 2020
Finite-Sample Analysis of Decentralized Temporal-Difference Learning with Linear Function Approximation
Jun Sun
Gang Wang
G. Giannakis
Qinmin Yang
Zaiyue Yang
OffRL
199
21
0
03 Nov 2019