1806.02450
A Finite Time Analysis of Temporal Difference Learning With Linear Function Approximation
6 June 2018
Jalaj Bhandari
Daniel Russo
Raghav Singal
Papers citing
"A Finite Time Analysis of Temporal Difference Learning With Linear Function Approximation"
50 / 223 papers shown
Title
Learning Optimal Admission Control in Partially Observable Queueing Networks
Jonatha Anselmi
B. Gaujal
Louis-Sébastien Rebuffi
34
1
0
04 Aug 2023
Online covariance estimation for stochastic gradient descent under Markovian sampling
Abhishek Roy
Krishnakumar Balasubramanian
24
5
0
03 Aug 2023
Loss Dynamics of Temporal Difference Reinforcement Learning
Blake Bordelon
P. Masset
Henry Kuo
Cengiz Pehlevan
AI4CE
21
0
0
10 Jul 2023
TD Convergence: An Optimization Perspective
Kavosh Asadi
Shoham Sabach
Yao Liu
Omer Gottesman
Rasool Fakoor
MU
20
8
0
30 Jun 2023
Provably Robust Temporal Difference Learning for Heavy-Tailed Rewards
Semih Cayci
A. Eryilmaz
23
2
0
20 Jun 2023
Warm-Start Actor-Critic: From Approximation Error to Sub-optimality Gap
Hang Wang
Sen Lin
Junshan Zhang
OffRL
OnRL
33
3
0
20 Jun 2023
Finite-Time Analysis of Minimax Q-Learning for Two-Player Zero-Sum Markov Games: Switching System Approach
Dong-hwan Lee
21
2
0
09 Jun 2023
High-probability sample complexities for policy evaluation with linear function approximation
Gen Li
Weichen Wu
Yuejie Chi
Cong Ma
Alessandro Rinaldo
Yuting Wei
OffRL
30
7
0
30 May 2023
First Order Methods with Markovian Noise: from Acceleration to Variational Inequalities
Aleksandr Beznosikov
S. Samsonov
Marina Sheshukova
Alexander Gasnikov
A. Naumov
Eric Moulines
52
14
0
25 May 2023
Adaptive Policy Learning to Additional Tasks
Wenjian Hao
Zehui Lu
Zihao Liang
Tianyu Zhou
Shaoshuai Mou
32
0
0
24 May 2023
Optimal Control of Nonlinear Systems with Unknown Dynamics
Wenjian Hao
Paulo Heredia
Bowen Huang
42
1
0
24 May 2023
Federated TD Learning over Finite-Rate Erasure Channels: Linear Speedup under Markovian Sampling
Nicolò Dal Fabbro
A. Mitra
George J. Pappas
FedML
35
12
0
14 May 2023
Quantile-Based Deep Reinforcement Learning using Two-Timescale Policy Gradient Algorithms
Jinyang Jiang
Jiaqiao Hu
Yijie Peng
18
2
0
12 May 2023
Streaming PCA for Markovian Data
Syamantak Kumar
Purnamrita Sarkar
50
6
0
03 May 2023
Concentration of Contractive Stochastic Approximation: Additive and Multiplicative Noise
Zaiwei Chen
S. T. Maguluri
Martin Zubeldia
24
7
0
28 Mar 2023
n-Step Temporal Difference Learning with Optimal n
Lakshmi Mandal
S. Bhatnagar
29
2
0
13 Mar 2023
FaaSched: A Jitter-Aware Serverless Scheduler
Abhisek Panda
S. Sarangi
39
0
0
11 Mar 2023
Convergence Rates for Localized Actor-Critic in Networked Markov Potential Games
Zhaoyi Zhou
Zaiwei Chen
Yiheng Lin
Adam Wierman
46
7
0
08 Mar 2023
Policy Mirror Descent Inherently Explores Action Space
Yan Li
Guanghui Lan
OffRL
58
8
0
08 Mar 2023
A Finite-Sample Analysis of Payoff-Based Independent Learning in Zero-Sum Stochastic Games
Zaiwei Chen
Kaipeng Zhang
Eric Mazumdar
Asuman Ozdaglar
Adam Wierman
54
6
0
03 Mar 2023
Gauss-Newton Temporal Difference Learning with Nonlinear Function Approximation
Zhifa Ke
Junyu Zhang
Zaiwen Wen
24
0
0
25 Feb 2023
Why Target Networks Stabilise Temporal Difference Methods
Matt Fellows
Matthew Smith
Shimon Whiteson
OOD
AAML
21
7
0
24 Feb 2023
Stochastic Approximation Beyond Gradient for Signal Processing and Machine Learning
Aymeric Dieuleveut
G. Fort
Eric Moulines
Hoi-To Wai
59
12
0
22 Feb 2023
When Demonstrations Meet Generative World Models: A Maximum Likelihood Framework for Offline Inverse Reinforcement Learning
Siliang Zeng
Chenliang Li
Alfredo García
Mingyi Hong
OffRL
34
13
0
15 Feb 2023
Federated Temporal Difference Learning with Linear Function Approximation under Environmental Heterogeneity
Han Wang
A. Mitra
Hamed Hassani
George J. Pappas
James Anderson
FedML
31
21
0
04 Feb 2023
On the Statistical Benefits of Temporal Difference Learning
David Cheikhi
Daniel Russo
13
4
0
30 Jan 2023
Beyond Exponentially Fast Mixing in Average-Reward Reinforcement Learning via Multi-Level Monte Carlo Actor-Critic
Wesley A Suttle
Amrit Singh Bedi
Bhrij Patel
Brian M Sadler
Alec Koppel
Dinesh Manocha
29
14
0
28 Jan 2023
On The Convergence Of Policy Iteration-Based Reinforcement Learning With Monte Carlo Policy Evaluation
Anna Winnicki
R. Srikant
14
9
0
23 Jan 2023
Value Enhancement of Reinforcement Learning via Efficient and Robust Trust Region Optimization
C. Shi
Zhengling Qi
Jianing Wang
Fan Zhou
OffRL
33
4
0
05 Jan 2023
Temporal Difference Learning with Compressed Updates: Error-Feedback meets Reinforcement Learning
A. Mitra
George J. Pappas
Hamed Hassani
31
12
0
03 Jan 2023
Closing the gap between SVRG and TD-SVRG with Gradient Splitting
Arsenii Mustafin
Alexander Olshevsky
I. Paschalidis
19
1
0
29 Nov 2022
Krylov-Bellman boosting: Super-linear policy evaluation in general state spaces
Eric Xia
Martin J. Wainwright
OffRL
19
2
0
20 Oct 2022
Finite-time analysis of single-timescale actor-critic
Xu-yang Chen
Lin Zhao
OffRL
29
21
0
18 Oct 2022
Finite time analysis of temporal difference learning with linear function approximation: Tail averaging and regularisation
Gandharv Patil
Prashanth L.A.
Dheeraj M. Nagaraj
Doina Precup
11
15
0
12 Oct 2022
Maximum-Likelihood Inverse Reinforcement Learning with Finite-Time Guarantees
Siliang Zeng
Chenliang Li
Alfredo García
Mingyi Hong
34
42
0
04 Oct 2022
Structural Estimation of Markov Decision Processes in High-Dimensional State Space with Finite-Time Guarantees
Siliang Zeng
Mingyi Hong
Alfredo García
OffRL
33
12
0
04 Oct 2022
Bias and Extrapolation in Markovian Linear Stochastic Approximation with Constant Stepsizes
D. Huo
Yudong Chen
Qiaomin Xie
26
17
0
03 Oct 2022
Linear Convergence for Natural Policy Gradient with Log-linear Policy Parametrization
Carlo Alfano
Patrick Rebeschini
57
13
0
30 Sep 2022
Finite-Time Error Bounds for Greedy-GQ
Yue Wang
Yi Zhou
Shaofeng Zou
34
1
0
06 Sep 2022
Global Convergence of Two-timescale Actor-Critic for Solving Linear Quadratic Regulator
Xu-yang Chen
Jingliang Duan
Yingbin Liang
Lin Zhao
32
6
0
18 Aug 2022
An Approximate Policy Iteration Viewpoint of Actor-Critic Algorithms
Zaiwei Chen
S. T. Maguluri
28
0
0
05 Aug 2022
Finite-Time Analysis of Asynchronous Q-learning under Diminishing Step-Size from Control-Theoretic View
Han-Dong Lim
Dong-hwan Lee
30
1
0
25 Jul 2022
Actor-Critic based Improper Reinforcement Learning
Mohammadi Zaki
Avinash Mohan
Aditya Gopalan
Shie Mannor
21
2
0
19 Jul 2022
Finite-time High-probability Bounds for Polyak-Ruppert Averaged Iterates of Linear Stochastic Approximation
Alain Durmus
Eric Moulines
A. Naumov
S. Samsonov
29
24
0
10 Jul 2022
Constrained Stochastic Nonconvex Optimization with State-dependent Markov Data
Abhishek Roy
Krishnakumar Balasubramanian
Saeed Ghadimi
16
9
0
22 Jun 2022
A Single-Timescale Analysis For Stochastic Approximation With Multiple Coupled Sequences
Han Shen
Tianyi Chen
47
15
0
21 Jun 2022
Finite-Time Analysis of Fully Decentralized Single-Timescale Actor-Critic
Qijun Luo
Xiao Li
32
1
0
12 Jun 2022
Stabilizing Q-learning with Linear Architectures for Provably Efficient Learning
Andrea Zanette
Martin J. Wainwright
OOD
38
5
0
01 Jun 2022
Policy Gradient Method For Robust Reinforcement Learning
Yue Wang
Shaofeng Zou
81
69
0
15 May 2022
Stochastic first-order methods for average-reward Markov decision processes
Tianjiao Li
Feiyang Wu
Guanghui Lan
27
14
0
11 May 2022