Finite time analysis of temporal difference learning with linear
function approximation: Tail averaging and regularisation

12 October 2022

Dheeraj M. Nagaraj

Papers citing "Finite time analysis of temporal difference learning with linear function approximation: Tail averaging and regularisation"

Title | Authors | Tag | Counts | Date
Convergence of TD(0) under Polynomial Mixing with Nonlinear Function Approximation | Anupama Sridhar, Alexander Johansen | | 40 0 0 | 08 Feb 2025
A Finite-Sample Analysis of an Actor-Critic Algorithm for Mean-Variance Optimization in a Discounted MDP | Tejaram Sangadi, L. A. Prashanth, Krishna Jagannathan | | 15 0 0 | 12 Jun 2024
Gaussian Approximation and Multiplier Bootstrap for Polyak-Ruppert Averaged Linear Stochastic Approximation with Applications to TD Learning | S. Samsonov, Eric Moulines, Qi-Man Shao, Zhuo-Song Zhang, Alexey Naumov | | 33 4 0 | 26 May 2024
SCAFFLSA: Taming Heterogeneity in Federated Linear Stochastic Approximation and TD Learning | Paul Mangold, S. Samsonov, Safwan Labbi, I. Levin, Réda Alami, Alexey Naumov, Eric Moulines | | 38 1 0 | 06 Feb 2024
Central Limit Theorem for Two-Timescale Stochastic Approximation with Markovian Noise: Theory and Applications | Jie Hu, Vishwaraj Doshi, Do Young Eun | | 38 4 0 | 17 Jan 2024
A Concentration Bound for TD(0) with Function Approximation | Siddharth Chandak, Vivek Borkar | | 37 0 0 | 16 Dec 2023
Improved High-Probability Bounds for the Temporal Difference Learning Algorithm via Exponential Stability | S. Samsonov, D. Tiapkin, Alexey Naumov, Eric Moulines | | 34 5 0 | 22 Oct 2023
Loss Dynamics of Temporal Difference Reinforcement Learning | Blake Bordelon, P. Masset, Henry Kuo, Cengiz Pehlevan | AI4CE | 23 0 0 | 10 Jul 2023
Federated TD Learning over Finite-Rate Erasure Channels: Linear Speedup under Markovian Sampling | Nicolò Dal Fabbro, A. Mitra, George J. Pappas | FedML | 37 12 0 | 14 May 2023
The ODE Method for Asymptotic Statistics in Stochastic Approximation and Reinforcement Learning | Vivek Borkar, Shuhang Chen, Adithya M. Devraj, Ioannis Kontoyiannis, Sean P. Meyn | | 24 31 0 | 27 Oct 2021
Online Target Q-learning with Reverse Experience Replay: Efficiently finding the Optimal Policy for Linear MDPs | Naman Agarwal, Syomantak Chaudhuri, Prateek Jain, Dheeraj M. Nagaraj, Praneeth Netrapalli | OffRL | 40 21 0 | 16 Oct 2021