A Statistical Analysis of Polyak-Ruppert Averaged Q-learning

29 December 2021

Wenhao Yang

Michael I. Jordan

Papers citing "A Statistical Analysis of Polyak-Ruppert Averaged Q-learning"

Title
A Piecewise Lyapunov Analysis of Sub-quadratic SGD: Applications to Robust and Quantile Regression; Yixuan Zhang, Dongyan, Yudong Chen, Qiaomin Xie; 19 0 0; 11 Apr 2025
Asymptotic Time-Uniform Inference for Parameters in Averaged Stochastic Approximation; Chuhan Xie, Kaicheng Jin, Jiadong Liang, Zhihua Zhang; 16 0 0; 19 Oct 2024
Maximum Entropy Reinforcement Learning via Energy-Based Normalizing Flow; Chen-Hao Chao, Chien Feng, Wei-Fang Sun, Cheng-Kuang Lee, Simon See, Chun-Yi Lee; 25 1 0; 22 May 2024
Efficient Reinforcement Learning for Global Decision Making in the Presence of Local Agents at Scale; Emile Anand, Guannan Qu; 36 5 0; 01 Mar 2024
Rates of Convergence in the Central Limit Theorem for Markov Chains, with an Application to TD Learning; R. Srikant; 25 5 0; 28 Jan 2024
Constant Stepsize Q-learning: Distributional Convergence, Bias and Extrapolation; Yixuan Zhang, Qiaomin Xie; 11 4 0; 25 Jan 2024
Tight Finite Time Bounds of Two-Time-Scale Linear Stochastic Approximation with Markovian Noise; Shaan ul Haque, S. Khodadadian, S. T. Maguluri; 37 11 0; 31 Dec 2023
Estimation and Inference in Distributional Reinforcement Learning; Liangyu Zhang, Yang Peng, Jiadong Liang, Wenhao Yang, Zhihua Zhang; OffRL; 13 1 0; 29 Sep 2023
Online covariance estimation for stochastic gradient descent under Markovian sampling; Abhishek Roy, Krishnakumar Balasubramanian; 8 5 0; 03 Aug 2023
Functional Central Limit Theorem for Two Timescale Stochastic Approximation; Fathima Zarin Faizal, Vivek Borkar; 6 3 0; 09 Jun 2023
Sample Complexity of Variance-reduced Distributionally Robust Q-learning; Shengbo Wang, Nian Si, Jose H. Blanchet, Zhengyuan Zhou; OOD; 8 12 0; 28 May 2023
Variance-aware robust reinforcement learning with linear function approximation under heavy-tailed rewards; Xiang Li, Qiang Sun; 11 8 0; 09 Mar 2023
Optimal Sample Complexity of Reinforcement Learning for Mixing Discounted Markov Decision Processes; Shengbo Wang, Jose H. Blanchet, Peter Glynn; 13 4 0; 15 Feb 2023
Double Reinforcement Learning for Efficient Off-Policy Evaluation in Markov Decision Processes; Nathan Kallus, Masatoshi Uehara; OffRL; 31 180 0; 22 Aug 2019