Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch

4 November 2021

Papers citing "Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch"

| Title | Authors | Tags | | | | Date |
|---|---|---|---|---|---|---|
| A Finite-Sample Analysis of Payoff-Based Independent Learning in Zero-Sum Stochastic Games | Zaiwei Chen, Kaipeng Zhang, Eric Mazumdar, Asuman Ozdaglar, Adam Wierman | | 48 | 6 | 0 | 03 Mar 2023 |
| Global Convergence of Localized Policy Iteration in Networked Multi-Agent Reinforcement Learning | Yizhou Zhang, Guannan Qu, Pan Xu, Yiheng Lin, Zaiwei Chen, Adam Wierman | | 34 | 25 | 0 | 30 Nov 2022 |
| Robust Constrained Reinforcement Learning | Yue Wang, Fei Miao, Shaofeng Zou | | 37 | 12 | 0 | 14 Sep 2022 |
| Policy Gradient Method For Robust Reinforcement Learning | Yue Wang, Shaofeng Zou | | 81 | 67 | 0 | 15 May 2022 |
| Beyond the Policy Gradient Theorem for Efficient Policy Updates in Actor-Critic Algorithms | Romain Laroche, Rémi Tachet des Combes | | 40 | 2 | 0 | 15 Feb 2022 |
| On the Convergence of SARSA with Linear Function Approximation | Shangtong Zhang, Rémi Tachet des Combes, Romain Laroche | | 11 | 10 | 0 | 14 Feb 2022 |
| STOPS: Short-Term-based Volatility-controlled Policy Search and its Global Convergence | Liang Xu, Daoming Lyu, Yangchen Pan, Aiwen Jiang, Bo Liu | | 28 | 0 | 0 | 24 Jan 2022 |
| Truncated Emphatic Temporal Difference Methods for Prediction and Control | Shangtong Zhang, Shimon Whiteson | OffRL | 13 | 11 | 0 | 11 Aug 2021 |
| Finite-Sample Analysis of Off-Policy Natural Actor-Critic Algorithm | S. Khodadadian, Zaiwei Chen, S. T. Maguluri | CML, OffRL | 71 | 26 | 0 | 18 Feb 2021 |
| A Finite Time Analysis of Two Time-Scale Actor Critic Methods | Yue Wu, Weitong Zhang, Pan Xu, Quanquan Gu | | 90 | 146 | 0 | 04 May 2020 |
| On the Sample Complexity of Actor-Critic Method for Reinforcement Learning with Function Approximation | Harshat Kumar, Alec Koppel, Alejandro Ribeiro | | 102 | 79 | 0 | 18 Oct 2019 |