Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2111.02997
Cited By
Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch
4 November 2021
Shangtong Zhang
Rémi Tachet des Combes
Romain Laroche
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch"
11 / 11 papers shown
Title
A Finite-Sample Analysis of Payoff-Based Independent Learning in Zero-Sum Stochastic Games
Zaiwei Chen
Kaipeng Zhang
Eric Mazumdar
Asuman Ozdaglar
Adam Wierman
48
6
0
03 Mar 2023
Global Convergence of Localized Policy Iteration in Networked Multi-Agent Reinforcement Learning
Yizhou Zhang
Guannan Qu
Pan Xu
Yiheng Lin
Zaiwei Chen
Adam Wierman
34
25
0
30 Nov 2022
Robust Constrained Reinforcement Learning
Yue Wang
Fei Miao
Shaofeng Zou
37
12
0
14 Sep 2022
Policy Gradient Method For Robust Reinforcement Learning
Yue Wang
Shaofeng Zou
81
67
0
15 May 2022
Beyond the Policy Gradient Theorem for Efficient Policy Updates in Actor-Critic Algorithms
Romain Laroche
Rémi Tachet des Combes
40
2
0
15 Feb 2022
On the Convergence of SARSA with Linear Function Approximation
Shangtong Zhang
Rémi Tachet des Combes
Romain Laroche
11
10
0
14 Feb 2022
STOPS: Short-Term-based Volatility-controlled Policy Search and its Global Convergence
Liang Xu
Daoming Lyu
Yangchen Pan
Aiwen Jiang
Bo Liu
28
0
0
24 Jan 2022
Truncated Emphatic Temporal Difference Methods for Prediction and Control
Shangtong Zhang
Shimon Whiteson
OffRL
13
11
0
11 Aug 2021
Finite-Sample Analysis of Off-Policy Natural Actor-Critic Algorithm
S. Khodadadian
Zaiwei Chen
S. T. Maguluri
CML
OffRL
71
26
0
18 Feb 2021
A Finite Time Analysis of Two Time-Scale Actor Critic Methods
Yue Wu
Weitong Zhang
Pan Xu
Quanquan Gu
90
146
0
04 May 2020
On the Sample Complexity of Actor-Critic Method for Reinforcement Learning with Function Approximation
Harshat Kumar
Alec Koppel
Alejandro Ribeiro
102
79
0
18 Oct 2019
1