1906.10306
Cited By
Neural Proximal/Trust Region Policy Optimization Attains Globally Optimal Policy
25 June 2019
Boyi Liu
Qi Cai
Zhuoran Yang
Zhaoran Wang
Papers citing "Neural Proximal/Trust Region Policy Optimization Attains Globally Optimal Policy"
22 / 22 papers shown
Provably Robust Temporal Difference Learning for Heavy-Tailed Rewards
Semih Cayci
A. Eryilmaz
18
2
0
20 Jun 2023
Graphon Mean-Field Control for Cooperative Multi-Agent Reinforcement Learning
Yuanquan Hu
Xiaoli Wei
Jun Yan
Heng-Wei Zhang
32
8
0
11 Sep 2022
Provably Efficient Fictitious Play Policy Optimization for Zero-Sum Markov Games with Structured Transitions
Shuang Qiu
Xiaohan Wei
Jieping Ye
Zhaoran Wang
Zhuoran Yang
OffRL
11
11
0
25 Jul 2022
Mirror Learning: A Unifying Framework of Policy Optimisation
J. Kuba
Christian Schroeder de Witt
Jakob N. Foerster
13
24
0
07 Jan 2022
Differentially Private Regret Minimization in Episodic Markov Decision Processes
Sayak Ray Chowdhury
Xingyu Zhou
21
21
0
20 Dec 2021
Provable Benefits of Actor-Critic Methods for Offline Reinforcement Learning
Andrea Zanette
Martin J. Wainwright
Emma Brunskill
OffRL
29
111
0
19 Aug 2021
Towards General Function Approximation in Zero-Sum Markov Games
Baihe Huang
Jason D. Lee
Zhaoran Wang
Zhuoran Yang
25
47
0
30 Jul 2021
Finite-Sample Analysis of Off-Policy Natural Actor-Critic with Linear Function Approximation
Zaiwei Chen
S. Khodadadian
S. T. Maguluri
OffRL
52
29
0
26 May 2021
On the Linear convergence of Natural Policy Gradient Algorithm
S. Khodadadian
P. Jhunjhunwala
Sushil Mahavir Varma
S. T. Maguluri
30
56
0
04 May 2021
Cautiously Optimistic Policy Optimization and Exploration with Linear Function Approximation
Andrea Zanette
Ching-An Cheng
Alekh Agarwal
32
52
0
24 Mar 2021
Finite-Sample Analysis of Off-Policy Natural Actor-Critic Algorithm
S. Khodadadian
Zaiwei Chen
S. T. Maguluri
CML
OffRL
69
26
0
18 Feb 2021
CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee
Tengyu Xu
Yingbin Liang
Guanghui Lan
34
121
0
11 Nov 2020
Proximal Policy Optimization via Enhanced Exploration Efficiency
Junwei Zhang
Zhenghao Zhang
Shuai Han
Shuai Lu
19
41
0
11 Nov 2020
Single and Multi-Agent Deep Reinforcement Learning for AI-Enabled Wireless Networks: A Tutorial
Amal Feriani
E. Hossain
22
236
0
06 Nov 2020
Sample Efficient Reinforcement Learning with REINFORCE
Junzi Zhang
Jongho Kim
Brendan O'Donoghue
Stephen P. Boyd
35
99
0
22 Oct 2020
Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy
Zuyue Fu
Zhuoran Yang
Zhaoran Wang
15
42
0
02 Aug 2020
Generative Adversarial Imitation Learning with Neural Networks: Global Optimality and Convergence Rate
Yufeng Zhang
Qi Cai
Zhuoran Yang
Zhaoran Wang
108
12
0
08 Mar 2020
Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms
K. Zhang
Zhuoran Yang
Tamer Basar
36
1,178
0
24 Nov 2019
Convergent Policy Optimization for Safe Reinforcement Learning
Ming Yu
Zhuoran Yang
Mladen Kolar
Zhaoran Wang
16
91
0
26 Oct 2019
Sample Efficient Policy Gradient Methods with Recursive Variance Reduction
Pan Xu
F. Gao
Quanquan Gu
23
83
0
18 Sep 2019
On the Theory of Policy Gradient Methods: Optimality, Approximation, and Distribution Shift
Alekh Agarwal
Sham Kakade
J. Lee
G. Mahajan
11
315
0
01 Aug 2019
Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies
K. Zhang
Alec Koppel
Haoqi Zhu
Tamer Basar
28
186
0
19 Jun 2019