ResearchTrend.AI
Neural Proximal/Trust Region Policy Optimization Attains Globally Optimal Policy

25 June 2019
Boyi Liu
Qi Cai
Zhuoran Yang
Zhaoran Wang

Papers citing "Neural Proximal/Trust Region Policy Optimization Attains Globally Optimal Policy"

24 / 24 papers shown
Provably Robust Temporal Difference Learning for Heavy-Tailed Rewards
Semih Cayci
A. Eryilmaz
18
2
0
20 Jun 2023
Graphon Mean-Field Control for Cooperative Multi-Agent Reinforcement Learning
Yuanquan Hu
Xiaoli Wei
Jun Yan
Heng-Wei Zhang
32
8
0
11 Sep 2022
Provably Efficient Fictitious Play Policy Optimization for Zero-Sum Markov Games with Structured Transitions
Shuang Qiu
Xiaohan Wei
Jieping Ye
Zhaoran Wang
Zhuoran Yang
OffRL
13
11
0
25 Jul 2022
Mirror Learning: A Unifying Framework of Policy Optimisation
J. Kuba
Christian Schroeder de Witt
Jakob N. Foerster
13
24
0
07 Jan 2022
Differentially Private Regret Minimization in Episodic Markov Decision Processes
Sayak Ray Chowdhury
Xingyu Zhou
21
21
0
20 Dec 2021
Provable Benefits of Actor-Critic Methods for Offline Reinforcement Learning
Andrea Zanette
Martin J. Wainwright
Emma Brunskill
OffRL
29
111
0
19 Aug 2021
Towards General Function Approximation in Zero-Sum Markov Games
Baihe Huang
Jason D. Lee
Zhaoran Wang
Zhuoran Yang
25
47
0
30 Jul 2021
Finite-Sample Analysis of Off-Policy Natural Actor-Critic with Linear Function Approximation
Zaiwei Chen
S. Khodadadian
S. T. Maguluri
OffRL
52
29
0
26 May 2021
On the Linear convergence of Natural Policy Gradient Algorithm
S. Khodadadian
P. Jhunjhunwala
Sushil Mahavir Varma
S. T. Maguluri
30
56
0
04 May 2021
Cautiously Optimistic Policy Optimization and Exploration with Linear Function Approximation
Andrea Zanette
Ching-An Cheng
Alekh Agarwal
32
52
0
24 Mar 2021
Finite-Sample Analysis of Off-Policy Natural Actor-Critic Algorithm
S. Khodadadian
Zaiwei Chen
S. T. Maguluri
CML
OffRL
69
26
0
18 Feb 2021
CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee
Tengyu Xu
Yingbin Liang
Guanghui Lan
34
121
0
11 Nov 2020
Proximal Policy Optimization via Enhanced Exploration Efficiency
Junwei Zhang
Zhenghao Zhang
Shuai Han
Shuai Lu
19
41
0
11 Nov 2020
Single and Multi-Agent Deep Reinforcement Learning for AI-Enabled Wireless Networks: A Tutorial
Amal Feriani
E. Hossain
22
236
0
06 Nov 2020
Sample Efficient Reinforcement Learning with REINFORCE
Junzi Zhang
Jongho Kim
Brendan O'Donoghue
Stephen P. Boyd
35
99
0
22 Oct 2020
Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy
Zuyue Fu
Zhuoran Yang
Zhaoran Wang
15
42
0
02 Aug 2020
Variational Policy Gradient Method for Reinforcement Learning with General Utilities
Junyu Zhang
Alec Koppel
Amrit Singh Bedi
Csaba Szepesvári
Mengdi Wang
14
137
0
04 Jul 2020
Non-asymptotic Convergence Analysis of Two Time-scale (Natural) Actor-Critic Algorithms
Tengyu Xu
Zhe Wang
Yingbin Liang
15
57
0
07 May 2020
Generative Adversarial Imitation Learning with Neural Networks: Global Optimality and Convergence Rate
Yufeng Zhang
Qi Cai
Zhuoran Yang
Zhaoran Wang
108
12
0
08 Mar 2020
Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms
K. Zhang
Zhuoran Yang
Tamer Basar
36
1,178
0
24 Nov 2019
Convergent Policy Optimization for Safe Reinforcement Learning
Ming Yu
Zhuoran Yang
Mladen Kolar
Zhaoran Wang
16
91
0
26 Oct 2019
Sample Efficient Policy Gradient Methods with Recursive Variance Reduction
Pan Xu
F. Gao
Quanquan Gu
23
83
0
18 Sep 2019
On the Theory of Policy Gradient Methods: Optimality, Approximation, and Distribution Shift
Alekh Agarwal
Sham Kakade
J. Lee
G. Mahajan
11
315
0
01 Aug 2019
Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies
K. Zhang
Alec Koppel
Haoqi Zhu
Tamer Basar
28
186
0
19 Jun 2019