An Improved Analysis of (Variance-Reduced) Policy Gradient and Natural Policy Gradient Methods

15 November 2022
Yanli Liu
K. Zhang
Tamer Basar
W. Yin

Papers citing "An Improved Analysis of (Variance-Reduced) Policy Gradient and Natural Policy Gradient Methods"

23 / 73 papers shown
Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch
Shangtong Zhang
Rémi Tachet des Combes
Romain Laroche
17
10
0
04 Nov 2021
Convergence and Optimality of Policy Gradient Methods in Weakly Smooth Settings
Matthew Shunshi Zhang
Murat A. Erdogdu
Animesh Garg
6
5
0
30 Oct 2021
Neural PPO-Clip Attains Global Optimality: A Hinge Loss Perspective
Nai-Chieh Huang
Ping-Chun Hsieh
Kuo-Hao Ho
Hsuan-Yu Yao
Kai-Chun Hu
Liang-Chun Ouyang
I-Chen Wu
20
1
0
26 Oct 2021
Actor-critic is implicitly biased towards high entropy optimal policies
Yuzheng Hu
Ziwei Ji
Matus Telgarsky
52
11
0
21 Oct 2021
Beyond Exact Gradients: Convergence of Stochastic Soft-Max Policy Gradient Methods with Entropy Regularization
Yuhao Ding
Junzi Zhang
Hyunin Lee
Javad Lavaei
16
18
0
19 Oct 2021
On the Global Optimum Convergence of Momentum-based Policy Gradient
Yuhao Ding
Junzi Zhang
Javad Lavaei
19
16
0
19 Oct 2021
Theoretical Guarantees of Fictitious Discount Algorithms for Episodic Reinforcement Learning and Global Convergence of Policy Gradient Methods
Xin Guo
Anran Hu
Junzi Zhang
OffRL
14
6
0
13 Sep 2021
On the Approximation of Cooperative Heterogeneous Multi-Agent Reinforcement Learning (MARL) using Mean Field Control (MFC)
Washim Uddin Mondal
Mridul Agarwal
Vaneet Aggarwal
S. Ukkusuri
33
43
0
09 Sep 2021
Sample and Communication-Efficient Decentralized Actor-Critic Algorithms with Finite-Time Analysis
Ziyi Chen
Yi Zhou
Rongrong Chen
Shaofeng Zou
13
24
0
08 Sep 2021
A general sample complexity analysis of vanilla policy gradient
Rui Yuan
Robert Mansel Gower
A. Lazaric
64
62
0
23 Jul 2021
Bregman Gradient Policy Optimization
Feihu Huang
Shangqian Gao
Heng-Chiao Huang
9
16
0
23 Jun 2021
Agnostic Reinforcement Learning with Low-Rank MDPs and Rich Observations
Christoph Dann
Yishay Mansour
M. Mohri
Ayush Sekhari
Karthik Sridharan
OffRL
11
11
0
22 Jun 2021
On the Sample Complexity and Metastability of Heavy-tailed Policy Search in Continuous Control
Amrit Singh Bedi
Anjaly Parayil
Junyu Zhang
Mengdi Wang
Alec Koppel
14
15
0
15 Jun 2021
Joint Optimization of Multi-Objective Reinforcement Learning with Policy Gradient Based Algorithm
Qinbo Bai
Mridul Agarwal
Vaneet Aggarwal
8
7
0
28 May 2021
A nearly Blackwell-optimal policy gradient method
Vektor Dewanto
M. Gallagher
OffRL
8
0
0
28 May 2021
Finite-Sample Analysis of Off-Policy Natural Actor-Critic with Linear Function Approximation
Zaiwei Chen
S. Khodadadian
S. T. Maguluri
OffRL
43
29
0
26 May 2021
Policy Mirror Descent for Regularized Reinforcement Learning: A Generalized Framework with Linear Convergence
Wenhao Zhan
Shicong Cen
Baihe Huang
Yuxin Chen
Jason D. Lee
Yuejie Chi
9
76
0
24 May 2021
On the Linear convergence of Natural Policy Gradient Algorithm
S. Khodadadian
P. Jhunjhunwala
Sushil Mahavir Varma
S. T. Maguluri
22
56
0
04 May 2021
Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality
Tengyu Xu
Zhuoran Yang
Zhaoran Wang
Yingbin Liang
OffRL
24
24
0
23 Feb 2021
Softmax Policy Gradient Methods Can Take Exponential Time to Converge
Gen Li
Yuting Wei
Yuejie Chi
Yuxin Chen
13
50
0
22 Feb 2021
Finite Sample Analysis of Two-Time-Scale Natural Actor-Critic Algorithm
S. Khodadadian
Thinh T. Doan
J. Romberg
S. T. Maguluri
17
42
0
26 Jan 2021
Smoothed functional-based gradient algorithms for off-policy reinforcement learning: A non-asymptotic viewpoint
Nithia Vijayan
Prashanth L. A.
OffRL
11
6
0
06 Jan 2021
A Finite Time Analysis of Two Time-Scale Actor Critic Methods
Yue Wu
Weitong Zhang
Pan Xu
Quanquan Gu
88
145
0
04 May 2020