ResearchTrend.AI
Global Optimality Guarantees For Policy Gradient Methods

5 June 2019
Jalaj Bhandari
Daniel Russo
arXiv:1906.01786

Papers citing "Global Optimality Guarantees For Policy Gradient Methods"

50 / 122 papers shown
Stochastic Policy Gradient Methods: Improved Sample Complexity for Fisher-non-degenerate Policies
Ilyas Fatkhullin
Anas Barakat
Anastasia Kireeva
Niao He
19
37
0
03 Feb 2023
A Novel Framework for Policy Mirror Descent with General Parameterization and Linear Convergence
Carlo Alfano
Rui Yuan
Patrick Rebeschini
57
15
0
30 Jan 2023
Improved Regret for Efficient Online Reinforcement Learning with Linear Function Approximation
Uri Sherman
Tomer Koren
Yishay Mansour
32
12
0
30 Jan 2023
Stochastic Dimension-reduced Second-order Methods for Policy Optimization
Jinsong Liu
Chen Xie
Qinwen Deng
Dongdong Ge
Yi-Li Ye
19
1
0
28 Jan 2023
Beyond Exponentially Fast Mixing in Average-Reward Reinforcement Learning via Multi-Level Monte Carlo Actor-Critic
Wesley A. Suttle
Amrit Singh Bedi
Bhrij Patel
Brian M. Sadler
Alec Koppel
Dinesh Manocha
16
14
0
28 Jan 2023
On the Global Convergence of Risk-Averse Policy Gradient Methods with Expected Conditional Risk Measures
Xian Yu
Lei Ying
19
5
0
26 Jan 2023
Variance Reduction for Score Functions Using Optimal Baselines
Ronan L. Keane
H. Gao
16
0
0
27 Dec 2022
Understanding the Complexity Gains of Single-Task RL with a Curriculum
Qiyang Li
Yuexiang Zhai
Yi-An Ma
Sergey Levine
32
14
0
24 Dec 2022
Coordinate Ascent for Off-Policy RL with Global Convergence Guarantees
Hsin-En Su
Yen-Ju Chen
Ping-Chun Hsieh
Xi Liu
OffRL
18
0
0
10 Dec 2022
Design and Planning of Flexible Mobile Micro-Grids Using Deep Reinforcement Learning
Cesare Caputo
Michel-Alexandre Cardin
Pudong Ge
Fei Teng
A. Korre
Ehecatl Antonio del Rio Chanona
14
18
0
08 Dec 2022
Global Convergence of Localized Policy Iteration in Networked Multi-Agent Reinforcement Learning
Yizhou Zhang
Guannan Qu
Pan Xu
Yiheng Lin
Zaiwei Chen
Adam Wierman
29
25
0
30 Nov 2022
Geometry and convergence of natural policy gradient methods
Johannes Müller
Guido Montúfar
14
10
0
03 Nov 2022
Finite-time analysis of single-timescale actor-critic
Xu-yang Chen
Lin Zhao
OffRL
18
20
0
18 Oct 2022
On the convergence of policy gradient methods to Nash equilibria in general stochastic games
Angeliki Giannou
Kyriakos Lotidis
P. Mertikopoulos
Emmanouil-Vasileios Vlatakis-Gkaragkounis
21
17
0
17 Oct 2022
Faster Last-iterate Convergence of Policy Optimization in Zero-Sum Markov Games
Shicong Cen
Yuejie Chi
S. Du
Lin Xiao
51
35
0
03 Oct 2022
Bounded Robustness in Reinforcement Learning via Lexicographic Objectives
Daniel Jarne Ornia
Licio Romao
Lewis Hammond
M. Mazo
Alessandro Abate
12
0
0
30 Sep 2022
On the Optimization Landscape of Dynamic Output Feedback: A Case Study for Linear Quadratic Regulator
Jingliang Duan
Wenhan Cao
Yanggu Zheng
Lin Zhao
15
3
0
12 Sep 2022
Efficiently Computing Nash Equilibria in Adversarial Team Markov Games
Fivos Kalogiannis
Ioannis Anagnostides
Ioannis Panageas
Emmanouil-Vasileios Vlatakis-Gkaragkounis
Vaggos Chatziafratis
S. Stavroulakis
31
13
0
03 Aug 2022
Actor-Critic based Improper Reinforcement Learning
Mohammadi Zaki
Avinash Mohan
Aditya Gopalan
Shie Mannor
13
2
0
19 Jul 2022
Contextual Decision Trees
Tommaso Aldinucci
Enrico Civitelli
Leonardo Di Gangi
Alessandro Sestini
9
3
0
13 Jul 2022
A Single-Loop Gradient Descent and Perturbed Ascent Algorithm for Nonconvex Functional Constrained Optimization
Songtao Lu
19
18
0
12 Jul 2022
Policy Optimization for Markov Games: Unified Framework and Faster Convergence
Runyu Zhang
Qinghua Liu
Haiquan Wang
Caiming Xiong
Na Li
Yu Bai
13
26
0
06 Jun 2022
Convergence and sample complexity of natural policy gradient primal-dual methods for constrained MDPs
Dongsheng Ding
K. Zhang
Jiali Duan
Tamer Başar
Mihailo R. Jovanović
18
19
0
06 Jun 2022
Policy Gradient Method For Robust Reinforcement Learning
Yue Wang
Shaofeng Zou
81
67
0
15 May 2022
Independent Natural Policy Gradient Methods for Potential Games: Finite-time Global Convergence with Entropy Regularization
Shicong Cen
Fan Chen
Yuejie Chi
27
15
0
12 Apr 2022
Distributed Multi-Agent Reinforcement Learning Based on Graph-Induced Local Value Functions
Gangshan Jing
H. Bai
Jemin George
A. Chakrabortty
P. Sharma
17
2
0
26 Feb 2022
Finite-Time Analysis of Natural Actor-Critic for POMDPs
Semih Cayci
Niao He
R. Srikant
20
1
0
20 Feb 2022
Stochastic linear optimization never overfits with quadratically-bounded losses on general data
Matus Telgarsky
9
11
0
14 Feb 2022
Independent Policy Gradient for Large-Scale Markov Potential Games: Sharper Rates, Function Approximation, and Game-Agnostic Convergence
Dongsheng Ding
Chen-Yu Wei
K. Zhang
M. Jovanović
22
69
0
08 Feb 2022
Do Differentiable Simulators Give Better Policy Gradients?
H. Suh
Max Simchowitz
K. Zhang
Russ Tedrake
25
94
0
02 Feb 2022
Optimal Estimation of Off-Policy Policy Gradient via Double Fitted Iteration
Chengzhuo Ni
Ruiqi Zhang
Xiang Ji
Xuezhou Zhang
Mengdi Wang
OffRL
19
1
0
31 Jan 2022
On the Hidden Biases of Policy Mirror Ascent in Continuous Action Spaces
Amrit Singh Bedi
Souradip Chakraborty
Anjaly Parayil
Brian M. Sadler
Pratap Tokekar
Alec Koppel
41
17
0
28 Jan 2022
STOPS: Short-Term-based Volatility-controlled Policy Search and its Global Convergence
Liang Xu
Daoming Lyu
Yangchen Pan
Aiwen Jiang
Bo Liu
26
0
0
24 Jan 2022
Occupancy Information Ratio: Infinite-Horizon, Information-Directed, Parameterized Policy Search
Wesley A. Suttle
Alec Koppel
Ji Liu
23
0
0
21 Jan 2022
Nearly Optimal Policy Optimization with Stable at Any Time Guarantee
Tianhao Wu
Yunchang Yang
Han Zhong
Liwei Wang
S. Du
Jiantao Jiao
45
14
0
21 Dec 2021
Convergence and Optimality of Policy Gradient Methods in Weakly Smooth Settings
Matthew Shunshi Zhang
Murat A. Erdogdu
Animesh Garg
14
5
0
30 Oct 2021
Neural PPO-Clip Attains Global Optimality: A Hinge Loss Perspective
Nai-Chieh Huang
Ping-Chun Hsieh
Kuo-Hao Ho
Hsuan-Yu Yao
Kai-Chun Hu
Liang-Chun Ouyang
I-Chen Wu
22
1
0
26 Oct 2021
Faster Algorithm and Sharper Analysis for Constrained Markov Decision Process
Tianjiao Li
Ziwei Guan
Shaofeng Zou
Tengyu Xu
Yingbin Liang
Guanghui Lan
18
26
0
20 Oct 2021
Beyond Exact Gradients: Convergence of Stochastic Soft-Max Policy Gradient Methods with Entropy Regularization
Yuhao Ding
Junzi Zhang
Hyunin Lee
Javad Lavaei
30
18
0
19 Oct 2021
On the Global Optimum Convergence of Momentum-based Policy Gradient
Yuhao Ding
Junzi Zhang
Javad Lavaei
24
16
0
19 Oct 2021
The Geometry of Memoryless Stochastic Policy Optimization in Infinite-Horizon POMDPs
Johannes Müller
Guido Montúfar
16
8
0
14 Oct 2021
Online Robust Reinforcement Learning with Model Uncertainty
Yue Wang
Shaofeng Zou
OOD
OffRL
76
96
0
29 Sep 2021
Theoretical Guarantees of Fictitious Discount Algorithms for Episodic Reinforcement Learning and Global Convergence of Policy Gradient Methods
Xin Guo
Anran Hu
Junzi Zhang
OffRL
16
6
0
13 Sep 2021
A Boosting Approach to Reinforcement Learning
Nataly Brukhim
Elad Hazan
Karan Singh
30
13
0
22 Aug 2021
Global Convergence of the ODE Limit for Online Actor-Critic Algorithms in Reinforcement Learning
Ziheng Wang
Justin A. Sirignano
24
2
0
19 Aug 2021
Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences
Alan Chan
Hugo Silva
Sungsu Lim
Tadashi Kozuno
A. R. Mahmood
Martha White
17
29
0
17 Jul 2021
Curious Explorer: a provable exploration strategy in Policy Learning
M. Miani
Maurizio Parton
M. Romito
37
0
0
29 Jun 2021
MADE: Exploration via Maximizing Deviation from Explored Regions
Tianjun Zhang
Paria Rashidinejad
Jiantao Jiao
Yuandong Tian
Joseph E. Gonzalez
Stuart J. Russell
OffRL
32
42
0
18 Jun 2021
On the Sample Complexity and Metastability of Heavy-tailed Policy Search in Continuous Control
Amrit Singh Bedi
Anjaly Parayil
Junyu Zhang
Mengdi Wang
Alec Koppel
25
15
0
15 Jun 2021
Joint Optimization of Multi-Objective Reinforcement Learning with Policy Gradient Based Algorithm
Qinbo Bai
Mridul Agarwal
Vaneet Aggarwal
8
7
0
28 May 2021