Global Optimality Guarantees For Policy Gradient Methods

5 June 2019

Papers citing "Global Optimality Guarantees For Policy Gradient Methods"

50 / 122 papers shown

Title | Authors | Tags | Counts | Date
Stochastic Policy Gradient Methods: Improved Sample Complexity for Fisher-non-degenerate Policies | Ilyas Fatkhullin, Anas Barakat, Anastasia Kireeva, Niao He | - | 19 37 0 | 03 Feb 2023
A Novel Framework for Policy Mirror Descent with General Parameterization and Linear Convergence | Carlo Alfano, Rui Yuan, Patrick Rebeschini | - | 57 15 0 | 30 Jan 2023
Improved Regret for Efficient Online Reinforcement Learning with Linear Function Approximation | Uri Sherman, Tomer Koren, Yishay Mansour | - | 32 12 0 | 30 Jan 2023
Stochastic Dimension-reduced Second-order Methods for Policy Optimization | Jinsong Liu, Chen Xie, Qinwen Deng, Dongdong Ge, Yi-Li Ye | - | 19 1 0 | 28 Jan 2023
Beyond Exponentially Fast Mixing in Average-Reward Reinforcement Learning via Multi-Level Monte Carlo Actor-Critic | Wesley A. Suttle, Amrit Singh Bedi, Bhrij Patel, Brian M. Sadler, Alec Koppel, Dinesh Manocha | - | 16 14 0 | 28 Jan 2023
On the Global Convergence of Risk-Averse Policy Gradient Methods with Expected Conditional Risk Measures | Xian Yu, Lei Ying | - | 19 5 0 | 26 Jan 2023
Variance Reduction for Score Functions Using Optimal Baselines | Ronan L. Keane, H. Gao | - | 16 0 0 | 27 Dec 2022
Understanding the Complexity Gains of Single-Task RL with a Curriculum | Qiyang Li, Yuexiang Zhai, Yi-An Ma, Sergey Levine | - | 32 14 0 | 24 Dec 2022
Coordinate Ascent for Off-Policy RL with Global Convergence Guarantees | Hsin-En Su, Yen-Ju Chen, Ping-Chun Hsieh, Xi Liu | OffRL | 18 0 0 | 10 Dec 2022
Design and Planning of Flexible Mobile Micro-Grids Using Deep Reinforcement Learning | Cesare Caputo, Michel-Alexandre Cardin, Pudong Ge, Fei Teng, A. Korre, Ehecatl Antonio del Rio Chanona | - | 14 18 0 | 08 Dec 2022
Global Convergence of Localized Policy Iteration in Networked Multi-Agent Reinforcement Learning | Yizhou Zhang, Guannan Qu, Pan Xu, Yiheng Lin, Zaiwei Chen, Adam Wierman | - | 29 25 0 | 30 Nov 2022
Geometry and convergence of natural policy gradient methods | Johannes Müller, Guido Montúfar | - | 14 10 0 | 03 Nov 2022
Finite-time analysis of single-timescale actor-critic | Xu-yang Chen, Lin Zhao | OffRL | 18 20 0 | 18 Oct 2022
On the convergence of policy gradient methods to Nash equilibria in general stochastic games | Angeliki Giannou, Kyriakos Lotidis, P. Mertikopoulos, Emmanouil-Vasileios Vlatakis-Gkaragkounis | - | 21 17 0 | 17 Oct 2022
Faster Last-iterate Convergence of Policy Optimization in Zero-Sum Markov Games | Shicong Cen, Yuejie Chi, S. Du, Lin Xiao | - | 51 35 0 | 03 Oct 2022
Bounded Robustness in Reinforcement Learning via Lexicographic Objectives | Daniel Jarne Ornia, Licio Romao, Lewis Hammond, M. Mazo, Alessandro Abate | - | 12 0 0 | 30 Sep 2022
On the Optimization Landscape of Dynamic Output Feedback: A Case Study for Linear Quadratic Regulator | Jingliang Duan, Wenhan Cao, Yanggu Zheng, Lin Zhao | - | 15 3 0 | 12 Sep 2022
Efficiently Computing Nash Equilibria in Adversarial Team Markov Games | Fivos Kalogiannis, Ioannis Anagnostides, Ioannis Panageas, Emmanouil-Vasileios Vlatakis-Gkaragkounis, Vaggos Chatziafratis, S. Stavroulakis | - | 31 13 0 | 03 Aug 2022
Actor-Critic based Improper Reinforcement Learning | Mohammadi Zaki, Avinash Mohan, Aditya Gopalan, Shie Mannor | - | 13 2 0 | 19 Jul 2022
Contextual Decision Trees | Tommaso Aldinucci, Enrico Civitelli, Leonardo Di Gangi, Alessandro Sestini | - | 9 3 0 | 13 Jul 2022
A Single-Loop Gradient Descent and Perturbed Ascent Algorithm for Nonconvex Functional Constrained Optimization | Songtao Lu | - | 19 18 0 | 12 Jul 2022
Policy Optimization for Markov Games: Unified Framework and Faster Convergence | Runyu Zhang, Qinghua Liu, Haiquan Wang, Caiming Xiong, Na Li, Yu Bai | - | 13 26 0 | 06 Jun 2022
Convergence and sample complexity of natural policy gradient primal-dual methods for constrained MDPs | Dongsheng Ding, K. Zhang, Jiali Duan, Tamer Başar, Mihailo R. Jovanović | - | 18 19 0 | 06 Jun 2022
Policy Gradient Method For Robust Reinforcement Learning | Yue Wang, Shaofeng Zou | - | 81 67 0 | 15 May 2022
Independent Natural Policy Gradient Methods for Potential Games: Finite-time Global Convergence with Entropy Regularization | Shicong Cen, Fan Chen, Yuejie Chi | - | 27 15 0 | 12 Apr 2022
Distributed Multi-Agent Reinforcement Learning Based on Graph-Induced Local Value Functions | Gangshan Jing, H. Bai, Jemin George, A. Chakrabortty, P. Sharma | - | 17 2 0 | 26 Feb 2022
Finite-Time Analysis of Natural Actor-Critic for POMDPs | Semih Cayci, Niao He, R. Srikant | - | 20 1 0 | 20 Feb 2022
Stochastic linear optimization never overfits with quadratically-bounded losses on general data | Matus Telgarsky | - | 9 11 0 | 14 Feb 2022
Independent Policy Gradient for Large-Scale Markov Potential Games: Sharper Rates, Function Approximation, and Game-Agnostic Convergence | Dongsheng Ding, Chen-Yu Wei, K. Zhang, M. Jovanović | - | 22 69 0 | 08 Feb 2022
Do Differentiable Simulators Give Better Policy Gradients? | H. Suh, Max Simchowitz, K. Zhang, Russ Tedrake | - | 25 94 0 | 02 Feb 2022
Optimal Estimation of Off-Policy Policy Gradient via Double Fitted Iteration | Chengzhuo Ni, Ruiqi Zhang, Xiang Ji, Xuezhou Zhang, Mengdi Wang | OffRL | 19 1 0 | 31 Jan 2022
On the Hidden Biases of Policy Mirror Ascent in Continuous Action Spaces | Amrit Singh Bedi, Souradip Chakraborty, Anjaly Parayil, Brian M. Sadler, Pratap Tokekar, Alec Koppel | - | 41 17 0 | 28 Jan 2022
STOPS: Short-Term-based Volatility-controlled Policy Search and its Global Convergence | Liang Xu, Daoming Lyu, Yangchen Pan, Aiwen Jiang, Bo Liu | - | 26 0 0 | 24 Jan 2022
Occupancy Information Ratio: Infinite-Horizon, Information-Directed, Parameterized Policy Search | Wesley A. Suttle, Alec Koppel, Ji Liu | - | 23 0 0 | 21 Jan 2022
Nearly Optimal Policy Optimization with Stable at Any Time Guarantee | Tianhao Wu, Yunchang Yang, Han Zhong, Liwei Wang, S. Du, Jiantao Jiao | - | 45 14 0 | 21 Dec 2021
Convergence and Optimality of Policy Gradient Methods in Weakly Smooth Settings | Matthew Shunshi Zhang, Murat A. Erdogdu, Animesh Garg | - | 14 5 0 | 30 Oct 2021
Neural PPO-Clip Attains Global Optimality: A Hinge Loss Perspective | Nai-Chieh Huang, Ping-Chun Hsieh, Kuo-Hao Ho, Hsuan-Yu Yao, Kai-Chun Hu, Liang-Chun Ouyang, I-Chen Wu | - | 22 1 0 | 26 Oct 2021
Faster Algorithm and Sharper Analysis for Constrained Markov Decision Process | Tianjiao Li, Ziwei Guan, Shaofeng Zou, Tengyu Xu, Yingbin Liang, Guanghui Lan | - | 18 26 0 | 20 Oct 2021
Beyond Exact Gradients: Convergence of Stochastic Soft-Max Policy Gradient Methods with Entropy Regularization | Yuhao Ding, Junzi Zhang, Hyunin Lee, Javad Lavaei | - | 30 18 0 | 19 Oct 2021
On the Global Optimum Convergence of Momentum-based Policy Gradient | Yuhao Ding, Junzi Zhang, Javad Lavaei | - | 24 16 0 | 19 Oct 2021
The Geometry of Memoryless Stochastic Policy Optimization in Infinite-Horizon POMDPs | Johannes Müller, Guido Montúfar | - | 16 8 0 | 14 Oct 2021
Online Robust Reinforcement Learning with Model Uncertainty | Yue Wang, Shaofeng Zou | OOD, OffRL | 76 96 0 | 29 Sep 2021
Theoretical Guarantees of Fictitious Discount Algorithms for Episodic Reinforcement Learning and Global Convergence of Policy Gradient Methods | Xin Guo, Anran Hu, Junzi Zhang | OffRL | 16 6 0 | 13 Sep 2021
A Boosting Approach to Reinforcement Learning | Nataly Brukhim, Elad Hazan, Karan Singh | - | 30 13 0 | 22 Aug 2021
Global Convergence of the ODE Limit for Online Actor-Critic Algorithms in Reinforcement Learning | Ziheng Wang, Justin A. Sirignano | - | 24 2 0 | 19 Aug 2021
Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences | Alan Chan, Hugo Silva, Sungsu Lim, Tadashi Kozuno, A. R. Mahmood, Martha White | - | 17 29 0 | 17 Jul 2021
Curious Explorer: a provable exploration strategy in Policy Learning | M. Miani, Maurizio Parton, M. Romito | - | 37 0 0 | 29 Jun 2021
MADE: Exploration via Maximizing Deviation from Explored Regions | Tianjun Zhang, Paria Rashidinejad, Jiantao Jiao, Yuandong Tian, Joseph E. Gonzalez, Stuart J. Russell | OffRL | 32 42 0 | 18 Jun 2021
On the Sample Complexity and Metastability of Heavy-tailed Policy Search in Continuous Control | Amrit Singh Bedi, Anjaly Parayil, Junyu Zhang, Mengdi Wang, Alec Koppel | - | 25 15 0 | 15 Jun 2021
Joint Optimization of Multi-Objective Reinforcement Learning with Policy Gradient Based Algorithm | Qinbo Bai, Mridul Agarwal, Vaneet Aggarwal | - | 8 7 0 | 28 May 2021