1906.01786
Global Optimality Guarantees For Policy Gradient Methods
5 June 2019
Jalaj Bhandari
Daniel Russo
Papers citing
"Global Optimality Guarantees For Policy Gradient Methods"
50 / 122 papers shown
Title
Stochastic Policy Gradient Methods: Improved Sample Complexity for Fisher-non-degenerate Policies
Ilyas Fatkhullin
Anas Barakat
Anastasia Kireeva
Niao He
19
37
0
03 Feb 2023
A Novel Framework for Policy Mirror Descent with General Parameterization and Linear Convergence
Carlo Alfano
Rui Yuan
Patrick Rebeschini
57
15
0
30 Jan 2023
Improved Regret for Efficient Online Reinforcement Learning with Linear Function Approximation
Uri Sherman
Tomer Koren
Yishay Mansour
32
12
0
30 Jan 2023
Stochastic Dimension-reduced Second-order Methods for Policy Optimization
Jinsong Liu
Chen Xie
Qinwen Deng
Dongdong Ge
Yi-Li Ye
19
1
0
28 Jan 2023
Beyond Exponentially Fast Mixing in Average-Reward Reinforcement Learning via Multi-Level Monte Carlo Actor-Critic
Wesley A. Suttle
Amrit Singh Bedi
Bhrij Patel
Brian M. Sadler
Alec Koppel
Dinesh Manocha
16
14
0
28 Jan 2023
On the Global Convergence of Risk-Averse Policy Gradient Methods with Expected Conditional Risk Measures
Xian Yu
Lei Ying
19
5
0
26 Jan 2023
Variance Reduction for Score Functions Using Optimal Baselines
Ronan L. Keane
H. Gao
16
0
0
27 Dec 2022
Understanding the Complexity Gains of Single-Task RL with a Curriculum
Qiyang Li
Yuexiang Zhai
Yi-An Ma
Sergey Levine
32
14
0
24 Dec 2022
Coordinate Ascent for Off-Policy RL with Global Convergence Guarantees
Hsin-En Su
Yen-Ju Chen
Ping-Chun Hsieh
Xi Liu
OffRL
18
0
0
10 Dec 2022
Design and Planning of Flexible Mobile Micro-Grids Using Deep Reinforcement Learning
Cesare Caputo
Michel-Alexandre Cardin
Pudong Ge
Fei Teng
A. Korre
Ehecatl Antonio del Rio Chanona
14
18
0
08 Dec 2022
Global Convergence of Localized Policy Iteration in Networked Multi-Agent Reinforcement Learning
Yizhou Zhang
Guannan Qu
Pan Xu
Yiheng Lin
Zaiwei Chen
Adam Wierman
29
25
0
30 Nov 2022
Geometry and convergence of natural policy gradient methods
Johannes Müller
Guido Montúfar
14
10
0
03 Nov 2022
Finite-time analysis of single-timescale actor-critic
Xu-yang Chen
Lin Zhao
OffRL
18
20
0
18 Oct 2022
On the convergence of policy gradient methods to Nash equilibria in general stochastic games
Angeliki Giannou
Kyriakos Lotidis
P. Mertikopoulos
Emmanouil-Vasileios Vlatakis-Gkaragkounis
21
17
0
17 Oct 2022
Faster Last-iterate Convergence of Policy Optimization in Zero-Sum Markov Games
Shicong Cen
Yuejie Chi
S. Du
Lin Xiao
51
35
0
03 Oct 2022
Bounded Robustness in Reinforcement Learning via Lexicographic Objectives
Daniel Jarne Ornia
Licio Romao
Lewis Hammond
M. Mazo
Alessandro Abate
12
0
0
30 Sep 2022
On the Optimization Landscape of Dynamic Output Feedback: A Case Study for Linear Quadratic Regulator
Jingliang Duan
Wenhan Cao
Yanggu Zheng
Lin Zhao
15
3
0
12 Sep 2022
Efficiently Computing Nash Equilibria in Adversarial Team Markov Games
Fivos Kalogiannis
Ioannis Anagnostides
Ioannis Panageas
Emmanouil-Vasileios Vlatakis-Gkaragkounis
Vaggos Chatziafratis
S. Stavroulakis
31
13
0
03 Aug 2022
Actor-Critic based Improper Reinforcement Learning
Mohammadi Zaki
Avinash Mohan
Aditya Gopalan
Shie Mannor
13
2
0
19 Jul 2022
Contextual Decision Trees
Tommaso Aldinucci
Enrico Civitelli
Leonardo Di Gangi
Alessandro Sestini
9
3
0
13 Jul 2022
A Single-Loop Gradient Descent and Perturbed Ascent Algorithm for Nonconvex Functional Constrained Optimization
Songtao Lu
19
18
0
12 Jul 2022
Policy Optimization for Markov Games: Unified Framework and Faster Convergence
Runyu Zhang
Qinghua Liu
Haiquan Wang
Caiming Xiong
Na Li
Yu Bai
13
26
0
06 Jun 2022
Convergence and sample complexity of natural policy gradient primal-dual methods for constrained MDPs
Dongsheng Ding
K. Zhang
Jiali Duan
Tamer Başar
Mihailo R. Jovanović
18
19
0
06 Jun 2022
Policy Gradient Method For Robust Reinforcement Learning
Yue Wang
Shaofeng Zou
81
67
0
15 May 2022
Independent Natural Policy Gradient Methods for Potential Games: Finite-time Global Convergence with Entropy Regularization
Shicong Cen
Fan Chen
Yuejie Chi
27
15
0
12 Apr 2022
Distributed Multi-Agent Reinforcement Learning Based on Graph-Induced Local Value Functions
Gangshan Jing
H. Bai
Jemin George
A. Chakrabortty
P. Sharma
17
2
0
26 Feb 2022
Finite-Time Analysis of Natural Actor-Critic for POMDPs
Semih Cayci
Niao He
R. Srikant
20
1
0
20 Feb 2022
Stochastic linear optimization never overfits with quadratically-bounded losses on general data
Matus Telgarsky
9
11
0
14 Feb 2022
Independent Policy Gradient for Large-Scale Markov Potential Games: Sharper Rates, Function Approximation, and Game-Agnostic Convergence
Dongsheng Ding
Chen-Yu Wei
K. Zhang
M. Jovanović
22
69
0
08 Feb 2022
Do Differentiable Simulators Give Better Policy Gradients?
H. Suh
Max Simchowitz
K. Zhang
Russ Tedrake
25
94
0
02 Feb 2022
Optimal Estimation of Off-Policy Policy Gradient via Double Fitted Iteration
Chengzhuo Ni
Ruiqi Zhang
Xiang Ji
Xuezhou Zhang
Mengdi Wang
OffRL
19
1
0
31 Jan 2022
On the Hidden Biases of Policy Mirror Ascent in Continuous Action Spaces
Amrit Singh Bedi
Souradip Chakraborty
Anjaly Parayil
Brian M. Sadler
Pratap Tokekar
Alec Koppel
41
17
0
28 Jan 2022
STOPS: Short-Term-based Volatility-controlled Policy Search and its Global Convergence
Liang Xu
Daoming Lyu
Yangchen Pan
Aiwen Jiang
Bo Liu
26
0
0
24 Jan 2022
Occupancy Information Ratio: Infinite-Horizon, Information-Directed, Parameterized Policy Search
Wesley A. Suttle
Alec Koppel
Ji Liu
23
0
0
21 Jan 2022
Nearly Optimal Policy Optimization with Stable at Any Time Guarantee
Tianhao Wu
Yunchang Yang
Han Zhong
Liwei Wang
S. Du
Jiantao Jiao
45
14
0
21 Dec 2021
Convergence and Optimality of Policy Gradient Methods in Weakly Smooth Settings
Matthew Shunshi Zhang
Murat A. Erdogdu
Animesh Garg
14
5
0
30 Oct 2021
Neural PPO-Clip Attains Global Optimality: A Hinge Loss Perspective
Nai-Chieh Huang
Ping-Chun Hsieh
Kuo-Hao Ho
Hsuan-Yu Yao
Kai-Chun Hu
Liang-Chun Ouyang
I-Chen Wu
22
1
0
26 Oct 2021
Faster Algorithm and Sharper Analysis for Constrained Markov Decision Process
Tianjiao Li
Ziwei Guan
Shaofeng Zou
Tengyu Xu
Yingbin Liang
Guanghui Lan
18
26
0
20 Oct 2021
Beyond Exact Gradients: Convergence of Stochastic Soft-Max Policy Gradient Methods with Entropy Regularization
Yuhao Ding
Junzi Zhang
Hyunin Lee
Javad Lavaei
30
18
0
19 Oct 2021
On the Global Optimum Convergence of Momentum-based Policy Gradient
Yuhao Ding
Junzi Zhang
Javad Lavaei
24
16
0
19 Oct 2021
The Geometry of Memoryless Stochastic Policy Optimization in Infinite-Horizon POMDPs
Johannes Müller
Guido Montúfar
16
8
0
14 Oct 2021
Online Robust Reinforcement Learning with Model Uncertainty
Yue Wang
Shaofeng Zou
OOD
OffRL
76
96
0
29 Sep 2021
Theoretical Guarantees of Fictitious Discount Algorithms for Episodic Reinforcement Learning and Global Convergence of Policy Gradient Methods
Xin Guo
Anran Hu
Junzi Zhang
OffRL
16
6
0
13 Sep 2021
A Boosting Approach to Reinforcement Learning
Nataly Brukhim
Elad Hazan
Karan Singh
30
13
0
22 Aug 2021
Global Convergence of the ODE Limit for Online Actor-Critic Algorithms in Reinforcement Learning
Ziheng Wang
Justin A. Sirignano
24
2
0
19 Aug 2021
Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences
Alan Chan
Hugo Silva
Sungsu Lim
Tadashi Kozuno
A. R. Mahmood
Martha White
17
29
0
17 Jul 2021
Curious Explorer: a provable exploration strategy in Policy Learning
M. Miani
Maurizio Parton
M. Romito
37
0
0
29 Jun 2021
MADE: Exploration via Maximizing Deviation from Explored Regions
Tianjun Zhang
Paria Rashidinejad
Jiantao Jiao
Yuandong Tian
Joseph E. Gonzalez
Stuart J. Russell
OffRL
32
42
0
18 Jun 2021
On the Sample Complexity and Metastability of Heavy-tailed Policy Search in Continuous Control
Amrit Singh Bedi
Anjaly Parayil
Junyu Zhang
Mengdi Wang
Alec Koppel
25
15
0
15 Jun 2021
Joint Optimization of Multi-Objective Reinforcement Learning with Policy Gradient Based Algorithm
Qinbo Bai
Mridul Agarwal
Vaneet Aggarwal
8
7
0
28 May 2021