arXiv:1906.08383
Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies
19 June 2019
K. Zhang
Alec Koppel
Hao Zhu
Tamer Başar
Papers citing
"Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies"
50 / 111 papers shown
Title
Finite-Time Analysis of Fully Decentralized Single-Timescale Actor-Critic
Qijun Luo
Xiao Li
19
1
0
12 Jun 2022
Dealing with Sparse Rewards in Continuous Control Robotics via Heavy-Tailed Policies
Souradip Chakraborty
Amrit Singh Bedi
Alec Koppel
Pratap Tokekar
Dinesh Manocha
10
8
0
12 Jun 2022
Convergence and sample complexity of natural policy gradient primal-dual methods for constrained MDPs
Dongsheng Ding
K. Zhang
Jiali Duan
Tamer Başar
Mihailo R. Jovanović
18
19
0
06 Jun 2022
Stochastic Second-Order Methods Improve Best-Known Sample Complexity of SGD for Gradient-Dominated Function
Saeed Masiha
Saber Salehkaleybar
Niao He
Negar Kiyavash
Patrick Thiran
79
18
0
25 May 2022
A Small Gain Analysis of Single Timescale Actor Critic
Alexander Olshevsky
Bahman Gharesifard
17
20
0
04 Mar 2022
A policy gradient approach for optimization of smooth risk measures
Nithia Vijayan
Prashanth L.A.
OffRL
11
4
0
22 Feb 2022
Beyond the Policy Gradient Theorem for Efficient Policy Updates in Actor-Critic Algorithms
Romain Laroche
Rémi Tachet des Combes
30
2
0
15 Feb 2022
Do Differentiable Simulators Give Better Policy Gradients?
H. Suh
Max Simchowitz
K. Zhang
Russ Tedrake
25
94
0
02 Feb 2022
On the Hidden Biases of Policy Mirror Ascent in Continuous Action Spaces
Amrit Singh Bedi
Souradip Chakraborty
Anjaly Parayil
Brian M. Sadler
Pratap Tokekar
Alec Koppel
41
17
0
28 Jan 2022
Occupancy Information Ratio: Infinite-Horizon, Information-Directed, Parameterized Policy Search
Wesley A. Suttle
Alec Koppel
Ji Liu
17
0
0
21 Jan 2022
Recent Advances in Reinforcement Learning in Finance
B. Hambly
Renyuan Xu
Huining Yang
OffRL
27
165
0
08 Dec 2021
Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch
Shangtong Zhang
Rémi Tachet des Combes
Romain Laroche
22
10
0
04 Nov 2021
Convergence and Optimality of Policy Gradient Methods in Weakly Smooth Settings
Matthew Shunshi Zhang
Murat A. Erdogdu
Animesh Garg
14
5
0
30 Oct 2021
Understanding the Effect of Stochasticity in Policy Optimization
Jincheng Mei
Bo Dai
Chenjun Xiao
Csaba Szepesvári
Dale Schuurmans
11
17
0
29 Oct 2021
Convergence Rates of Average-Reward Multi-agent Reinforcement Learning via Randomized Linear Programming
Alec Koppel
Amrit Singh Bedi
Bhargav Ganguly
Vaneet Aggarwal
22
4
0
22 Oct 2021
Beyond Exact Gradients: Convergence of Stochastic Soft-Max Policy Gradient Methods with Entropy Regularization
Yuhao Ding
Junzi Zhang
Hyunin Lee
Javad Lavaei
30
18
0
19 Oct 2021
On the Global Optimum Convergence of Momentum-based Policy Gradient
Yuhao Ding
Junzi Zhang
Javad Lavaei
24
16
0
19 Oct 2021
Variance Reduction based Experience Replay for Policy Optimization
Hua Zheng
Wei Xie
M. Feng
OffRL
26
2
0
17 Oct 2021
Learning to Coordinate in Multi-Agent Systems: A Coordinated Actor-Critic Algorithm and Finite-Time Guarantees
Siliang Zeng
Tianyi Chen
Alfredo García
Mingyi Hong
15
11
0
11 Oct 2021
Convergence of a Human-in-the-Loop Policy-Gradient Algorithm With Eligibility Trace Under Reward, Policy, and Advantage Feedback
Ishaan Shah
D. Halpern
Kavosh Asadi
Michael L. Littman
15
0
0
15 Sep 2021
Theoretical Guarantees of Fictitious Discount Algorithms for Episodic Reinforcement Learning and Global Convergence of Policy Gradient Methods
Xin Guo
Anran Hu
Junzi Zhang
OffRL
16
6
0
13 Sep 2021
Mean-Field Multi-Agent Reinforcement Learning: A Decentralized Network Approach
Haotian Gu
Xin Guo
Xiaoli Wei
Renyuan Xu
OOD
27
36
0
05 Aug 2021
A general sample complexity analysis of vanilla policy gradient
Rui Yuan
Robert Mansel Gower
A. Lazaric
69
62
0
23 Jul 2021
Policy Gradient Methods for Distortion Risk Measures
Nithia Vijayan
Prashanth L.A.
13
5
0
09 Jul 2021
Bregman Gradient Policy Optimization
Feihu Huang
Shangqian Gao
Heng Huang
17
16
0
23 Jun 2021
On the Sample Complexity and Metastability of Heavy-tailed Policy Search in Continuous Control
Amrit Singh Bedi
Anjaly Parayil
Junyu Zhang
Mengdi Wang
Alec Koppel
25
15
0
15 Jun 2021
Analysis of a Target-Based Actor-Critic Algorithm with Linear Function Approximation
Anas Barakat
Pascal Bianchi
Julien Lehmann
16
9
0
14 Jun 2021
Joint Optimization of Multi-Objective Reinforcement Learning with Policy Gradient Based Algorithm
Qinbo Bai
Mridul Agarwal
Vaneet Aggarwal
8
7
0
28 May 2021
A nearly Blackwell-optimal policy gradient method
Vektor Dewanto
M. Gallagher
OffRL
16
0
0
28 May 2021
Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality
Tengyu Xu
Zhuoran Yang
Zhaoran Wang
Yingbin Liang
OffRL
39
24
0
23 Feb 2021
Softmax Policy Gradient Methods Can Take Exponential Time to Converge
Gen Li
Yuting Wei
Yuejie Chi
Yuxin Chen
21
50
0
22 Feb 2021
Provable Super-Convergence with a Large Cyclical Learning Rate
Samet Oymak
28
12
0
22 Feb 2021
Provably Efficient Policy Optimization for Two-Player Zero-Sum Markov Games
Yulai Zhao
Yuandong Tian
Jason D. Lee
S. Du
OffRL
41
18
0
17 Feb 2021
On the Convergence and Sample Efficiency of Variance-Reduced Policy Gradient Method
Junyu Zhang
Chengzhuo Ni
Zheng Yu
Csaba Szepesvári
Mengdi Wang
44
67
0
17 Feb 2021
Smoothed functional-based gradient algorithms for off-policy reinforcement learning: A non-asymptotic viewpoint
Nithia Vijayan
Prashanth L.A.
OffRL
19
6
0
06 Jan 2021
Derivative-Free Policy Optimization for Linear Risk-Sensitive and Robust Control Design: Implicit Regularization and Sample Complexity
K. Zhang
Xiangyuan Zhang
Bin Hu
Tamer Başar
16
19
0
04 Jan 2021
Towards Understanding Asynchronous Advantage Actor-critic: Convergence and Linear Speedup
Han Shen
K. Zhang
Mingyi Hong
Tianyi Chen
19
28
0
31 Dec 2020
Model Free Reinforcement Learning Algorithm for Stationary Mean field Equilibrium for Multiple Types of Agents
A. Ghosh
Vaneet Aggarwal
21
7
0
31 Dec 2020
Learning Fair Policies in Decentralized Cooperative Multi-Agent Reinforcement Learning
Matthieu Zimmer
Claire Glanois
Umer Siddique
Paul Weng
OffRL
10
58
0
17 Dec 2020
Sample Complexity of Policy Gradient Finding Second-Order Stationary Points
Long Yang
Qian Zheng
Gang Pan
15
21
0
02 Dec 2020
CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee
Tengyu Xu
Yingbin Liang
Guanghui Lan
34
121
0
11 Nov 2020
Single and Multi-Agent Deep Reinforcement Learning for AI-Enabled Wireless Networks: A Tutorial
Amal Feriani
E. Hossain
22
236
0
06 Nov 2020
A Study of Policy Gradient on a Class of Exactly Solvable Models
Gavin McCracken
Colin Daniels
Rosie Zhao
Anna M. Brandenberger
Prakash Panangaden
Doina Precup
7
0
0
03 Nov 2020
Sample Efficient Reinforcement Learning with REINFORCE
Junzi Zhang
Jongho Kim
Brendan O'Donoghue
Stephen P. Boyd
35
99
0
22 Oct 2020
Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy
Zuyue Fu
Zhuoran Yang
Zhaoran Wang
13
42
0
02 Aug 2020
Variational Policy Gradient Method for Reinforcement Learning with General Utilities
Junyu Zhang
Alec Koppel
Amrit Singh Bedi
Csaba Szepesvári
Mengdi Wang
6
137
0
04 Jul 2020
When Will Generative Adversarial Imitation Learning Algorithms Attain Global Convergence
Ziwei Guan
Tengyu Xu
Yingbin Liang
19
16
0
24 Jun 2020
Zeroth-order Deterministic Policy Gradient
Harshat Kumar
Dionysios S. Kalogerias
George J. Pappas
Alejandro Ribeiro
OffRL
9
14
0
12 Jun 2020
Non-asymptotic Convergence Analysis of Two Time-scale (Natural) Actor-Critic Algorithms
Tengyu Xu
Zhe Wang
Yingbin Liang
10
57
0
07 May 2020
A Finite Time Analysis of Two Time-Scale Actor Critic Methods
Yue Wu
Weitong Zhang
Pan Xu
Quanquan Gu
90
145
0
04 May 2020