ResearchTrend.AI

Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies

19 June 2019
K. Zhang
Alec Koppel
Haoqi Zhu
Tamer Basar

Papers citing "Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies"

50 / 111 papers shown
Analysis of On-policy Policy Gradient Methods under the Distribution Mismatch
Weizhen Wang
Jianping He
Xiaoming Duan
32
0
0
28 Mar 2025
The Lagrangian Method for Solving Constrained Markov Games
Soham Das
Santiago Paternain
Luiz F. O. Chamon
Ceyhun Eksin
45
0
0
13 Mar 2025
Small steps no more: Global convergence of stochastic gradient bandits for arbitrary learning rates
Jincheng Mei
Bo Dai
Alekh Agarwal
Sharan Vaswani
Anant Raj
Csaba Szepesvári
Dale Schuurmans
87
0
0
11 Feb 2025
Spatio-Temporal SIR Model of Pandemic Spread During Warfare with Optimal Dual-use Healthcare System Administration using Deep Reinforcement Learning
Adi Shuchami
Teddy Lazebnik
69
0
0
18 Dec 2024
Structure Matters: Dynamic Policy Gradient
Sara Klein
Xiangyuan Zhang
Tamer Basar
Simon Weissmann
Leif Döring
35
0
0
07 Nov 2024
Improved Sample Complexity for Global Convergence of Actor-Critic Algorithms
Navdeep Kumar
Priyank Agrawal
Giorgia Ramponi
Kfir Y. Levy
Shie Mannor
33
0
0
11 Oct 2024
Heavy-Ball Momentum Accelerated Actor-Critic With Function Approximation
Yanjie Dong
Haijun Zhang
Gang Wang
Shisheng Cui
Xiping Hu
33
1
0
13 Aug 2024
Enabling High Data Throughput Reinforcement Learning on GPUs: A Domain Agnostic Framework for Data-Driven Scientific Research
Tian Lan
Huan Wang
Caiming Xiong
Silvio Savarese
AI4CE
19
0
0
01 Aug 2024
Bilevel reinforcement learning via the development of hyper-gradient without lower-level convexity
Yan Yang
Bin Gao
Ya-xiang Yuan
36
2
0
30 May 2024
Safe and Balanced: A Framework for Constrained Multi-Objective Reinforcement Learning
Shangding Gu
Bilgehan Sel
Yuhao Ding
Lu Wang
Qingwei Lin
Alois Knoll
Ming Jin
40
1
0
26 May 2024
Almost sure convergence rates of stochastic gradient methods under gradient domination
Simon Weissmann
Sara Klein
Waïss Azizian
Leif Döring
34
3
0
22 May 2024
Asynchronous Federated Reinforcement Learning with Policy Gradient Updates: Algorithm Design and Convergence Analysis
Guangchen Lan
Dong-Jun Han
Abolfazl Hashemi
Vaneet Aggarwal
Christopher G. Brinton
122
15
0
09 Apr 2024
Towards Global Optimality for Practical Average Reward Reinforcement Learning without Mixing Time Oracles
Bhrij Patel
Wesley A. Suttle
Alec Koppel
Vaneet Aggarwal
Brian M. Sadler
Amrit Singh Bedi
Dinesh Manocha
32
1
0
18 Mar 2024
Sampling-based Safe Reinforcement Learning for Nonlinear Dynamical Systems
Wesley A. Suttle
Vipul K. Sharma
K. Kosaraju
S. Sivaranjani
Ji Liu
Vijay Gupta
Brian M. Sadler
30
1
0
06 Mar 2024
Reusing Historical Trajectories in Natural Policy Gradient via Importance Sampling: Convergence and Convergence Rate
Yifan Lin
Yuhao Wang
Enlu Zhou
51
0
0
01 Mar 2024
Stochastic Gradient Succeeds for Bandits
Jincheng Mei
Zixin Zhong
Bo Dai
Alekh Agarwal
Csaba Szepesvári
Dale Schuurmans
21
1
0
27 Feb 2024
Principled Penalty-based Methods for Bilevel Reinforcement Learning and RLHF
Han Shen
Zhuoran Yang
Tianyi Chen
OffRL
32
14
0
10 Feb 2024
The Definitive Guide to Policy Gradients in Deep Reinforcement Learning: Theory, Algorithms and Implementations
Matthias Lehmann
38
0
0
24 Jan 2024
On the Stochastic (Variance-Reduced) Proximal Gradient Method for Regularized Expected Reward Optimization
Ling Liang
Haizhao Yang
14
0
0
23 Jan 2024
Global Convergence of Natural Policy Gradient with Hessian-aided Momentum Variance Reduction
Jie Feng
Ke Wei
Jinchi Chen
23
1
0
02 Jan 2024
A Large Deviations Perspective on Policy Gradient Algorithms
Wouter Jongeneel
Daniel Kuhn
Mengmeng Li
11
1
0
13 Nov 2023
On the Second-Order Convergence of Biased Policy Gradient Algorithms
Siqiao Mu
Diego Klabjan
35
2
0
05 Nov 2023
Model-Based Reparameterization Policy Gradient Methods: Theory and Practical Algorithms
Shenao Zhang
Boyi Liu
Zhaoran Wang
Tuo Zhao
10
2
0
30 Oct 2023
Optimization Landscape of Policy Gradient Methods for Discrete-time Static Output Feedback
Jingliang Duan
Jie Li
Xuyang Chen
Kai Zhao
Shengbo Eben Li
Lin Zhao
11
5
0
29 Oct 2023
Improved Sample Complexity Analysis of Natural Policy Gradient Algorithm with General Parameterization for Infinite Horizon Discounted Reward Markov Decision Processes
Washim Uddin Mondal
Vaneet Aggarwal
30
9
0
18 Oct 2023
A Fisher-Rao gradient flow for entropy-regularised Markov decision processes in Polish spaces
B. Kerimkulov
J. Leahy
David Siska
Lukasz Szpruch
Yufei Zhang
16
7
0
04 Oct 2023
PARL: A Unified Framework for Policy Alignment in Reinforcement Learning from Human Feedback
Souradip Chakraborty
Amrit Singh Bedi
Alec Koppel
Dinesh Manocha
Huazheng Wang
Mengdi Wang
Furong Huang
23
25
0
03 Aug 2023
On the Global Convergence of Natural Actor-Critic with Two-layer Neural Network Parametrization
Mudit Gaur
Amrit Singh Bedi
Di-di Wang
Vaneet Aggarwal
35
3
0
18 Jun 2023
Achieving Fairness in Multi-Agent Markov Decision Processes Using Reinforcement Learning
Peizhong Ju
A. Ghosh
Ness B. Shroff
30
4
0
01 Jun 2023
Policy Optimization for Continuous Reinforcement Learning
Hanyang Zhao
Wenpin Tang
D. Yao
OffRL
32
17
0
30 May 2023
Policy Gradient Algorithms Implicitly Optimize by Continuation
Adrien Bolland
Gilles Louppe
D. Ernst
29
3
0
11 May 2023
A Cubic-regularized Policy Newton Algorithm for Reinforcement Learning
Mizhaan Prajit Maniyar
Akash Mondal
Prashanth L.A.
S. Bhatnagar
30
0
0
21 Apr 2023
Matryoshka Policy Gradient for Entropy-Regularized RL: Convergence and Global Optimality
François Ged
M. H. Veiga
21
0
0
22 Mar 2023
Revisiting LQR Control from the Perspective of Receding-Horizon Policy Gradient
Xiangyuan Zhang
Tamer Basar
28
19
0
25 Feb 2023
Stochastic Policy Gradient Methods: Improved Sample Complexity for Fisher-non-degenerate Policies
Ilyas Fatkhullin
Anas Barakat
Anastasia Kireeva
Niao He
19
37
0
03 Feb 2023
Stochastic Dimension-reduced Second-order Methods for Policy Optimization
Jinsong Liu
Chen Xie
Qinwen Deng
Dongdong Ge
Yi-Li Ye
19
1
0
28 Jan 2023
Beyond Exponentially Fast Mixing in Average-Reward Reinforcement Learning via Multi-Level Monte Carlo Actor-Critic
Wesley A. Suttle
Amrit Singh Bedi
Bhrij Patel
Brian M. Sadler
Alec Koppel
Dinesh Manocha
16
13
0
28 Jan 2023
Understanding the Complexity Gains of Single-Task RL with a Curriculum
Qiyang Li
Yuexiang Zhai
Yi-An Ma
Sergey Levine
32
14
0
24 Dec 2022
An Improved Analysis of (Variance-Reduced) Policy Gradient and Natural Policy Gradient Methods
Yanli Liu
K. Zhang
Tamer Basar
W. Yin
30
102
0
15 Nov 2022
Geometry and convergence of natural policy gradient methods
Johannes Muller
Guido Montúfar
8
10
0
03 Nov 2022
Symmetric (Optimistic) Natural Policy Gradient for Multi-agent Learning with Parameter Convergence
S. Pattathil
K. Zhang
Asuman Ozdaglar
19
12
0
23 Oct 2022
Finite-time analysis of single-timescale actor-critic
Xu-yang Chen
Lin Zhao
OffRL
10
20
0
18 Oct 2022
On the convergence of policy gradient methods to Nash equilibria in general stochastic games
Angeliki Giannou
Kyriakos Lotidis
P. Mertikopoulos
Emmanouil-Vasileios Vlatakis-Gkaragkounis
13
17
0
17 Oct 2022
Decentralized Policy Gradient for Nash Equilibria Learning of General-sum Stochastic Games
Yan Chen
Taoying Li
16
2
0
14 Oct 2022
RTAW: An Attention Inspired Reinforcement Learning Method for Multi-Robot Task Allocation in Warehouse Environments
Aakriti Agrawal
Amrit Singh Bedi
Dinesh Manocha
32
18
0
13 Sep 2022
Sampling Through the Lens of Sequential Decision Making
J. Dou
Alvin Pan
Runxue Bao
Haiyi Mao
Lei Luo
Zhi-Hong Mao
22
19
0
17 Aug 2022
A Single-Timescale Analysis For Stochastic Approximation With Multiple Coupled Sequences
Han Shen
Tianyi Chen
30
15
0
21 Jun 2022
How are policy gradient methods affected by the limits of control?
Ingvar M. Ziemann
Anastasios Tsiamis
H. Sandberg
Nikolai Matni
25
14
0
14 Jun 2022
Variance Reduction for Policy-Gradient Methods via Empirical Variance Minimization
Maxim Kaledin
Alexander Golubev
Denis Belomestny
OffRL
14
3
0
14 Jun 2022
Achieving Zero Constraint Violation for Constrained Reinforcement Learning via Conservative Natural Policy Gradient Primal-Dual Algorithm
Qinbo Bai
Amrit Singh Bedi
Vaneet Aggarwal
21
20
0
12 Jun 2022