Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies

19 June 2019

Papers citing "Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies"

50 / 111 papers shown

Analysis of On-policy Policy Gradient Methods under the Distribution Mismatch Weizhen Wang Jianping He Xiaoming Duan 32 0 0 28 Mar 2025
The Lagrangian Method for Solving Constrained Markov Games Soham Das Santiago Paternain Luiz F. O. Chamon Ceyhun Eksin 45 0 0 13 Mar 2025
Small steps no more: Global convergence of stochastic gradient bandits for arbitrary learning rates Jincheng Mei Bo Dai Alekh Agarwal Sharan Vaswani Anant Raj Csaba Szepesvári Dale Schuurmans 87 0 0 11 Feb 2025
Spatio-Temporal SIR Model of Pandemic Spread During Warfare with Optimal Dual-use Healthcare System Administration using Deep Reinforcement Learning Adi Shuchami Teddy Lazebnik 69 0 0 18 Dec 2024
Structure Matters: Dynamic Policy Gradient Sara Klein Xiangyuan Zhang Tamer Basar Simon Weissmann Leif Döring 35 0 0 07 Nov 2024
Improved Sample Complexity for Global Convergence of Actor-Critic Algorithms Navdeep Kumar Priyank Agrawal Giorgia Ramponi Kfir Y. Levy Shie Mannor 33 0 0 11 Oct 2024
Heavy-Ball Momentum Accelerated Actor-Critic With Function Approximation Yanjie Dong Haijun Zhang Gang Wang Shisheng Cui Xiping Hu 33 1 0 13 Aug 2024
Enabling High Data Throughput Reinforcement Learning on GPUs: A Domain Agnostic Framework for Data-Driven Scientific Research Tian Lan Huan Wang Caiming Xiong Silvio Savarese AI4CE 19 0 0 01 Aug 2024
Bilevel reinforcement learning via the development of hyper-gradient without lower-level convexity Yan Yang Bin Gao Ya-xiang Yuan 36 2 0 30 May 2024
Safe and Balanced: A Framework for Constrained Multi-Objective Reinforcement Learning Shangding Gu Bilgehan Sel Yuhao Ding Lu Wang Qingwei Lin Alois Knoll Ming Jin 40 1 0 26 May 2024
Almost sure convergence rates of stochastic gradient methods under gradient domination Simon Weissmann Sara Klein Waïss Azizian Leif Döring 34 3 0 22 May 2024
Asynchronous Federated Reinforcement Learning with Policy Gradient Updates: Algorithm Design and Convergence Analysis Guangchen Lan Dong-Jun Han Abolfazl Hashemi Vaneet Aggarwal Christopher G. Brinton 122 15 0 09 Apr 2024
Towards Global Optimality for Practical Average Reward Reinforcement Learning without Mixing Time Oracles Bhrij Patel Wesley A. Suttle Alec Koppel Vaneet Aggarwal Brian M. Sadler Amrit Singh Bedi Dinesh Manocha 32 1 0 18 Mar 2024
Sampling-based Safe Reinforcement Learning for Nonlinear Dynamical Systems Wesley A. Suttle Vipul K. Sharma K. Kosaraju S. Sivaranjani Ji Liu Vijay Gupta Brian M. Sadler 30 1 0 06 Mar 2024
Reusing Historical Trajectories in Natural Policy Gradient via Importance Sampling: Convergence and Convergence Rate Yifan Lin Yuhao Wang Enlu Zhou 51 0 0 01 Mar 2024
Stochastic Gradient Succeeds for Bandits Jincheng Mei Zixin Zhong Bo Dai Alekh Agarwal Csaba Szepesvári Dale Schuurmans 21 1 0 27 Feb 2024
Principled Penalty-based Methods for Bilevel Reinforcement Learning and RLHF Han Shen Zhuoran Yang Tianyi Chen OffRL 32 14 0 10 Feb 2024
The Definitive Guide to Policy Gradients in Deep Reinforcement Learning: Theory, Algorithms and Implementations Matthias Lehmann 38 0 0 24 Jan 2024
On the Stochastic (Variance-Reduced) Proximal Gradient Method for Regularized Expected Reward Optimization Ling Liang Haizhao Yang 14 0 0 23 Jan 2024
Global Convergence of Natural Policy Gradient with Hessian-aided Momentum Variance Reduction Jie Feng Ke Wei Jinchi Chen 23 1 0 02 Jan 2024
A Large Deviations Perspective on Policy Gradient Algorithms Wouter Jongeneel Daniel Kuhn Mengmeng Li 11 1 0 13 Nov 2023
On the Second-Order Convergence of Biased Policy Gradient Algorithms Siqiao Mu Diego Klabjan 35 2 0 05 Nov 2023
Model-Based Reparameterization Policy Gradient Methods: Theory and Practical Algorithms Shenao Zhang Boyi Liu Zhaoran Wang Tuo Zhao 10 2 0 30 Oct 2023
Optimization Landscape of Policy Gradient Methods for Discrete-time Static Output Feedback Jingliang Duan Jie Li Xuyang Chen Kai Zhao Shengbo Eben Li Lin Zhao 11 5 0 29 Oct 2023
Improved Sample Complexity Analysis of Natural Policy Gradient Algorithm with General Parameterization for Infinite Horizon Discounted Reward Markov Decision Processes Washim Uddin Mondal Vaneet Aggarwal 30 9 0 18 Oct 2023
A Fisher-Rao gradient flow for entropy-regularised Markov decision processes in Polish spaces B. Kerimkulov J. Leahy David Siska Lukasz Szpruch Yufei Zhang 16 7 0 04 Oct 2023
PARL: A Unified Framework for Policy Alignment in Reinforcement Learning from Human Feedback Souradip Chakraborty Amrit Singh Bedi Alec Koppel Dinesh Manocha Huazheng Wang Mengdi Wang Furong Huang 23 25 0 03 Aug 2023
On the Global Convergence of Natural Actor-Critic with Two-layer Neural Network Parametrization Mudit Gaur Amrit Singh Bedi Di-di Wang Vaneet Aggarwal 35 3 0 18 Jun 2023
Achieving Fairness in Multi-Agent Markov Decision Processes Using Reinforcement Learning Peizhong Ju A. Ghosh Ness B. Shroff 30 4 0 01 Jun 2023
Policy Optimization for Continuous Reinforcement Learning Hanyang Zhao Wenpin Tang D. Yao OffRL 32 17 0 30 May 2023
Policy Gradient Algorithms Implicitly Optimize by Continuation Adrien Bolland Gilles Louppe D. Ernst 29 3 0 11 May 2023
A Cubic-regularized Policy Newton Algorithm for Reinforcement Learning Mizhaan Prajit Maniyar Akash Mondal Prashanth L.A. S. Bhatnagar 30 0 0 21 Apr 2023
Matryoshka Policy Gradient for Entropy-Regularized RL: Convergence and Global Optimality François Ged M. H. Veiga 21 0 0 22 Mar 2023
Revisiting LQR Control from the Perspective of Receding-Horizon Policy Gradient Xiangyuan Zhang Tamer Basar 28 19 0 25 Feb 2023
Stochastic Policy Gradient Methods: Improved Sample Complexity for Fisher-non-degenerate Policies Ilyas Fatkhullin Anas Barakat Anastasia Kireeva Niao He 19 37 0 03 Feb 2023
Stochastic Dimension-reduced Second-order Methods for Policy Optimization Jinsong Liu Chen Xie Qinwen Deng Dongdong Ge Yi-Li Ye 19 1 0 28 Jan 2023
Beyond Exponentially Fast Mixing in Average-Reward Reinforcement Learning via Multi-Level Monte Carlo Actor-Critic Wesley A. Suttle Amrit Singh Bedi Bhrij Patel Brian M. Sadler Alec Koppel Dinesh Manocha 16 13 0 28 Jan 2023
Understanding the Complexity Gains of Single-Task RL with a Curriculum Qiyang Li Yuexiang Zhai Yi-An Ma Sergey Levine 32 14 0 24 Dec 2022
An Improved Analysis of (Variance-Reduced) Policy Gradient and Natural Policy Gradient Methods Yanli Liu K. Zhang Tamer Basar W. Yin 30 102 0 15 Nov 2022
Geometry and convergence of natural policy gradient methods Johannes Muller Guido Montúfar 8 10 0 03 Nov 2022
Symmetric (Optimistic) Natural Policy Gradient for Multi-agent Learning with Parameter Convergence S. Pattathil K. Zhang Asuman Ozdaglar 19 12 0 23 Oct 2022
Finite-time analysis of single-timescale actor-critic Xu-yang Chen Lin Zhao OffRL 10 20 0 18 Oct 2022
On the convergence of policy gradient methods to Nash equilibria in general stochastic games Angeliki Giannou Kyriakos Lotidis P. Mertikopoulos Emmanouil-Vasileios Vlatakis-Gkaragkounis 13 17 0 17 Oct 2022
Decentralized Policy Gradient for Nash Equilibria Learning of General-sum Stochastic Games Yan Chen Taoying Li 16 2 0 14 Oct 2022
RTAW: An Attention Inspired Reinforcement Learning Method for Multi-Robot Task Allocation in Warehouse Environments Aakriti Agrawal Amrit Singh Bedi Dinesh Manocha 32 18 0 13 Sep 2022
Sampling Through the Lens of Sequential Decision Making J. Dou Alvin Pan Runxue Bao Haiyi Mao Lei Luo Zhi-Hong Mao 22 19 0 17 Aug 2022
A Single-Timescale Analysis For Stochastic Approximation With Multiple Coupled Sequences Han Shen Tianyi Chen 30 15 0 21 Jun 2022
How are policy gradient methods affected by the limits of control? Ingvar M. Ziemann Anastasios Tsiamis H. Sandberg Nikolai Matni 25 14 0 14 Jun 2022
Variance Reduction for Policy-Gradient Methods via Empirical Variance Minimization Maxim Kaledin Alexander Golubev Denis Belomestny OffRL 14 3 0 14 Jun 2022
Achieving Zero Constraint Violation for Constrained Reinforcement Learning via Conservative Natural Policy Gradient Primal-Dual Algorithm Qinbo Bai Amrit Singh Bedi Vaneet Aggarwal 21 20 0 12 Jun 2022