ResearchTrend.AI
Global Optimality Guarantees For Policy Gradient Methods

5 June 2019
Jalaj Bhandari
Daniel Russo

Papers citing "Global Optimality Guarantees For Policy Gradient Methods"

50 / 122 papers shown
Remarks on the Polyak-Lojasiewicz inequality and the convergence of gradient systems
A. C. B. D. Oliveira
Leilei Cui
Eduardo Sontag
36
0
0
31 Mar 2025
Near-Optimal Sample Complexity for Iterated CVaR Reinforcement Learning with a Generative Model
Zilong Deng
Simon Khan
Shaofeng Zou
54
0
0
11 Mar 2025
Infinite Horizon Markov Economies
Denizalp Goktas
Sadie Zhao
Yiling Chen
A. Greenwald
34
0
0
22 Feb 2025
FedRLHF: A Convergence-Guaranteed Federated Framework for Privacy-Preserving and Personalized RLHF
Flint Xiaofeng Fan
Cheston Tan
Yew-Soon Ong
Roger Wattenhofer
Wei Tsang Ooi
85
1
0
20 Dec 2024
Monte Carlo Tree Search with Spectral Expansion for Planning with Dynamical Systems
Benjamin Rivière
John Lathrop
Soon-Jo Chung
72
1
0
15 Dec 2024
RL-STaR: Theoretical Analysis of Reinforcement Learning Frameworks for Self-Taught Reasoner
Fu-Chieh Chang
Yu-Ting Lee
Hui-Ying Shih
Pei-Yuan Wu
OffRL
LRM
142
0
0
31 Oct 2024
Improved Sample Complexity for Global Convergence of Actor-Critic Algorithms
Navdeep Kumar
Priyank Agrawal
Giorgia Ramponi
Kfir Y. Levy
Shie Mannor
33
0
0
11 Oct 2024
Regressing the Relative Future: Efficient Policy Optimization for Multi-turn RLHF
Zhaolin Gao
Wenhao Zhan
Jonathan D. Chang
Gokul Swamy
Kianté Brantley
Jason D. Lee
Wen Sun
OffRL
58
3
0
06 Oct 2024
Near-Optimal Policy Identification in Robust Constrained Markov Decision Processes via Epigraph Form
Toshinori Kitamura
Tadashi Kozuno
Wataru Kumagai
Kenta Hoshino
Y. Hosoe
Kazumi Kasaura
Masashi Hamaya
Paavo Parmas
Yutaka Matsuo
72
0
0
29 Aug 2024
Functional Acceleration for Policy Mirror Descent
Veronica Chelu
Doina Precup
28
0
0
23 Jul 2024
Last-Iterate Global Convergence of Policy Gradients for Constrained Reinforcement Learning
Alessandro Montenegro
Marco Mussi
Matteo Papini
Alberto Maria Metelli
BDL
38
1
0
15 Jul 2024
Building Socially-Equitable Public Models
Yejia Liu
Jianyi Yang
Pengfei Li
Tongxin Li
Shaolei Ren
OffRL
42
0
0
04 Jun 2024
Performance of NPG in Countable State-Space Average-Cost RL
Yashaswini Murthy
Isaac Grosof
S. T. Maguluri
R. Srikant
OffRL
29
1
0
30 May 2024
Recurrent Natural Policy Gradient for POMDPs
Semih Cayci
A. Eryilmaz
24
0
0
28 May 2024
Policy Gradient Methods for Risk-Sensitive Distributional Reinforcement Learning with Provable Convergence
Minheng Xiao
Xian Yu
Lei Ying
32
2
0
23 May 2024
Fast Stochastic Policy Gradient: Negative Momentum for Reinforcement Learning
Haobin Zhang
Zhuang Yang
27
0
0
08 May 2024
Linear Convergence of Independent Natural Policy Gradient in Games with Entropy Regularization
Youbang Sun
Tao-Wen Liu
P. R. Kumar
Shahin Shahrampour
37
0
0
04 May 2024
Learning Optimal Deterministic Policies with Stochastic Policy Gradients
Alessandro Montenegro
Marco Mussi
Alberto Maria Metelli
Matteo Papini
42
2
0
03 May 2024
Policy Mirror Descent with Lookahead
Kimon Protopapas
Anas Barakat
24
1
0
21 Mar 2024
Towards Global Optimality for Practical Average Reward Reinforcement Learning without Mixing Time Oracles
Bhrij Patel
Wesley A. Suttle
Alec Koppel
Vaneet Aggarwal
Brian M. Sadler
Amrit Singh Bedi
Dinesh Manocha
32
1
0
18 Mar 2024
On the Global Convergence of Policy Gradient in Average Reward Markov Decision Processes
Navdeep Kumar
Yashaswini Murthy
Itai Shufaro
Kfir Y. Levy
R. Srikant
Shie Mannor
34
2
0
11 Mar 2024
Principled Penalty-based Methods for Bilevel Reinforcement Learning and RLHF
Han Shen
Zhuoran Yang
Tianyi Chen
OffRL
32
14
0
10 Feb 2024
Efficient Reinforcement Learning for Routing Jobs in Heterogeneous Queueing Systems
Neharika Jali
Guannan Qu
Weina Wang
Gauri Joshi
11
5
0
02 Feb 2024
Behind the Myth of Exploration in Policy Gradients
Adrien Bolland
Gaspard Lambrechts
Damien Ernst
51
0
0
31 Jan 2024
PPO-Clip Attains Global Optimality: Towards Deeper Understandings of Clipping
Nai-Chieh Huang
Ping-Chun Hsieh
Kuo-Hao Ho
I-Chen Wu
16
8
0
19 Dec 2023
Fast Policy Learning for Linear Quadratic Control with Entropy Regularization
Xin Guo
Xinyu Li
Renyuan Xu
34
3
0
23 Nov 2023
A Large Deviations Perspective on Policy Gradient Algorithms
Wouter Jongeneel
Daniel Kuhn
Mengmeng Li
20
1
0
13 Nov 2023
On the Second-Order Convergence of Biased Policy Gradient Algorithms
Siqiao Mu
Diego Klabjan
43
2
0
05 Nov 2023
Model-Based Reparameterization Policy Gradient Methods: Theory and Practical Algorithms
Shenao Zhang
Boyi Liu
Zhaoran Wang
Tuo Zhao
16
2
0
30 Oct 2023
Optimization Landscape of Policy Gradient Methods for Discrete-time Static Output Feedback
Jingliang Duan
Jie Li
Xuyang Chen
Kai Zhao
Shengbo Eben Li
Lin Zhao
11
5
0
29 Oct 2023
Weakly Coupled Deep Q-Networks
Ibrahim El Shar
Daniel R. Jiang
19
2
0
28 Oct 2023
When is Agnostic Reinforcement Learning Statistically Tractable?
Zeyu Jia
Gene Li
Alexander Rakhlin
Ayush Sekhari
Nathan Srebro
OffRL
27
5
0
09 Oct 2023
Global Convergence of Policy Gradient Methods in Reinforcement Learning, Games and Control
Shicong Cen
Yuejie Chi
42
1
0
08 Oct 2023
A Fisher-Rao gradient flow for entropy-regularised Markov decision processes in Polish spaces
B. Kerimkulov
J. Leahy
David Siska
Lukasz Szpruch
Yufei Zhang
16
7
0
04 Oct 2023
Beyond Stationarity: Convergence Analysis of Stochastic Softmax Policy Gradient Methods
Sara Klein
Simon Weissmann
Leif Döring
24
7
0
04 Oct 2023
Rate-Optimal Policy Optimization for Linear Markov Decision Processes
Uri Sherman
Alon Cohen
Tomer Koren
Yishay Mansour
33
7
0
28 Aug 2023
Submodular Reinforcement Learning
Manish Prajapat
Mojmír Mutný
M. Zeilinger
Andreas Krause
OffRL
28
12
0
25 Jul 2023
An Analysis of Multi-Agent Reinforcement Learning for Decentralized Inventory Control Systems
Marwan Mousa
Damien van de Berg
Niki Kotecha
Ehecatl Antonio del Rio Chanona
M. Mowbray
10
13
0
21 Jul 2023
Acceleration in Policy Optimization
Veronica Chelu
Tom Zahavy
A. Guez
Doina Precup
Sebastian Flennerhag
43
0
0
18 Jun 2023
On the Global Convergence of Natural Actor-Critic with Two-layer Neural Network Parametrization
Mudit Gaur
Amrit Singh Bedi
Di-di Wang
Vaneet Aggarwal
35
3
0
18 Jun 2023
The RL Perceptron: Generalisation Dynamics of Policy Learning in High Dimensions
Nishil Patel
Sebastian Lee
Stefano Sarao Mannelli
Sebastian Goldt
Andrew Saxe
OffRL
25
3
0
17 Jun 2023
Low-Switching Policy Gradient with Exploration via Online Sensitivity Sampling
Yunfan Li
Yiran Wang
Y. Cheng
Lin F. Yang
OffRL
29
4
0
15 Jun 2023
Solving Robust MDPs through No-Regret Dynamics
E. Guha
27
0
0
30 May 2023
Optimistic Natural Policy Gradient: a Simple Efficient Policy Optimization Framework for Online RL
Qinghua Liu
Gellert Weisz
András Gyorgy
Chi Jin
Csaba Szepesvári
OffRL
21
8
0
18 May 2023
Policy Gradient Algorithms Implicitly Optimize by Continuation
Adrien Bolland
Gilles Louppe
D. Ernst
33
3
0
11 May 2023
Connected Superlevel Set in (Deep) Reinforcement Learning and its Application to Minimax Theorems
Sihan Zeng
Thinh T. Doan
J. Romberg
OffRL
27
3
0
23 Mar 2023
Matryoshka Policy Gradient for Entropy-Regularized RL: Convergence and Global Optimality
François Ged
M. H. Veiga
21
0
0
22 Mar 2023
Optimal Convergence Rate for Exact Policy Mirror Descent in Discounted Markov Decision Processes
Emmeran Johnson
Ciara Pike-Burke
Patrick Rebeschini
26
11
0
22 Feb 2023
Efficient Exploration via Epistemic-Risk-Seeking Policy Optimization
Brendan O'Donoghue
OffRL
27
6
0
18 Feb 2023
Digital Twin-Aided Learning for Managing Reconfigurable Intelligent Surface-Assisted, Uplink, User-Centric Cell-Free Systems
Ying-Kai Cui
Tiejun Lv
Wei Ni
Abbas Jamalipour
16
6
0
10 Feb 2023