On the Global Convergence Rates of Softmax Policy Gradient Methods

13 May 2020
Jincheng Mei
Chenjun Xiao
Csaba Szepesvári
Dale Schuurmans

Papers citing "On the Global Convergence Rates of Softmax Policy Gradient Methods"

50 / 185 papers shown
Improved Sample Complexity Analysis of Natural Policy Gradient Algorithm with General Parameterization for Infinite Horizon Discounted Reward Markov Decision Processes
Washim Uddin Mondal
Vaneet Aggarwal
30
9
0
18 Oct 2023
Provably Fast Convergence of Independent Natural Policy Gradient for Markov Potential Games
Youbang Sun
Tao-Wen Liu
Ruida Zhou
P. R. Kumar
Shahin Shahrampour
28
11
0
15 Oct 2023
Global Convergence of Policy Gradient Methods in Reinforcement Learning, Games and Control
Shicong Cen
Yuejie Chi
42
1
0
08 Oct 2023
A Fisher-Rao gradient flow for entropy-regularised Markov decision processes in Polish spaces
B. Kerimkulov
J. Leahy
David Siska
Lukasz Szpruch
Yufei Zhang
16
7
0
04 Oct 2023
Beyond Stationarity: Convergence Analysis of Stochastic Softmax Policy Gradient Methods
Sara Klein
Simon Weissmann
Leif Döring
21
7
0
04 Oct 2023
Limits of Actor-Critic Algorithms for Decision Tree Policies Learning in IBMDPs
Hector Kohler
R. Akrour
Philippe Preux
OffRL
14
2
0
23 Sep 2023
PARL: A Unified Framework for Policy Alignment in Reinforcement Learning from Human Feedback
Souradip Chakraborty
Amrit Singh Bedi
Alec Koppel
Dinesh Manocha
Huazheng Wang
Mengdi Wang
Furong Huang
23
25
0
03 Aug 2023
Natural Actor-Critic for Robust Reinforcement Learning with Function Approximation
Ruida Zhou
Tao-Wen Liu
Min Cheng
D. Kalathil
P. R. Kumar
Chao Tian
35
19
0
17 Jul 2023
Multi-Player Zero-Sum Markov Games with Networked Separable Interactions
Chanwoo Park
K. Zhang
Asuman Ozdaglar
28
8
0
13 Jul 2023
Provably Robust Temporal Difference Learning for Heavy-Tailed Rewards
Semih Cayci
A. Eryilmaz
18
2
0
20 Jun 2023
Acceleration in Policy Optimization
Veronica Chelu
Tom Zahavy
A. Guez
Doina Precup
Sebastian Flennerhag
33
0
0
18 Jun 2023
Identifiability and Generalizability in Constrained Inverse Reinforcement Learning
Andreas Schlaginhaufen
Maryam Kamgarpour
18
10
0
01 Jun 2023
Achieving Fairness in Multi-Agent Markov Decision Processes Using Reinforcement Learning
Peizhong Ju
A. Ghosh
Ness B. Shroff
30
4
0
01 Jun 2023
On the Linear Convergence of Policy Gradient under Hadamard Parameterization
Jiacai Liu
Jinchi Chen
Ke Wei
16
2
0
31 May 2023
Decision-Aware Actor-Critic with Function Approximation and Theoretical Guarantees
Sharan Vaswani
A. Kazemi
Reza Babanezhad
Nicolas Le Roux
OffRL
18
3
0
24 May 2023
Connected Superlevel Set in (Deep) Reinforcement Learning and its Application to Minimax Theorems
Sihan Zeng
Thinh T. Doan
J. Romberg
OffRL
22
3
0
23 Mar 2023
Matryoshka Policy Gradient for Entropy-Regularized RL: Convergence and Global Optimality
François Ged
M. H. Veiga
21
0
0
22 Mar 2023
Convergence Rates for Localized Actor-Critic in Networked Markov Potential Games
Zhaoyi Zhou
Zaiwei Chen
Yiheng Lin
Adam Wierman
32
7
0
08 Mar 2023
Finding Regularized Competitive Equilibria of Heterogeneous Agent Macroeconomic Models with Reinforcement Learning
Ruitu Xu
Yifei Min
Tianhao Wang
Zhaoran Wang
Michael I. Jordan
Zhuoran Yang
28
6
0
24 Feb 2023
Efficient Exploration via Epistemic-Risk-Seeking Policy Optimization
Brendan O'Donoghue
OffRL
27
6
0
18 Feb 2023
Stochastic Policy Gradient Methods: Improved Sample Complexity for Fisher-non-degenerate Policies
Ilyas Fatkhullin
Anas Barakat
Anastasia Kireeva
Niao He
19
37
0
03 Feb 2023
Performance Bounds for Policy-Based Average Reward Reinforcement Learning Algorithms
Yashaswini Murthy
Mehrdad Moharrami
R. Srikant
OffRL
16
5
0
02 Feb 2023
SoftTreeMax: Exponential Variance Reduction in Policy Gradient via Tree Search
Gal Dalal
Assaf Hallak
Gugan Thoppe
Shie Mannor
Gal Chechik
22
3
0
30 Jan 2023
A Novel Framework for Policy Mirror Descent with General Parameterization and Linear Convergence
Carlo Alfano
Rui Yuan
Patrick Rebeschini
54
15
0
30 Jan 2023
Fast Computation of Optimal Transport via Entropy-Regularized Extragradient Methods
Gen Li
Yanxi Chen
Yu Huang
Yuejie Chi
H. Vincent Poor
Yuxin Chen
OT
41
5
0
30 Jan 2023
Beyond Exponentially Fast Mixing in Average-Reward Reinforcement Learning via Multi-Level Monte Carlo Actor-Critic
Wesley A. Suttle
Amrit Singh Bedi
Bhrij Patel
Brian M. Sadler
Alec Koppel
Dinesh Manocha
16
13
0
28 Jan 2023
On the Global Convergence of Risk-Averse Policy Gradient Methods with Expected Conditional Risk Measures
Xian Yu
Lei Ying
11
5
0
26 Jan 2023
The Role of Baselines in Policy Gradient Optimization
Jincheng Mei
Wesley Chung
Valentin Thomas
Bo Dai
Csaba Szepesvári
Dale Schuurmans
16
15
0
16 Jan 2023
Policy Mirror Ascent for Efficient and Independent Learning in Mean Field Games
Batuhan Yardim
Semih Cayci
M. Geist
Niao He
51
27
0
29 Dec 2022
Understanding the Complexity Gains of Single-Task RL with a Curriculum
Qiyang Li
Yuexiang Zhai
Yi-An Ma
Sergey Levine
32
14
0
24 Dec 2022
Policy Gradient in Robust MDPs with Global Convergence Guarantee
Qiuhao Wang
C. Ho
Marek Petrik
19
24
0
20 Dec 2022
Robust Policy Optimization in Deep Reinforcement Learning
Md Masudur Rahman
Yexiang Xue
9
9
0
14 Dec 2022
Scalable and Sample Efficient Distributed Policy Gradient Algorithms in Multi-Agent Networked Systems
Xin Liu
Honghao Wei
Lei Ying
18
6
0
13 Dec 2022
Coordinate Ascent for Off-Policy RL with Global Convergence Guarantees
Hsin-En Su
Yen-Ju Chen
Ping-Chun Hsieh
Xi Liu
OffRL
13
0
0
10 Dec 2022
Global Convergence of Localized Policy Iteration in Networked Multi-Agent Reinforcement Learning
Yizhou Zhang
Guannan Qu
Pan Xu
Yiheng Lin
Zaiwei Chen
Adam Wierman
21
25
0
30 Nov 2022
Geometry and convergence of natural policy gradient methods
Johannes Müller
Guido Montúfar
8
9
0
03 Nov 2022
Convergence of policy gradient methods for finite-horizon exploratory linear-quadratic control problems
Michael Giegrich
Christoph Reisinger
Yufei Zhang
16
11
0
01 Nov 2022
Symmetric (Optimistic) Natural Policy Gradient for Multi-agent Learning with Parameter Convergence
S. Pattathil
K. Zhang
Asuman Ozdaglar
19
12
0
23 Oct 2022
On the connection between Bregman divergence and value in regularized Markov decision processes
Brendan O'Donoghue
OffRL
19
2
0
21 Oct 2022
From Gradient Flow on Population Loss to Learning with Stochastic Gradient Descent
Satyen Kale
Jason D. Lee
Chris De Sa
Ayush Sekhari
Karthik Sridharan
19
4
0
13 Oct 2022
Faster Last-iterate Convergence of Policy Optimization in Zero-Sum Markov Games
Shicong Cen
Yuejie Chi
S. Du
Lin Xiao
48
35
0
03 Oct 2022
Linear Convergence for Natural Policy Gradient with Log-linear Policy Parametrization
Carlo Alfano
Patrick Rebeschini
49
13
0
30 Sep 2022
SoftTreeMax: Policy Gradient with Tree Search
Gal Dalal
Assaf Hallak
Shie Mannor
Gal Chechik
11
1
0
28 Sep 2022
Robust Constrained Reinforcement Learning
Yue Wang
Fei Miao
Shaofeng Zou
21
12
0
14 Sep 2022
Boosted Off-Policy Learning
Ben London
Levi Lu
Ted Sandler
Thorsten Joachims
OffRL
33
4
0
01 Aug 2022
Actor-Critic based Improper Reinforcement Learning
Mohammadi Zaki
Avinash Mohan
Aditya Gopalan
Shie Mannor
11
2
0
19 Jul 2022
Towards Global Optimality in Cooperative MARL with the Transformation And Distillation Framework
Jianing Ye
Chenghao Li
Jianhao Wang
Chongjie Zhang
37
2
0
12 Jul 2022
The Power of Regularization in Solving Extensive-Form Games
Mingyang Liu
Asuman Ozdaglar
Tiancheng Yu
K. Zhang
14
20
0
19 Jun 2022
Convergence and Price of Anarchy Guarantees of the Softmax Policy Gradient in Markov Potential Games
Dingyang Chen
Qi Zhang
Thinh T. Doan
15
12
0
15 Jun 2022
Achieving Zero Constraint Violation for Constrained Reinforcement Learning via Conservative Natural Policy Gradient Primal-Dual Algorithm
Qinbo Bai
Amrit Singh Bedi
Vaneet Aggarwal
21
20
0
12 Jun 2022