arXiv:2102.11270
Softmax Policy Gradient Methods Can Take Exponential Time to Converge
22 February 2021
Gen Li, Yuting Wei, Yuejie Chi, Yuxin Chen
Papers citing "Softmax Policy Gradient Methods Can Take Exponential Time to Converge" (16 papers)
Ordering-based Conditions for Global Convergence of Policy Gradient Methods
Jincheng Mei, Bo Dai, Alekh Agarwal, Mohammad Ghavamzadeh, Csaba Szepesvári, Dale Schuurmans
02 Apr 2025

Small steps no more: Global convergence of stochastic gradient bandits for arbitrary learning rates
Jincheng Mei, Bo Dai, Alekh Agarwal, Sharan Vaswani, Anant Raj, Csaba Szepesvári, Dale Schuurmans
11 Feb 2025

Learning Optimal Deterministic Policies with Stochastic Policy Gradients
Alessandro Montenegro, Marco Mussi, Alberto Maria Metelli, Matteo Papini
03 May 2024

A Large Deviations Perspective on Policy Gradient Algorithms
Wouter Jongeneel, Daniel Kuhn, Mengmeng Li
13 Nov 2023

Matryoshka Policy Gradient for Entropy-Regularized RL: Convergence and Global Optimality
François Ged, M. H. Veiga
22 Mar 2023

SoftTreeMax: Exponential Variance Reduction in Policy Gradient via Tree Search
Gal Dalal, Assaf Hallak, Gugan Thoppe, Shie Mannor, Gal Chechik
30 Jan 2023

The Role of Baselines in Policy Gradient Optimization
Jincheng Mei, Wesley Chung, Valentin Thomas, Bo Dai, Csaba Szepesvári, Dale Schuurmans
16 Jan 2023

Faster Last-iterate Convergence of Policy Optimization in Zero-Sum Markov Games
Shicong Cen, Yuejie Chi, S. Du, Lin Xiao
03 Oct 2022

Convergence and Price of Anarchy Guarantees of the Softmax Policy Gradient in Markov Potential Games
Dingyang Chen, Qi Zhang, Thinh T. Doan
15 Jun 2022

Independent Natural Policy Gradient Methods for Potential Games: Finite-time Global Convergence with Entropy Regularization
Shicong Cen, Fan Chen, Yuejie Chi
12 Apr 2022

Accelerating Primal-dual Methods for Regularized Markov Decision Processes
Haoya Li, Hsiang-Fu Yu, Lexing Ying, Inderjit Dhillon
21 Feb 2022

Convergence of Policy Gradient for Entropy Regularized MDPs with Neural Network Approximation in the Mean-Field Regime
B. Kerimkulov, J. Leahy, David Siska, Lukasz Szpruch
18 Jan 2022

Approximate Newton policy gradient algorithms
Haoya Li, Samarth Gupta, Hsiangfu Yu, Lexing Ying, Inderjit Dhillon
05 Oct 2021

Theoretical Guarantees of Fictitious Discount Algorithms for Episodic Reinforcement Learning and Global Convergence of Policy Gradient Methods
Xin Guo, Anran Hu, Junzi Zhang
13 Sep 2021

Policy Mirror Descent for Reinforcement Learning: Linear Convergence, New Sampling Complexity, and Generalized Problem Classes
Guanghui Lan
30 Jan 2021

A Finite Time Analysis of Two Time-Scale Actor Critic Methods
Yue Wu, Weitong Zhang, Pan Xu, Quanquan Gu
04 May 2020