Beyond Exact Gradients: Convergence of Stochastic Soft-Max Policy Gradient Methods with Entropy Regularization

19 October 2021

Papers citing "Beyond Exact Gradients: Convergence of Stochastic Soft-Max Policy Gradient Methods with Entropy Regularization"

6 / 6 papers shown

Title
Small steps no more: Global convergence of stochastic gradient bandits for arbitrary learning rates Jincheng Mei Bo Dai Alekh Agarwal Sharan Vaswani Anant Raj Csaba Szepesvári Dale Schuurmans 82 0 0 11 Feb 2025
A Large Deviations Perspective on Policy Gradient Algorithms Wouter Jongeneel Daniel Kuhn Mengmeng Li 11 1 0 13 Nov 2023
Matryoshka Policy Gradient for Entropy-Regularized RL: Convergence and Global Optimality François Ged M. H. Veiga 21 0 0 22 Mar 2023
A general sample complexity analysis of vanilla policy gradient Rui Yuan Robert Mansel Gower A. Lazaric 66 62 0 23 Jul 2021
On the Convergence and Sample Efficiency of Variance-Reduced Policy Gradient Method Junyu Zhang Chengzhuo Ni Zheng Yu Csaba Szepesvári Mengdi Wang 44 66 0 17 Feb 2021
Policy Mirror Descent for Reinforcement Learning: Linear Convergence, New Sampling Complexity, and Generalized Problem Classes Guanghui Lan 87 135 0 30 Jan 2021