2112.11622
An Alternate Policy Gradient Estimator for Softmax Policies
22 December 2021
Shivam Garg
Samuele Tosatto
Yangchen Pan
Martha White
A. R. Mahmood
Papers citing "An Alternate Policy Gradient Estimator for Softmax Policies"
2 / 2 papers shown
Title
Small steps no more: Global convergence of stochastic gradient bandits for arbitrary learning rates
Jincheng Mei
Bo Dai
Alekh Agarwal
Sharan Vaswani
Anant Raj
Csaba Szepesvári
Dale Schuurmans
87
0
0
11 Feb 2025
Examining Policy Entropy of Reinforcement Learning Agents for Personalization Tasks
Anton Dereventsov
Andrew Starnes
Clayton Webster
18
4
0
21 Nov 2022