ResearchTrend.AI
An Alternate Policy Gradient Estimator for Softmax Policies

22 December 2021
Shivam Garg
Samuele Tosatto
Yangchen Pan
Martha White
A. R. Mahmood

Papers citing "An Alternate Policy Gradient Estimator for Softmax Policies"

2 / 2 papers shown
Small steps no more: Global convergence of stochastic gradient bandits for arbitrary learning rates
Jincheng Mei
Bo Dai
Alekh Agarwal
Sharan Vaswani
Anant Raj
Csaba Szepesvári
Dale Schuurmans
87
0
0
11 Feb 2025
Examining Policy Entropy of Reinforcement Learning Agents for Personalization Tasks
Anton Dereventsov
Andrew Starnes
Clayton Webster
18
4
0
21 Nov 2022