ResearchTrend.AI


Regret Analysis of a Markov Policy Gradient Algorithm for Multi-arm Bandits

20 July 2020
D. Denisov
N. Walton
arXiv: 2007.10229

Papers citing "Regret Analysis of a Markov Policy Gradient Algorithm for Multi-arm Bandits"

2 / 2 papers shown
Title
Small steps no more: Global convergence of stochastic gradient bandits for arbitrary learning rates
Jincheng Mei
Bo Dai
Alekh Agarwal
Sharan Vaswani
Anant Raj
Csaba Szepesvári
Dale Schuurmans
87
0
0
11 Feb 2025
The Role of Baselines in Policy Gradient Optimization
Jincheng Mei
Wesley Chung
Valentin Thomas
Bo Dai
Csaba Szepesvári
Dale Schuurmans
24
15
0
16 Jan 2023