
Towards Global Optimality for Practical Average Reward Reinforcement Learning without Mixing Time Oracles

18 March 2024
Bhrij Patel
Wesley A. Suttle
Alec Koppel
Vaneet Aggarwal
Brian M. Sadler
Amrit Singh Bedi
Dinesh Manocha

Papers citing "Towards Global Optimality for Practical Average Reward Reinforcement Learning without Mixing Time Oracles"

4 / 4 papers shown
Order-Optimal Regret with Novel Policy Gradient Approaches in Infinite-Horizon Average Reward MDPs
Swetha Ganesh
Washim Uddin Mondal
Vaneet Aggarwal
39
3
0
02 Apr 2024
Adapting to Mixing Time in Stochastic Optimization with Markovian Data
Ron Dorfman
Kfir Y. Levy
30
28
0
09 Feb 2022
On the Hidden Biases of Policy Mirror Ascent in Continuous Action Spaces
Amrit Singh Bedi
Souradip Chakraborty
Anjaly Parayil
Brian M. Sadler
Pratap Tokekar
Alec Koppel
36
17
0
28 Jan 2022
On the Sample Complexity of Actor-Critic Method for Reinforcement Learning with Function Approximation
Harshat Kumar
Alec Koppel
Alejandro Ribeiro
99
78
0
18 Oct 2019