arXiv:2403.11925
Towards Global Optimality for Practical Average Reward Reinforcement Learning without Mixing Time Oracles
18 March 2024
Bhrij Patel
Wesley A. Suttle
Alec Koppel
Vaneet Aggarwal
Brian M. Sadler
Amrit Singh Bedi
Dinesh Manocha
Papers citing "Towards Global Optimality for Practical Average Reward Reinforcement Learning without Mixing Time Oracles"
4 / 4 papers shown
Order-Optimal Regret with Novel Policy Gradient Approaches in Infinite-Horizon Average Reward MDPs
Swetha Ganesh
Washim Uddin Mondal
Vaneet Aggarwal
39
3
0
02 Apr 2024
Adapting to Mixing Time in Stochastic Optimization with Markovian Data
Ron Dorfman
Kfir Y. Levy
30
28
0
09 Feb 2022
On the Hidden Biases of Policy Mirror Ascent in Continuous Action Spaces
Amrit Singh Bedi
Souradip Chakraborty
Anjaly Parayil
Brian M. Sadler
Pratap Tokekar
Alec Koppel
36
17
0
28 Jan 2022
On the Sample Complexity of Actor-Critic Method for Reinforcement Learning with Function Approximation
Harshat Kumar
Alec Koppel
Alejandro Ribeiro
99
78
0
18 Oct 2019