2306.16394
Sharper Model-free Reinforcement Learning for Average-reward Markov Decision Processes
28 June 2023
Zihan Zhang
Qiaomin Xie
OffRL
Papers citing "Sharper Model-free Reinforcement Learning for Average-reward Markov Decision Processes"
15 / 15 papers shown
Title
Kernel-Based Function Approximation for Average Reward Reinforcement Learning: An Optimist No-Regret Algorithm
Sattar Vakili
Julia Olkhovskaya
21
0
0
30 Oct 2024
Learning Infinite-Horizon Average-Reward Linear Mixture MDPs of Bounded Span
Woojin Chae
Kihyuk Hong
Yufan Zhang
Ambuj Tewari
Dabeen Lee
17
1
0
19 Oct 2024
Optimistic Q-learning for average reward and episodic reinforcement learning
Priyank Agrawal
Shipra Agrawal
35
3
0
18 Jul 2024
Achieving Tractable Minimax Optimal Regret in Average Reward MDPs
Victor Boone
Zihan Zhang
24
5
0
03 Jun 2024
Finding good policies in average-reward Markov Decision Processes without prior knowledge
Adrienne Tuynman
Rémy Degenne
Emilie Kaufmann
17
2
0
27 May 2024
Reinforcement Learning for Infinite-Horizon Average-Reward Linear MDPs via Approximation by Discounted-Reward MDPs
Kihyuk Hong
Yufan Zhang
Ambuj Tewari
Dabeen Lee
22
2
0
23 May 2024
Sample-efficient Learning of Infinite-horizon Average-reward MDPs with General Function Approximation
Jianliang He
Han Zhong
Zhuoran Yang
21
6
0
19 Apr 2024
Order-Optimal Regret with Novel Policy Gradient Approaches in Infinite-Horizon Average Reward MDPs
Swetha Ganesh
Washim Uddin Mondal
Vaneet Aggarwal
39
3
0
02 Apr 2024
Span-Based Optimal Sample Complexity for Weakly Communicating and General Average Reward MDPs
M. Zurek
Yudong Chen
13
3
0
18 Mar 2024
Provable Policy Gradient Methods for Average-Reward Markov Potential Games
Min Cheng
Ruida Zhou
P. R. Kumar
Chao Tian
46
2
0
09 Mar 2024
Span-Based Optimal Sample Complexity for Average Reward MDPs
M. Zurek
Yudong Chen
26
6
0
22 Nov 2023
Optimal Sample Complexity for Average Reward Markov Decision Processes
Shengbo Wang
Jose H. Blanchet
Peter Glynn
17
8
0
13 Oct 2023
Slowly Changing Adversarial Bandit Algorithms are Efficient for Discounted MDPs
Ian A. Kash
L. Reyzin
Zishun Yu
29
0
0
18 May 2022
Stochastic first-order methods for average-reward Markov decision processes
Tianjiao Li
Feiyang Wu
Guanghui Lan
9
13
0
11 May 2022
Model-free Reinforcement Learning in Infinite-horizon Average-reward Markov Decision Processes
Chen-Yu Wei
Mehdi Jafarnia-Jahromi
Haipeng Luo
Hiteshi Sharma
R. Jain
103
99
0
15 Oct 2019