ResearchTrend.AI
Learning and Planning in Average-Reward Markov Decision Processes

29 June 2020
Yi Wan
A. Naik
R. Sutton
OffRL

Papers citing "Learning and Planning in Average-Reward Markov Decision Processes"

11 / 11 papers shown
Reinforcement Teaching
Alex Lewandowski
Calarina Muslimani
Dale Schuurmans
Matthew E. Taylor
Jun Luo
81
1
0
28 Jan 2025
An Empirical Study of Deep Reinforcement Learning in Continuing Tasks
Yi Wan
D. Korenkevych
Zheqing Zhu
OffRL
CLL
52
0
0
12 Jan 2025
Stochastic Halpern iteration in normed spaces and applications to reinforcement learning
Mario Bravo
Juan Pablo Contreras
48
3
0
19 Mar 2024
Full Gradient Deep Reinforcement Learning for Average-Reward Criterion
Tejas Pagare
Vivek Borkar
Konstantin Avrachenkov
29
4
0
07 Apr 2023
Performance Bounds for Policy-Based Average Reward Reinforcement Learning Algorithms
Yashaswini Murthy
Mehrdad Moharrami
R. Srikant
OffRL
32
5
0
02 Feb 2023
ACPO: A Policy Optimization Algorithm for Average MDPs with Constraints
Akhil Agnihotri
R. Jain
Haipeng Luo
26
2
0
02 Feb 2023
Robust Average-Reward Markov Decision Processes
Yue Wang
Alvaro Velasquez
George Atia
Ashley Prater-Bennette
Shaofeng Zou
39
12
0
02 Jan 2023
On Convergence of Average-Reward Off-Policy Control Algorithms in Weakly Communicating MDPs
Yi Wan
R. Sutton
9
3
0
30 Sep 2022
Continual Learning In Environments With Polynomial Mixing Times
Matthew D Riemer
Sharath Chandra Raparthy
Ignacio Cases
G. Subbaraj
M. P. Touzel
Irina Rish
CLL
41
8
0
13 Dec 2021
On-Policy Deep Reinforcement Learning for the Average-Reward Criterion
Yiming Zhang
Keith Ross
OffRL
33
40
0
14 Jun 2021
Average-Reward Reinforcement Learning with Trust Region Methods
Xiaoteng Ma
Xiao-Jing Tang
Li Xia
Jun Yang
Qianchuan Zhao
24
16
0
07 Jun 2021