Discounted Reinforcement Learning Is Not an Optimization Problem

4 October 2019
Abhishek Naik
R. Shariff
Niko Yasui
Hengshuai Yao
R. Sutton

Papers citing "Discounted Reinforcement Learning Is Not an Optimization Problem"

23 / 23 papers shown
Planning and Learning in Average Risk-aware MDPs
Weikai Wang
Erick Delage
55
0
0
22 Mar 2025
Average-Reward Reinforcement Learning with Entropy Regularization
Jacob Adamczyk
Volodymyr Makarenko
Stas Tiomkin
R. Kulkarni
OOD
61
2
0
17 Jan 2025
RVI-SAC: Average Reward Off-Policy Deep Reinforcement Learning
Yukinari Hisaki
Isao Ono
29
2
0
04 Aug 2024
Reward Centering
Abhishek Naik
Yi Wan
Manan Tomar
Richard S. Sutton
21
6
0
16 May 2024
Feint in Multi-Player Games
Junyu Liu
Wangkai Jin
Xiangjun Peng
OffRL
30
0
0
04 Mar 2024
Age-Based Scheduling for Mobile Edge Computing: A Deep Reinforcement Learning Approach
Xingqiu He
Chaoqun You
Tony Q.S. Quek
25
8
0
01 Dec 2023
Consistent Aggregation of Objectives with Diverse Time Preferences Requires Non-Markovian Rewards
Silviu Pitis
35
6
0
30 Sep 2023
Reward Function Design for Crowd Simulation via Reinforcement Learning
Ariel Kwiatkowski
Vicky Kalogeiton
Julien Pettré
Marie-Paule Cani
21
2
0
22 Sep 2023
Learning to Stabilize Online Reinforcement Learning in Unbounded State Spaces
Brahma S. Pavse
M. Zurek
Yudong Chen
Qiaomin Xie
Josiah P. Hanna
OffRL
33
1
0
02 Jun 2023
On the Convergence of Discounted Policy Gradient Methods
Chris Nota
18
0
0
28 Dec 2022
Posterior Sampling for Continuing Environments
Wanqiao Xu
Shi Dong
Benjamin Van Roy
10
2
0
29 Nov 2022
Challenges to Solving Combinatorially Hard Long-Horizon Deep RL Tasks
Andrew C. Li
Pashootan Vaezipoor
Rodrigo Toro Icarte
Sheila A. McIlraith
OffRL
LRM
19
4
0
03 Jun 2022
Influencing Long-Term Behavior in Multiagent Reinforcement Learning
Dong-Ki Kim
Matthew D Riemer
Miao Liu
Jakob N. Foerster
Michael Everett
Chuangchuang Sun
Gerald Tesauro
Jonathan P. How
24
0
0
07 Mar 2022
Continual Learning In Environments With Polynomial Mixing Times
Matthew D Riemer
Sharath Chandra Raparthy
Ignacio Cases
G. Subbaraj
M. P. Touzel
Irina Rish
CLL
41
8
0
13 Dec 2021
On-Policy Deep Reinforcement Learning for the Average-Reward Criterion
Yiming Zhang
Keith Ross
OffRL
33
40
0
14 Jun 2021
Lifetime policy reuse and the importance of task capacity
David M. Bossens
A. J. Sobey
CLL
OffRL
8
3
0
03 Jun 2021
Planning with Expectation Models for Control
Katya Kudashkina
Yi Wan
Abhishek Naik
R. Sutton
OffRL
20
0
0
17 Apr 2021
Optimizing the Long-Term Average Reward for Continuing MDPs: A Technical Report
Chao Xu
Yiping Xie
Xijun Wang
H. Yang
Dusit Niyato
Tony Q.S. Quek
22
3
0
13 Apr 2021
Batch Policy Learning in Average Reward Markov Decision Processes
Peng Liao
Zhengling Qi
Runzhe Wan
P. Klasnja
S. Murphy
OffRL
34
81
0
23 Jul 2020
Temporal-Logic-Based Reward Shaping for Continuing Reinforcement Learning Tasks
Yuqian Jiang
Suda Bharadwaj
Bo Wu
Rishi Shah
Ufuk Topcu
Peter Stone
CLL
OffRL
LRM
9
41
0
03 Jul 2020
Learning and Planning in Average-Reward Markov Decision Processes
Yi Wan
A. Naik
R. Sutton
OffRL
8
55
0
29 Jun 2020
A reinforcement learning approach to rare trajectory sampling
Dominic C. Rose
Jamie F. Mair
J. P. Garrahan
13
51
0
26 May 2020
Data-driven control of micro-climate in buildings: an event-triggered reinforcement learning approach
A. H. Hosseinloo
Alexander Ryzhov
A. Bischi
H. Ouerdane
K. Turitsyn
M. Dahleh
AI4CE
14
41
0
28 Jan 2020