1910.02140
Discounted Reinforcement Learning Is Not an Optimization Problem
4 October 2019
Abhishek Naik
R. Shariff
Niko Yasui
Hengshuai Yao
R. Sutton
Papers citing
"Discounted Reinforcement Learning Is Not an Optimization Problem"
23 / 23 papers shown
Title
Planning and Learning in Average Risk-aware MDPs
Weikai Wang
Erick Delage
55
0
0
22 Mar 2025
Average-Reward Reinforcement Learning with Entropy Regularization
Jacob Adamczyk
Volodymyr Makarenko
Stas Tiomkin
R. Kulkarni
OOD
61
2
0
17 Jan 2025
RVI-SAC: Average Reward Off-Policy Deep Reinforcement Learning
Yukinari Hisaki
Isao Ono
29
2
0
04 Aug 2024
Reward Centering
Abhishek Naik
Yi Wan
Manan Tomar
Richard S. Sutton
21
6
0
16 May 2024
Feint in Multi-Player Games
Junyu Liu
Wangkai Jin
Xiangjun Peng
OffRL
30
0
0
04 Mar 2024
Age-Based Scheduling for Mobile Edge Computing: A Deep Reinforcement Learning Approach
Xingqiu He
Chaoqun You
Tony Q.S. Quek
25
8
0
01 Dec 2023
Consistent Aggregation of Objectives with Diverse Time Preferences Requires Non-Markovian Rewards
Silviu Pitis
35
6
0
30 Sep 2023
Reward Function Design for Crowd Simulation via Reinforcement Learning
Ariel Kwiatkowski
Vicky Kalogeiton
Julien Pettré
Marie-Paule Cani
21
2
0
22 Sep 2023
Learning to Stabilize Online Reinforcement Learning in Unbounded State Spaces
Brahma S. Pavse
M. Zurek
Yudong Chen
Qiaomin Xie
Josiah P. Hanna
OffRL
33
1
0
02 Jun 2023
On the Convergence of Discounted Policy Gradient Methods
Chris Nota
18
0
0
28 Dec 2022
Posterior Sampling for Continuing Environments
Wanqiao Xu
Shi Dong
Benjamin Van Roy
10
2
0
29 Nov 2022
Challenges to Solving Combinatorially Hard Long-Horizon Deep RL Tasks
Andrew C. Li
Pashootan Vaezipoor
Rodrigo Toro Icarte
Sheila A. McIlraith
OffRL
LRM
19
4
0
03 Jun 2022
Influencing Long-Term Behavior in Multiagent Reinforcement Learning
Dong-Ki Kim
Matthew D Riemer
Miao Liu
Jakob N. Foerster
Michael Everett
Chuangchuang Sun
Gerald Tesauro
Jonathan P. How
24
0
0
07 Mar 2022
Continual Learning In Environments With Polynomial Mixing Times
Matthew D Riemer
Sharath Chandra Raparthy
Ignacio Cases
G. Subbaraj
M. P. Touzel
Irina Rish
CLL
41
8
0
13 Dec 2021
On-Policy Deep Reinforcement Learning for the Average-Reward Criterion
Yiming Zhang
Keith Ross
OffRL
33
40
0
14 Jun 2021
Lifetime policy reuse and the importance of task capacity
David M. Bossens
A. J. Sobey
CLL
OffRL
8
3
0
03 Jun 2021
Planning with Expectation Models for Control
Katya Kudashkina
Yi Wan
Abhishek Naik
R. Sutton
OffRL
20
0
0
17 Apr 2021
Optimizing the Long-Term Average Reward for Continuing MDPs: A Technical Report
Chao Xu
Yiping Xie
Xijun Wang
H. Yang
Dusit Niyato
Tony Q.S. Quek
22
3
0
13 Apr 2021
Batch Policy Learning in Average Reward Markov Decision Processes
Peng Liao
Zhengling Qi
Runzhe Wan
P. Klasnja
S. Murphy
OffRL
34
81
0
23 Jul 2020
Temporal-Logic-Based Reward Shaping for Continuing Reinforcement Learning Tasks
Yuqian Jiang
Suda Bharadwaj
Bo Wu
Rishi Shah
Ufuk Topcu
Peter Stone
CLL
OffRL
LRM
9
41
0
03 Jul 2020
Learning and Planning in Average-Reward Markov Decision Processes
Yi Wan
A. Naik
R. Sutton
OffRL
8
55
0
29 Jun 2020
A reinforcement learning approach to rare trajectory sampling
Dominic C. Rose
Jamie F. Mair
J. P. Garrahan
13
51
0
26 May 2020
Data-driven control of micro-climate in buildings: an event-triggered reinforcement learning approach
A. H. Hosseinloo
Alexander Ryzhov
A. Bischi
H. Ouerdane
K. Turitsyn
M. Dahleh
AI4CE
14
41
0
28 Jan 2020