ResearchTrend.AI
On the Expressivity of Markov Reward

1 November 2021
David Abel
Will Dabney
A. Harutyunyan
Mark K. Ho
Michael L. Littman
Doina Precup
Satinder Singh

Papers citing "On the Expressivity of Markov Reward"

50 / 50 papers shown
Causally Aligned Curriculum Learning
Mingxuan Li
Junzhe Zhang
Elias Bareinboim
CML
56
3
0
21 Mar 2025
Limits of specifiability for sensor-based robotic planning tasks
Basak Sakcak
Dylan A. Shell
J. O’Kane
29
0
0
07 Mar 2025
Rethinking Inverse Reinforcement Learning: from Data Alignment to Task Alignment
Weichao Zhou
Wenchao Li
26
0
0
31 Oct 2024
Reinforcement Learning with LTL and $ω$-Regular Objectives via Optimality-Preserving Translation to Average Rewards
Xuan-Bach Le
Dominik Wagner
Leon Witzman
Alexander Rabinovich
Luke Ong
9
2
0
16 Oct 2024
Directed Exploration in Reinforcement Learning from Linear Temporal Logic
Marco Bagatella
Andreas Krause
Georg Martius
OffRL
31
1
0
18 Aug 2024
Three Dogmas of Reinforcement Learning
David Abel
Mark K. Ho
A. Harutyunyan
36
5
0
15 Jul 2024
LTL-Constrained Policy Optimization with Cycle Experience Replay
Ameesh Shah
Cameron Voloshin
Chenxi Yang
Abhinav Verma
Swarat Chaudhuri
S. Seshia
29
1
0
17 Apr 2024
$L^*LM$: Learning Automata from Examples using Natural Language Oracles
Marcell Vazquez-Chanlatte
Karim Elmaaroufi
Stefan J. Witwicki
S. Seshia
19
4
0
10 Feb 2024
On the Limitations of Markovian Rewards to Express Multi-Objective, Risk-Sensitive, and Modal Tasks
Joar Skalse
Alessandro Abate
19
9
0
26 Jan 2024
Detecting Hidden Triggers: Mapping Non-Markov Reward Functions to Markov
Gregory Hyde
Eugene Santos
11
0
0
20 Jan 2024
Is Feedback All You Need? Leveraging Natural Language Feedback in Goal-Conditioned Reinforcement Learning
Sabrina McCallum
Max Taylor-Davies
Stefano V. Albrecht
Alessandro Suglia
21
1
0
07 Dec 2023
Signal Temporal Logic-Guided Apprenticeship Learning
Aniruddh Gopinath Puranic
Jyotirmoy V. Deshmukh
S. Nikolaidis
33
1
0
09 Nov 2023
Conditions on Preference Relations that Guarantee the Existence of Optimal Policies
Jonathan Colaco Carr
Prakash Panangaden
Doina Precup
26
1
0
03 Nov 2023
Social Contract AI: Aligning AI Assistants with Implicit Group Norms
Jan-Philipp Fränken
Sam Kwok
Peixuan Ye
Kanishk Gandhi
Dilip Arumugam
Jared Moore
Alex Tamkin
Tobias Gerstenberg
Noah D. Goodman
24
7
0
26 Oct 2023
$f$-Policy Gradients: A General Framework for Goal Conditioned RL using $f$-Divergences
Siddhant Agarwal
Ishan Durugkar
Peter Stone
Amy Zhang
31
4
0
10 Oct 2023
Consistent Aggregation of Objectives with Diverse Time Preferences Requires Non-Markovian Rewards
Silviu Pitis
35
5
0
30 Sep 2023
Motif: Intrinsic Motivation from Artificial Intelligence Feedback
Martin Klissarov
P. D'Oro
Shagun Sodhani
Roberta Raileanu
Pierre-Luc Bacon
Pascal Vincent
Amy Zhang
Mikael Henaff
LRM
LLMAG
26
54
0
29 Sep 2023
Submodular Reinforcement Learning
Manish Prajapat
Mojmír Mutný
M. Zeilinger
Andreas Krause
OffRL
26
12
0
25 Jul 2023
On the Expressivity of Multidimensional Markov Reward
Shuwa Miura
8
4
0
22 Jul 2023
A Definition of Continual Reinforcement Learning
David Abel
André Barreto
Benjamin Van Roy
Doina Precup
H. V. Hasselt
Satinder Singh
CLL
20
70
0
20 Jul 2023
Learning non-Markovian Decision-Making from State-only Sequences
Aoyang Qin
Feng Gao
Qing Li
Song-Chun Zhu
Sirui Xie
28
9
0
27 Jun 2023
Query-Policy Misalignment in Preference-Based Reinforcement Learning
Xiao Hu
Jianxiong Li
Xianyuan Zhan
Qing-Shan Jia
Ya-Qin Zhang
11
8
0
27 May 2023
A Reminder of its Brittleness: Language Reward Shaping May Hinder Learning for Instruction Following Agents
Sukai Huang
N. Lipovetzky
Trevor Cohn
30
2
0
26 May 2023
Learning Rewards to Optimize Global Performance Metrics in Deep Reinforcement Learning
Junqi Qian
Paul Weng
Chenmien Tan
34
1
0
16 Mar 2023
Eventual Discounting Temporal Logic Counterfactual Experience Replay
Cameron Voloshin
Abhinav Verma
Yisong Yue
OffRL
16
11
0
03 Mar 2023
The Provable Benefits of Unsupervised Data Sharing for Offline Reinforcement Learning
Haotian Hu
Yiqin Yang
Qianchuan Zhao
Chongjie Zhang
OffRL
6
4
0
27 Feb 2023
Safe Deep Reinforcement Learning by Verifying Task-Level Properties
Enrico Marchesini
Luca Marzari
Alessandro Farinelli
Chris Amato
OffRL
11
13
0
20 Feb 2023
A State Augmentation based approach to Reinforcement Learning from Human Preferences
Mudit Verma
Subbarao Kambhampati
25
2
0
17 Feb 2023
Aligning Robot and Human Representations
Andreea Bobu
Andi Peng
Pulkit Agrawal
Julie A. Shah
Anca D. Dragan
38
10
0
03 Feb 2023
Mind the Gap: Offline Policy Optimization for Imperfect Rewards
Jianxiong Li
Xiao Hu
Haoran Xu
Jingjing Liu
Xianyuan Zhan
Qing-Shan Jia
Ya-Qin Zhang
OffRL
33
19
0
03 Feb 2023
Settling the Reward Hypothesis
Michael H. Bowling
John D. Martin
David Abel
Will Dabney
LRM
28
29
0
20 Dec 2022
Efficient Meta Reinforcement Learning for Preference-based Fast Adaptation
Zhizhou Ren
Anji Liu
Yitao Liang
Jian-wei Peng
Jianzhu Ma
27
9
0
20 Nov 2022
Language Control Diffusion: Efficiently Scaling through Space, Time, and Tasks
Edwin Zhang
Yujie Lu
William Wang
Amy Zhang
DiffM
LM&Ro
24
16
0
27 Oct 2022
Defining and Characterizing Reward Hacking
Joar Skalse
Nikolaus H. R. Howe
Dmitrii Krasheninnikov
David M. Krueger
57
54
0
27 Sep 2022
Identifiability and generalizability from multiple experts in Inverse Reinforcement Learning
Paul Rolland
Luca Viano
Norman Schuerhoff
Boris Nikolov
V. Cevher
OffRL
37
13
0
22 Sep 2022
Minimum Description Length Control
Theodore H. Moskovitz
Ta-Chu Kao
M. Sahani
M. Botvinick
18
1
0
17 Jul 2022
Utility Theory for Sequential Decision Making
Mehran Shakerinava
Siamak Ravanbakhsh
27
7
0
27 Jun 2022
Guarantees for Epsilon-Greedy Reinforcement Learning with Function Approximation
Christoph Dann
Yishay Mansour
M. Mohri
Ayush Sekhari
Karthik Sridharan
16
49
0
19 Jun 2022
How to talk so AI will learn: Instructions, descriptions, and autonomy
T. Sumers
Robert D. Hawkins
Mark K. Ho
Thomas L. Griffiths
Dylan Hadfield-Menell
LM&Ro
26
20
0
16 Jun 2022
CoNSoLe: Convex Neural Symbolic Learning
Haoran Li
Yang Weng
Hanghang Tong
16
9
0
01 Jun 2022
Designing Rewards for Fast Learning
Henry Sowerby
Zhi-Hua Zhou
Michael L. Littman
16
13
0
30 May 2022
A Hierarchical Bayesian Approach to Inverse Reinforcement Learning with Symbolic Reward Machines
Weichao Zhou
Wenchao Li
BDL
15
11
0
20 Apr 2022
Learning Performance Graphs from Demonstrations via Task-Based Evaluations
Aniruddh Gopinath Puranic
Jyotirmoy V. Deshmukh
S. Nikolaidis
OffRL
24
5
0
12 Apr 2022
Reinforcement Learning in Practice: Opportunities and Challenges
Yuxi Li
OffRL
34
9
0
23 Feb 2022
Reward is not enough: can we liberate AI from the reinforcement learning paradigm?
Vacslav Glukhov
17
0
0
03 Feb 2022
Challenging Common Assumptions in Convex Reinforcement Learning
Mirco Mutti
Ric De Santi
Piersilvio De Bartolomeis
Marcello Restelli
OffRL
24
21
0
03 Feb 2022
Direct Behavior Specification via Constrained Reinforcement Learning
Julien Roy
Roger Girgis
Joshua Romoff
Pierre-Luc Bacon
C. Pal
9
33
0
22 Dec 2021
Demonstration Informed Specification Search
Marcell Vazquez-Chanlatte
Ameesh Shah
Gil Lederman
S. Seshia
21
3
0
20 Dec 2021
Learning Long-Term Reward Redistribution via Randomized Return Decomposition
Zhizhou Ren
Ruihan Guo
Yuanshuo Zhou
Jian-wei Peng
11
34
0
26 Nov 2021
Multi-Agent Reinforcement Learning with Temporal Logic Specifications
Lewis Hammond
Alessandro Abate
Julian Gutierrez
Michael Wooldridge
AI4CE
40
32
0
01 Feb 2021