Inverse Reward Design

8 November 2017
Dylan Hadfield-Menell
S. Milli
Pieter Abbeel
Stuart J. Russell
Anca Dragan

Papers citing "Inverse Reward Design"

50 / 265 papers shown
Task-Agnostic Exploration via Policy Gradient of a Non-Parametric State Entropy Estimate
Mirco Mutti
Lorenzo Pratissoli
Marcello Restelli
229
21
0
09 Jul 2020
Mitigating undesirable emergent behavior arising between driver and semi-automated vehicle
Timo Melman
N. Beckers
David A. Abbink
37
0
0
30 Jun 2020
Open Questions in Creating Safe Open-ended AI: Tensions Between Control and Creativity
IEEE Symposium on Artificial Life (AL), 2020
Adrien Ecoffet
Jeff Clune
Joel Lehman
167
16
0
12 Jun 2020
Avoiding Side Effects in Complex Environments
Neural Information Processing Systems (NeurIPS), 2020
Alexander Matt Turner
Neale Ratzlaff
Prasad Tadepalli
308
38
0
11 Jun 2020
Emergent Real-World Robotic Skills via Unsupervised Off-Policy Reinforcement Learning
Archit Sharma
Michael Ahn
Sergey Levine
Vikash Kumar
Karol Hausman
S. Gu
SSL, OffRL
130
54
0
27 Apr 2020
Weakly-Supervised Reinforcement Learning for Controllable Behavior
Neural Information Processing Systems (NeurIPS), 2020
Lisa Lee
Benjamin Eysenbach
Ruslan Salakhutdinov
S. Gu
Chelsea Finn
SSL
242
26
0
06 Apr 2020
Improving Confidence in the Estimation of Values and Norms
Luciano Cavalcante Siebert
Rijk Mercuur
Virginia Dignum
J. van den Hoven
Catholijn M. Jonker
92
0
0
02 Apr 2020
An empirical investigation of the challenges of real-world reinforcement learning
Gabriel Dulac-Arnold
Nir Levine
D. Mankowitz
Jerry Li
Cosmin Paduraru
Sven Gowal
Todd Hester
OffRL
365
130
0
24 Mar 2020
Rewriting History with Inverse RL: Hindsight Inference for Policy Improvement
Neural Information Processing Systems (NeurIPS), 2020
Benjamin Eysenbach
Xinyang Geng
Sergey Levine
Ruslan Salakhutdinov
OffRL
252
92
0
25 Feb 2020
Safe Imitation Learning via Fast Bayesian Reward Inference from Preferences
International Conference on Machine Learning (ICML), 2020
Daniel S. Brown
Russell Coleman
R. Srinivasan
S. Niekum
BDL
408
110
0
21 Feb 2020
Reward-rational (implicit) choice: A unifying formalism for reward learning
Neural Information Processing Systems (NeurIPS), 2020
Hong Jun Jeon
S. Milli
Anca Dragan
354
194
0
12 Feb 2020
Quantifying Hypothesis Space Misspecification in Learning from Human-Robot Demonstrations and Physical Corrections
IEEE Transactions on Robotics (IEEE Trans. Robot.), 2020
Andreea Bobu
Andrea V. Bajcsy
J. F. Fisac
Sampada Deglurkar
Anca Dragan
169
42
0
03 Feb 2020
Point-Based Methods for Model Checking in Partially Observable Markov Decision Processes
AAAI Conference on Artificial Intelligence (AAAI), 2020
Maxime Bouton
Jana Tumova
Mykel J. Kochenderfer
175
32
0
11 Jan 2020
Towards Practical Multi-Object Manipulation using Relational Reinforcement Learning
IEEE International Conference on Robotics and Automation (ICRA), 2019
R. Li
Allan Jabri
Trevor Darrell
Pulkit Agrawal
OffRL
142
118
0
23 Dec 2019
Relational Mimic for Visual Adversarial Imitation Learning
Lionel Blondé
Yichuan Tang
Jian Zhang
Russ Webb
113
0
0
18 Dec 2019
Doubly Robust Off-Policy Actor-Critic Algorithms for Reinforcement Learning
Riashat Islam
Raihan Seraj
Samin Yeasar Arnob
Doina Precup
OffRL
146
3
0
11 Dec 2019
Unsupervised Curricula for Visual Meta-Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2019
Allan Jabri
Kyle Hsu
Benjamin Eysenbach
Abhishek Gupta
Sergey Levine
Chelsea Finn
VLM, OOD, SSL, OffRL
165
66
0
09 Dec 2019
Learning Human Objectives by Evaluating Hypothetical Behavior
International Conference on Machine Learning (ICML), 2019
S. Reddy
Anca Dragan
Sergey Levine
Shane Legg
Jan Leike
238
78
0
05 Dec 2019
Rationally Inattentive Inverse Reinforcement Learning Explains YouTube Commenting Behavior
Journal of Machine Learning Research (JMLR), 2019
William Hoiles
Vikram Krishnamurthy
Kunal Pattanayak
CML
214
27
0
24 Oct 2019
Reward Tampering Problems and Solutions in Reinforcement Learning: A Causal Influence Diagram Perspective
Tom Everitt
Marcus Hutter
Ramana Kumar
Victoria Krakovna
418
121
0
13 Aug 2019
Multi-Agent Adversarial Inverse Reinforcement Learning
International Conference on Machine Learning (ICML), 2019
Lantao Yu
Jiaming Song
Stefano Ermon
326
159
0
30 Jul 2019
Towards Empathic Deep Q-Learning
Bart Bussmann
Jacqueline Heinerman
Joel Lehman
AI4CE
145
12
0
26 Jun 2019
Evolutionary Computation and AI Safety: Research Problems Impeding Routine and Safe Real-world Application of Evolution
Genetic Programming Theory and Practice (GPTP), 2019
Joel Lehman
137
7
0
24 Jun 2019
On the Feasibility of Learning, Rather than Assuming, Human Biases for Reward Inference
International Conference on Machine Learning (ICML), 2019
Rohin Shah
Noah Gundotra
Pieter Abbeel
Anca Dragan
151
74
0
23 Jun 2019
Learning Reward Functions by Integrating Human Demonstrations and Preferences
Malayandi Palan
Nicholas C. Landolfi
Gleb Shevchuk
Dorsa Sadigh
142
139
0
21 Jun 2019
Planning With Uncertain Specifications (PUnS)
IEEE Robotics and Automation Letters (RA-L), 2019
Ankit J. Shah
Shen Li
J. Shah
186
25
0
07 Jun 2019
Sequence Modeling of Temporal Credit Assignment for Episodic Reinforcement Learning
Yang Liu
Yunan Luo
Yuanyi Zhong
Xi Chen
Qiang Liu
Jian-wei Peng
182
40
0
31 May 2019
Defining Admissible Rewards for High Confidence Policy Evaluation
ACM Conference on Health, Inference, and Learning (CHIL), 2019
Niranjani Prasad
Barbara E. Engelhardt
Finale Doshi-Velez
OffRL
140
7
0
30 May 2019
Minimizing the Negative Side Effects of Planning with Reduced Models
Sandhya Saisubramanian
S. Zilberstein
71
0
0
22 May 2019
Challenges of Real-World Reinforcement Learning
Gabriel Dulac-Arnold
D. Mankowitz
Todd Hester
OffRL
303
618
0
29 Apr 2019
Energy-Based Continuous Inverse Optimal Control
Yifei Xu
Jianwen Xie
Tianyang Zhao
Chris L. Baker
Yibiao Zhao
Ying Nian Wu
488
20
0
10 Apr 2019
Deep Imitation Learning for Autonomous Driving in Generic Urban Scenarios with Enhanced Safety
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2019
Jianyu Chen
Bodi Yuan
Masayoshi Tomizuka
155
147
0
02 Mar 2019
Conservative Agency via Attainable Utility Preservation
Alexander Matt Turner
Dylan Hadfield-Menell
Prasad Tadepalli
373
53
0
26 Feb 2019
Parenting: Safe Reinforcement Learning from Human Input
Christopher Frye
Ilya Feige
149
9
0
18 Feb 2019
Preferences Implicit in the State of the World
International Conference on Learning Representations (ICLR), 2019
Rohin Shah
Dmitrii Krasheninnikov
Jordan Alexander
Pieter Abbeel
Anca Dragan
270
57
0
12 Feb 2019
Deep Reinforcement Learning for Multi-Agent Systems: A Review of Challenges, Solutions and Applications
Thanh Thi Nguyen
Ngoc Duy Nguyen
S. Nahavandi
318
939
0
31 Dec 2018
Estimating Rationally Inattentive Utility Functions with Deep Clustering for Framing - Applications in YouTube Engagement Dynamics
William Hoiles
Vikram Krishnamurthy
72
0
0
23 Dec 2018
Scalable agent alignment via reward modeling: a research direction
Jan Leike
David M. Krueger
Tom Everitt
Miljan Martic
Vishal Maini
Shane Legg
378
527
0
19 Nov 2018
Legible Normativity for AI Alignment: The Value of Silly Rules
Dylan Hadfield-Menell
Mckane Andrus
Gillian K. Hadfield
112
20
0
03 Nov 2018
Deep Reinforcement Learning
Yuxi Li
VLM, OffRL
367
143
0
15 Oct 2018
Learning under Misspecified Objective Spaces
Andreea Bobu
Andrea V. Bajcsy
J. F. Fisac
Anca Dragan
293
32
0
11 Oct 2018
Reinforcement Learning with Perturbed Rewards
Jingkang Wang
Yang Liu
Yue Liu
NoLa
395
151
0
02 Oct 2018
Active Inverse Reward Design
Sören Mindermann
Rohin Shah
Adam Gleave
Dylan Hadfield-Menell
261
20
0
09 Sep 2018
Multi-Agent Generative Adversarial Imitation Learning
Jiaming Song
Hongyu Ren
Dorsa Sadigh
Stefano Ermon
GAN
203
249
0
26 Jul 2018
Safe Option-Critic: Learning Safety in the Option-Critic Architecture
Arushi Jain
Khimya Khetarpal
Doina Precup
254
28
0
21 Jul 2018
VFunc: a Deep Generative Model for Functions
Philip Bachman
Riashat Islam
Alessandro Sordoni
Zafarali Ahmed
VLM, BDL
151
8
0
11 Jul 2018
Towards Mixed Optimization for Reinforcement Learning with Program Synthesis
Surya Bhupatiraju
Kumar Krishna Agrawal
Rishabh Singh
137
8
0
01 Jul 2018
Simplifying Reward Design through Divide-and-Conquer
Ellis Ratner
Dylan Hadfield-Menell
Anca Dragan
149
30
0
07 Jun 2018
Including Uncertainty when Learning from Human Corrections
Dylan P. Losey
M. O'Malley
138
34
0
06 Jun 2018
Penalizing side effects using stepwise relative reachability
Victoria Krakovna
Laurent Orseau
Ramana Kumar
Miljan Martic
Shane Legg
287
59
0
04 Jun 2018