arXiv:2002.04833
Reward-rational (implicit) choice: A unifying formalism for reward learning
12 February 2020
Hong Jun Jeon
S. Milli
Anca Dragan
Papers citing
"Reward-rational (implicit) choice: A unifying formalism for reward learning"
30 / 30 papers shown
Title
Optimal Interactive Learning on the Job via Facility Location Planning
Shivam Vats
Michelle Zhao
Patrick Callaghan
Mingxi Jia
Maxim Likhachev
Oliver Kroemer
George Konidaris
29
0
0
01 May 2025
Contextual Online Uncertainty-Aware Preference Learning for Human Feedback
Nan Lu
Ethan X. Fang
Junwei Lu
102
0
0
27 Apr 2025
Preference-Guided Reinforcement Learning for Efficient Exploration
Guojian Wang
Faguo Wu
Xiao Zhang
Tianyuan Chen
Xuyang Chen
Lin Zhao
38
0
0
09 Jul 2024
Pareto-Optimal Learning from Preferences with Hidden Context
Ryan Boldi
Li Ding
Lee Spector
S. Niekum
62
6
0
21 Jun 2024
A Generalized Acquisition Function for Preference-based Reward Learning
Evan Ellis
Gaurav R. Ghosal
Stuart J. Russell
Anca Dragan
Erdem Biyik
29
1
0
09 Mar 2024
BEDD: The MineRL BASALT Evaluation and Demonstrations Dataset for Training and Benchmarking Agents that Solve Fuzzy Tasks
Stephanie Milani
Anssi Kanervisto
Karolis Ramanauskas
Sander Schulhoff
Brandon Houghton
Rohin Shah
21
6
0
05 Dec 2023
Designing Fiduciary Artificial Intelligence
Sebastian Benthall
David Shekman
43
3
0
27 Jul 2023
Decision-Oriented Dialogue for Human-AI Collaboration
Jessy Lin
Nicholas Tomlin
Jacob Andreas
J. Eisner
LLMAG
18
26
0
31 May 2023
Benchmarks and Algorithms for Offline Preference-Based Reward Learning
Daniel Shin
Anca Dragan
Daniel S. Brown
OffRL
11
53
0
03 Jan 2023
SIRL: Similarity-based Implicit Representation Learning
Andreea Bobu
Yi Liu
Rohin Shah
Daniel S. Brown
Anca Dragan
SSL
DRL
14
17
0
02 Jan 2023
Learning Latent Representations to Co-Adapt to Humans
Sagar Parekh
Dylan P. Losey
18
12
0
19 Dec 2022
Tight Performance Guarantees of Imitator Policies with Continuous Actions
Davide Maran
Alberto Maria Metelli
Marcello Restelli
OffRL
15
4
0
07 Dec 2022
RISO: Combining Rigid Grippers with Soft Switchable Adhesives
Shaunak A. Mehta
Yeunhee Kim
Joshua Hoegerman
Michael D. Bartlett
Dylan P. Losey
11
4
0
27 Oct 2022
Environment Design for Inverse Reinforcement Learning
Thomas Kleine Buening
Victor Villin
Christos Dimitrakakis
30
1
0
26 Oct 2022
Law Informs Code: A Legal Informatics Approach to Aligning Artificial Intelligence with Humans
John J. Nay
ELM
AILaw
84
27
0
14 Sep 2022
Humans are not Boltzmann Distributions: Challenges and Opportunities for Modelling Human Feedback and Interaction in Reinforcement Learning
David Lindner
Mennatallah El-Assady
OffRL
25
16
0
27 Jun 2022
How to talk so AI will learn: Instructions, descriptions, and autonomy
T. Sumers
Robert D. Hawkins
Mark K. Ho
Thomas L. Griffiths
Dylan Hadfield-Menell
LM&Ro
26
20
0
16 Jun 2022
Self-critiquing models for assisting human evaluators
William Saunders
Catherine Yeh
Jeff Wu
Steven Bills
Ouyang Long
Jonathan Ward
Jan Leike
ALM
ELM
21
279
0
12 Jun 2022
Encouraging Human Interaction with Robot Teams: Legible and Fair Subtask Allocations
Soheil Habibian
Dylan P. Losey
11
10
0
06 May 2022
Inferring Rewards from Language in Context
Jessy Lin
Daniel Fried
Dan Klein
Anca Dragan
LM&Ro
19
54
0
05 Apr 2022
A Primer on Maximum Causal Entropy Inverse Reinforcement Learning
Adam Gleave
Sam Toyer
13
13
0
22 Mar 2022
A Ranking Game for Imitation Learning
Harshit S. Sikchi
Akanksha Saran
Wonjoon Goo
S. Niekum
OffRL
17
22
0
07 Feb 2022
Safe Deep RL in 3D Environments using Human Feedback
Matthew Rahtz
Vikrant Varma
Ramana Kumar
Zachary Kenton
Shane Legg
Jan Leike
24
4
0
20 Jan 2022
On the Expressivity of Markov Reward
David Abel
Will Dabney
A. Harutyunyan
Mark K. Ho
Michael L. Littman
Doina Precup
Satinder Singh
13
82
0
01 Nov 2021
Risk Averse Bayesian Reward Learning for Autonomous Navigation from Human Demonstration
Christian Ellis
Maggie B. Wigness
J. Rogers
Craig T. Lennon
L. Fiondella
65
6
0
31 Jul 2021
The MineRL BASALT Competition on Learning from Human Feedback
Rohin Shah
Cody Wild
Steven H. Wang
Neel Alex
Brandon Houghton
...
Stephanie Milani
Nicholay Topin
Pieter Abbeel
Stuart J. Russell
Anca Dragan
22
30
0
05 Jul 2021
Learning from an Exploring Demonstrator: Optimal Reward Estimation for Bandits
Wenshuo Guo
Kumar Krishna Agrawal
Aditya Grover
Vidya Muthukumar
A. Pananjady
11
8
0
28 Jun 2021
Uncertain Decisions Facilitate Better Preference Learning
Cassidy Laidlaw
Stuart J. Russell
25
10
0
19 Jun 2021
Open Problems in Cooperative AI
Allan Dafoe
Edward Hughes
Yoram Bachrach
Tantum Collins
Kevin R. McKee
Joel Z. Leibo
Kate Larson
T. Graepel
11
199
0
15 Dec 2020
Speaker-Follower Models for Vision-and-Language Navigation
Daniel Fried
Ronghang Hu
Volkan Cirik
Anna Rohrbach
Jacob Andreas
Louis-Philippe Morency
Taylor Berg-Kirkpatrick
Kate Saenko
Dan Klein
Trevor Darrell
LM&Ro
LRM
246
496
0
07 Jun 2018