1706.03741
Deep reinforcement learning from human preferences
12 June 2017
Paul Christiano
Jan Leike
Tom B. Brown
Miljan Martic
Shane Legg
Dario Amodei
Papers citing
"Deep reinforcement learning from human preferences"
50 / 691 papers shown
Offline Meta-Reinforcement Learning with Online Self-Supervision
Vitchyr H. Pong
Ashvin Nair
Laura M. Smith
Catherine Huang
Sergey Levine
OffRL
34
66
0
08 Jul 2021
The MineRL BASALT Competition on Learning from Human Feedback
Rohin Shah
Cody Wild
Steven H. Wang
Neel Alex
Brandon Houghton
...
Stephanie Milani
Nicholay Topin
Pieter Abbeel
Stuart J. Russell
Anca Dragan
38
31
0
05 Jul 2021
A Survey on Interactive Reinforcement Learning: Design Principles and Open Challenges
Christian Arzate Cruz
Takeo Igarashi
OffRL
17
94
0
27 May 2021
Make Bipedal Robots Learn How to Imitate
Vishal Kumar
Sinnu Susan Thomas
18
0
0
15 May 2021
Multi-Objective Controller Synthesis with Uncertain Human Preferences
Shenghui Chen
Kayla Boggess
David Parker
Lu Feng
15
1
0
10 May 2021
Preference learning along multiple criteria: A game-theoretic perspective
Kush S. Bhatia
A. Pananjady
Peter L. Bartlett
Anca Dragan
Martin J. Wainwright
32
13
0
05 May 2021
Adapting CRISP-DM for Idea Mining: A Data Mining Process for Generating Ideas Using a Textual Dataset
Ion Dronic
23
55
0
02 May 2021
Revisiting Citizen Science Through the Lens of Hybrid Intelligence
J. Rafner
M. Gajdacz
Gitte Kragh
A. Hjorth
A. Gander
...
J. Miller
Dominik Dellerman
M. Haklay
Pietro Michelucci
J. Sherson
16
13
0
30 Apr 2021
Training Value-Aligned Reinforcement Learning Agents Using a Normative Prior
Md Sultan al Nahian
Spencer Frazier
Brent Harrison
Mark O. Riedl
27
18
0
19 Apr 2021
Model Learning with Personalized Interpretability Estimation (ML-PIE)
M. Virgolin
A. D. Lorenzo
Francesca Randone
Eric Medvet
M. Wahde
24
30
0
13 Apr 2021
SkiffOS: Minimal Cross-compiled Linux for Embedded Containers
Christian Stewart
36
8
0
31 Mar 2021
Unsupervised Feature Learning for Manipulation with Contrastive Domain Randomization
Carmel Rabinovitz
Niko A. Grupen
Aviv Tamar
OOD
SSL
28
3
0
20 Mar 2021
Self-Supervised Online Reward Shaping in Sparse-Reward Environments
F. Memarian
Wonjoon Goo
Rudolf Lioutikov
S. Niekum
Ufuk Topcu
OffRL
34
48
0
08 Mar 2021
Preference-based Learning of Reward Function Features
Sydney M. Katz
Amir Maleki
Erdem Biyik
Mykel J. Kochenderfer
33
11
0
03 Mar 2021
How to Train Your Robot with Deep Reinforcement Learning; Lessons We've Learned
Julian Ibarz
Jie Tan
Chelsea Finn
Mrinal Kalakrishnan
P. Pastor
Sergey Levine
OffRL
16
517
0
04 Feb 2021
Open Problems in Cooperative AI
Allan Dafoe
Edward Hughes
Yoram Bachrach
Tantum Collins
Kevin R. McKee
Joel Z Leibo
Kate Larson
T. Graepel
42
200
0
15 Dec 2020
Understanding Learned Reward Functions
Eric J. Michaud
Adam Gleave
Stuart J. Russell
XAI
OffRL
30
33
0
10 Dec 2020
Inverse Constrained Reinforcement Learning
Usman Anwar
Shehryar Malik
Alireza Aghasi
Ali Ahmed
18
58
0
19 Nov 2020
Human-guided Robot Behavior Learning: A GAN-assisted Preference-based Reinforcement Learning Approach
Huixin Zhan
Feng Tao
Yongcan Cao
38
26
0
15 Oct 2020
A One-bit, Comparison-Based Gradient Estimator
HanQin Cai
Daniel McKenzie
W. Yin
Zhenliang Zhang
38
17
0
06 Oct 2020
Reward Machines: Exploiting Reward Function Structure in Reinforcement Learning
Rodrigo Toro Icarte
Toryn Q. Klassen
Richard Valenzano
Sheila A. McIlraith
OffRL
44
216
0
06 Oct 2020
Emergent Social Learning via Multi-agent Reinforcement Learning
Kamal Ndousse
Douglas Eck
Sergey Levine
Natasha Jaques
10
41
0
01 Oct 2020
Learning Rewards from Linguistic Feedback
T. Sumers
Mark K. Ho
Robert D. Hawkins
Karthik Narasimhan
Thomas Griffiths
40
51
0
30 Sep 2020
Learning to summarize from human feedback
Nisan Stiennon
Long Ouyang
Jeff Wu
Daniel M. Ziegler
Ryan J. Lowe
Chelsea Voss
Alec Radford
Dario Amodei
Paul Christiano
ALM
62
2,002
0
02 Sep 2020
Maximizing BCI Human Feedback using Active Learning
Zizhao Wang
Junyao Shi
Iretiayo Akinola
Peter K. Allen
24
8
0
11 Aug 2020
Sequential Motion Planning for Bipedal Somersault via Flywheel SLIP and Momentum Transmission with Task Space Control
Xiaobin Xiong
Aaron D. Ames
26
13
0
06 Aug 2020
Aligning AI With Shared Human Values
Dan Hendrycks
Collin Burns
Steven Basart
Andrew Critch
Jingkai Li
D. Song
Jacob Steinhardt
63
522
0
05 Aug 2020
Weak Human Preference Supervision For Deep Reinforcement Learning
Zehong Cao
Kaichiu Wong
Chin-Teng Lin
16
5
0
25 Jul 2020
Accelerating Reinforcement Learning Agent with EEG-based Implicit Human Feedback
Duo Xu
Mohit Agarwal
Ekansh Gupta
Faramarz Fekri
Raghupathy Sivakumar
OffRL
30
11
0
30 Jun 2020
Widening the Pipeline in Human-Guided Reinforcement Learning with Explanation and Context-Aware Data Augmentation
L. Guan
Mudit Verma
Sihang Guo
Ruohan Zhang
Subbarao Kambhampati
43
42
0
26 Jun 2020
Feature Expansive Reward Learning: Rethinking Human Input
Andreea Bobu
Marius Wiggert
Claire Tomlin
Anca Dragan
27
44
0
23 Jun 2020
Preference-based Reinforcement Learning with Finite-Time Guarantees
Yichong Xu
Ruosong Wang
Lin F. Yang
Aarti Singh
A. Dubrawski
33
53
0
16 Jun 2020
Avoiding Side Effects in Complex Environments
Alexander Matt Turner
Neale Ratzlaff
Prasad Tadepalli
30
34
0
11 Jun 2020
AI Research Considerations for Human Existential Safety (ARCHES)
Andrew Critch
David M. Krueger
30
50
0
30 May 2020
Weakly-Supervised Reinforcement Learning for Controllable Behavior
Lisa Lee
Benjamin Eysenbach
Ruslan Salakhutdinov
S. Gu
Chelsea Finn
SSL
22
26
0
06 Apr 2020
Socially-Aware Robot Planning via Bandit Human Feedback
Xusheng Luo
Yan Zhang
Michael M. Zavlanos
11
17
0
02 Mar 2020
Safe Imitation Learning via Fast Bayesian Reward Inference from Preferences
Daniel S. Brown
Russell Coleman
R. Srinivasan
S. Niekum
BDL
32
101
0
21 Feb 2020
Reward-rational (implicit) choice: A unifying formalism for reward learning
Hong Jun Jeon
S. Milli
Anca Dragan
17
176
0
12 Feb 2020
Quantifying Hypothesis Space Misspecification in Learning from Human-Robot Demonstrations and Physical Corrections
Andreea Bobu
Andrea V. Bajcsy
J. F. Fisac
Sampada Deglurkar
Anca Dragan
30
41
0
03 Feb 2020
Model Inversion Networks for Model-Based Optimization
Aviral Kumar
Sergey Levine
OffRL
35
93
0
31 Dec 2019
A Survey of Deep Reinforcement Learning in Video Games
Kun Shao
Zhentao Tang
Yuanheng Zhu
Nannan Li
Dongbin Zhao
OffRL
AI4TS
43
188
0
23 Dec 2019
Efficient Parameter Sampling for Neural Network Construction
Drimik Roy Chowdhury
M. F. Kasim
BDL
27
2
0
22 Dec 2019
AVID: Learning Multi-Stage Tasks via Pixel-Level Translation of Human Videos
Laura M. Smith
Nikita Dhawan
Marvin Zhang
Pieter Abbeel
Sergey Levine
41
156
0
10 Dec 2019
SafeLife 1.0: Exploring Side Effects in Complex Environments
Carroll L. Wainwright
P. Eckersley
27
12
0
03 Dec 2019
Assistive Gym: A Physics Simulation Framework for Assistive Robotics
Zackory M. Erickson
Vamsee Gangaram
Ariel Kapusta
Chenxi Liu
Charles C. Kemp
14
109
0
10 Oct 2019
Asking Easy Questions: A User-Friendly Approach to Active Reward Learning
Erdem Biyik
Malayandi Palan
Nicholas C. Landolfi
Dylan P. Losey
Dorsa Sadigh
24
113
0
10 Oct 2019
Scaling data-driven robotics with reward sketching and batch reinforcement learning
Serkan Cabi
Sergio Gomez Colmenarejo
Alexander Novikov
Ksenia Konyushkova
Scott E. Reed
...
David Barker
Jonathan Scholz
Misha Denil
Nando de Freitas
Ziyun Wang
OffRL
28
29
0
26 Sep 2019
Leveraging Human Guidance for Deep Reinforcement Learning Tasks
Ruohan Zhang
F. Torabi
L. Guan
D. Ballard
Peter Stone
19
87
0
21 Sep 2019
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
301
1,616
0
18 Sep 2019
Dynamical Distance Learning for Semi-Supervised and Unsupervised Skill Discovery
Kristian Hartikainen
Xinyang Geng
Tuomas Haarnoja
Sergey Levine
SSL
40
74
0
18 Jul 2019