ResearchTrend.AI
Deep reinforcement learning from human preferences

12 June 2017
Paul Christiano
Jan Leike
Tom B. Brown
Miljan Martic
Shane Legg
Dario Amodei
arXiv:1706.03741

Papers citing "Deep reinforcement learning from human preferences"

41 / 691 papers shown
Learning an Urban Air Mobility Encounter Model from Expert Preferences
Sydney M. Katz
Anne-Claire Le Bihan
Mykel J. Kochenderfer
27
17
0
12 Jul 2019
Learning to Interactively Learn and Assist
Mark P. Woodward
Chelsea Finn
Karol Hausman
27
33
0
24 Jun 2019
Using Human Ratings for Feedback Control: A Supervised Learning Approach with Application to Rehabilitation Robotics
Marcel Menner
Lukas Neuner
L. Lünenburger
Melanie Zeilinger
19
11
0
24 Jun 2019
Batch Active Learning Using Determinantal Point Processes
Erdem Biyik
Kenneth Wang
Nima Anari
Dorsa Sadigh
27
61
0
19 Jun 2019
End-to-End Robotic Reinforcement Learning without Reward Engineering
Avi Singh
Larry Yang
Kristian Hartikainen
Chelsea Finn
Sergey Levine
SSL
OffRL
46
266
0
16 Apr 2019
Multi-Preference Actor Critic
Ishan Durugkar
Matthew J. Hausknecht
Adith Swaminathan
Patrick MacAlpine
19
1
0
05 Apr 2019
Informed Machine Learning -- A Taxonomy and Survey of Integrating Knowledge into Learning Systems
Laura von Rueden
S. Mayer
Katharina Beckh
B. Georgiev
Sven Giesselbach
...
Rajkumar Ramamurthy
Michal Walczak
Jochen Garcke
Christian Bauckhage
Jannis Schuecker
39
626
0
29 Mar 2019
On the Pitfalls of Measuring Emergent Communication
Ryan J. Lowe
Jakob N. Foerster
Y-Lan Boureau
Joelle Pineau
Yann N. Dauphin
28
131
0
12 Mar 2019
Conservative Agency via Attainable Utility Preservation
Alexander Matt Turner
Dylan Hadfield-Menell
Prasad Tadepalli
27
49
0
26 Feb 2019
Learning to Generalize from Sparse and Underspecified Rewards
Rishabh Agarwal
Chen Liang
Dale Schuurmans
Mohammad Norouzi
OffRL
54
97
0
19 Feb 2019
Deep Reinforcement Learning from Policy-Dependent Human Feedback
Dilip Arumugam
Jun Ki Lee
S. Saskin
Michael L. Littman
28
94
0
12 Feb 2019
Risk-Aware Active Inverse Reinforcement Learning
Daniel S. Brown
Yuchen Cui
S. Niekum
27
58
0
08 Jan 2019
Deep Reinforcement Learning for Multi-Agent Systems: A Review of Challenges, Solutions and Applications
Thanh Thi Nguyen
Ngoc Duy Nguyen
S. Nahavandi
27
774
0
31 Dec 2018
Residual Reinforcement Learning for Robot Control
T. Johannink
Shikhar Bahl
Ashvin Nair
Jianlan Luo
Avinash Kumar
M. Loskyll
J. A. Ojea
Eugen Solowjow
Sergey Levine
OffRL
30
409
0
07 Dec 2018
Guiding Policies with Language via Meta-Learning
John D. Co-Reyes
Abhishek Gupta
Suvansh Sanjeev
Nick Altieri
Jacob Andreas
John DeNero
Pieter Abbeel
Sergey Levine
LM&Ro
26
63
0
19 Nov 2018
Scalable agent alignment via reward modeling: a research direction
Jan Leike
David M. Krueger
Tom Everitt
Miljan Martic
Vishal Maini
Shane Legg
34
397
0
19 Nov 2018
Towards Governing Agent's Efficacy: Action-Conditional β-VAE for Deep Transparent Reinforcement Learning
John Yang
Gyujeong Lee
Minsung Hyun
Simyung Chang
Nojun Kwak
29
3
0
11 Nov 2018
Supervising strong learners by amplifying weak experts
Paul Christiano
Buck Shlegeris
Dario Amodei
27
114
0
19 Oct 2018
BabyAI: A Platform to Study the Sample Efficiency of Grounded Language Learning
Maxime Chevalier-Boisvert
Dzmitry Bahdanau
Salem Lahlou
Lucas Willems
Chitwan Saharia
Thien Huu Nguyen
Yoshua Bengio
ELM
33
232
0
18 Oct 2018
Learning under Misspecified Objective Spaces
Andreea Bobu
Andrea V. Bajcsy
J. F. Fisac
Anca Dragan
16
30
0
11 Oct 2018
Batch Active Preference-Based Learning of Reward Functions
Erdem Biyik
Dorsa Sadigh
22
108
0
10 Oct 2018
Few-Shot Goal Inference for Visuomotor Learning and Planning
Annie Xie
Avi Singh
Sergey Levine
Chelsea Finn
OffRL
40
71
0
30 Sep 2018
Emergence of Human-comparable Balancing Behaviors by Deep Reinforcement Learning
Chuanyu Yang
Taku Komura
Zhibin Li
29
20
0
06 Sep 2018
APRIL: Interactively Learning to Summarise by Combining Active Preference Learning and Reinforcement Learning
Yang Gao
Christian M. Meyer
Iryna Gurevych
21
34
0
29 Aug 2018
Multi-Agent Deep Reinforcement Learning with Human Strategies
Thanh Nguyen
Ngoc Duy Nguyen
S. Nahavandi
27
12
0
12 Jun 2018
Learning to Understand Goal Specifications by Modelling Reward
Dzmitry Bahdanau
Felix Hill
Jan Leike
Edward Hughes
Seyedarian Hosseini
Pushmeet Kohli
Edward Grefenstette
24
157
0
05 Jun 2018
Human-in-the-Loop Interpretability Prior
Isaac Lage
A. Ross
Been Kim
S. Gershman
Finale Doshi-Velez
32
120
0
29 May 2018
Reliability and Learnability of Human Bandit Feedback for Sequence-to-Sequence Reinforcement Learning
Julia Kreutzer
Joshua Uyheng
Stefan Riezler
30
85
0
27 May 2018
Learning Self-Imitating Diverse Policies
Tanmay Gangwani
Qiang Liu
Jian Peng
27
65
0
25 May 2018
Reward Estimation for Variance Reduction in Deep Reinforcement Learning
Joshua Romoff
Peter Henderson
Alexandre Piché
Vincent François-Lavet
Joelle Pineau
6
42
0
09 May 2018
AGI Safety Literature Review
Tom Everitt
G. Lea
Marcus Hutter
AI4CE
36
115
0
03 May 2018
Customized Image Narrative Generation via Interactive Visual Question Generation and Answering
Andrew Shin
Yoshitaka Ushiku
Tatsuya Harada
44
7
0
27 Apr 2018
Hierarchical Imitation and Reinforcement Learning
Hoang Minh Le
Nan Jiang
Alekh Agarwal
Miroslav Dudík
Yisong Yue
Hal Daumé
22
190
0
01 Mar 2018
Evolved Policy Gradients
Rein Houthooft
Richard Y. Chen
Phillip Isola
Bradly C. Stadie
Filip Wolski
Jonathan Ho
Pieter Abbeel
49
227
0
13 Feb 2018
Learning from Richer Human Guidance: Augmenting Comparison-Based Learning with Feature Queries
Chandrayee Basu
M. Singhal
Anca Dragan
28
57
0
05 Feb 2018
AI Safety Gridworlds
Jan Leike
Miljan Martic
Victoria Krakovna
Pedro A. Ortega
Tom Everitt
Andrew Lefrancq
Laurent Orseau
Shane Legg
29
250
0
27 Nov 2017
Improving image generative models with human interactions
Andrew Kyle Lampinen
David R. So
Douglas Eck
Fred Bertsch
GAN
17
3
0
29 Sep 2017
Deep TAMER: Interactive Agent Shaping in High-Dimensional State Spaces
Garrett A. Warnell
Nicholas R. Waytowich
Vernon J. Lawhern
Peter Stone
13
266
0
28 Sep 2017
Explore, Exploit or Listen: Combining Human Feedback and Policy Model to Speed up Deep Reinforcement Learning in 3D Worlds
Zhiyu Lin
Brent Harrison
A. Keech
Mark O. Riedl
17
37
0
12 Sep 2017
Trial without Error: Towards Safe Reinforcement Learning via Human Intervention
William Saunders
Girish Sastry
Andreas Stuhlmuller
Owain Evans
OffRL
24
229
0
17 Jul 2017
Deep Reinforcement Learning: An Overview
Yuxi Li
OffRL
VLM
104
1,503
0
25 Jan 2017