arXiv: 1706.03741
Deep reinforcement learning from human preferences

12 June 2017
Paul Christiano
Jan Leike
Tom B. Brown
Miljan Martic
Shane Legg
Dario Amodei

Papers citing "Deep reinforcement learning from human preferences"

50 / 691 papers shown
Title
The Alignment Problem from a Deep Learning Perspective
Richard Ngo
Lawrence Chan
Sören Mindermann
68
183
0
30 Aug 2022
Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
Deep Ganguli
Liane Lovitt
John Kernion
Amanda Askell
Yuntao Bai
...
Nicholas Joseph
Sam McCandlish
C. Olah
Jared Kaplan
Jack Clark
234
449
0
23 Aug 2022
Comparison-based Conversational Recommender System with Relative Bandit Feedback
Zhihui Xie
Tong Yu
Canzhe Zhao
Shuai Li
25
39
0
21 Aug 2022
Calculus on MDPs: Potential Shaping as a Gradient
Erik Jenner
H. V. Hoof
Adam Gleave
22
4
0
20 Aug 2022
Transformers are Adaptable Task Planners
Vidhi Jain
Yixin Lin
Eric Undersander
Yonatan Bisk
Akshara Rai
25
24
0
06 Jul 2022
Humans are not Boltzmann Distributions: Challenges and Opportunities for Modelling Human Feedback and Interaction in Reinforcement Learning
David Lindner
Mennatallah El-Assady
OffRL
35
16
0
27 Jun 2022
A General Recipe for Likelihood-free Bayesian Optimization
Jiaming Song
Lantao Yu
Willie Neiswanger
Stefano Ermon
38
23
0
27 Jun 2022
Good Time to Ask: A Learning Framework for Asking for Help in Embodied Visual Navigation
Jenny Zhang
Samson Yu
Jiafei Duan
Cheston Tan
41
4
0
20 Jun 2022
How to talk so AI will learn: Instructions, descriptions, and autonomy
T. Sumers
Robert D. Hawkins
Mark K. Ho
Thomas Griffiths
Dylan Hadfield-Menell
LM&Ro
38
20
0
16 Jun 2022
Contrastive Learning as Goal-Conditioned Reinforcement Learning
Benjamin Eysenbach
Tianjun Zhang
Ruslan Salakhutdinov
Sergey Levine
SSL
OffRL
39
141
0
15 Jun 2022
Efficient Reward Poisoning Attacks on Online Deep Reinforcement Learning
Yinglun Xu
Qi Zeng
Gagandeep Singh
AAML
40
6
0
30 May 2022
Reward Uncertainty for Exploration in Preference-based Reinforcement Learning
Xinran Liang
Katherine Shu
Kimin Lee
Pieter Abbeel
21
58
0
24 May 2022
Towards Automated Document Revision: Grammatical Error Correction, Fluency Edits, and Beyond
Masato Mita
Keisuke Sakaguchi
Masato Hagiwara
Tomoya Mizumoto
Jun Suzuki
Kentaro Inui
48
15
0
23 May 2022
Learning Dense Reward with Temporal Variant Self-Supervision
Yuning Wu
Jieliang Luo
Hui Li
SSL
21
0
0
20 May 2022
Aligning Robot Representations with Humans
Andreea Bobu
Andi Peng
27
0
0
15 May 2022
Perspectives on Incorporating Expert Feedback into Model Updates
Valerie Chen
Umang Bhatt
Hoda Heidari
Adrian Weller
Ameet Talwalkar
37
11
0
13 May 2022
Adversarial Training for High-Stakes Reliability
Daniel M. Ziegler
Seraphina Nix
Lawrence Chan
Tim Bauman
Peter Schmidt-Nielsen
...
Noa Nabeshima
Benjamin Weinstein-Raun
D. Haas
Buck Shlegeris
Nate Thomas
AAML
38
59
0
03 May 2022
Counterfactual harm
Jonathan G. Richens
R. Beard
Daniel H. Thompson
31
27
0
27 Apr 2022
Mind the gap: Challenges of deep learning approaches to Theory of Mind
Jaan Aru
Aqeel Labash
Oriol Corcoll
Raul Vicente
28
26
0
30 Mar 2022
Uncertainty Estimation for Language Reward Models
Adam Gleave
G. Irving
UQLM
42
31
0
14 Mar 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
384
12,081
0
04 Mar 2022
X2T: Training an X-to-Text Typing Interface with Online Learning from User Feedback
Jensen Gao
S. Reddy
Glen Berseth
Nicholas Hardy
N. Natraj
K. Ganguly
Anca Dragan
Sergey Levine
23
10
0
04 Mar 2022
Reinforcement Learning in Practice: Opportunities and Challenges
Yuxi Li
OffRL
38
9
0
23 Feb 2022
A Ranking Game for Imitation Learning
Harshit S. Sikchi
Akanksha Saran
Wonjoon Goo
S. Niekum
OffRL
27
22
0
07 Feb 2022
ASHA: Assistive Teleoperation via Human-in-the-Loop Reinforcement Learning
S. Chen
Jensen Gao
S. Reddy
Glen Berseth
Anca Dragan
Sergey Levine
OffRL
36
11
0
05 Feb 2022
Knowledge-Integrated Informed AI for National Security
Anu Myne
Kevin J. Leahy
Ryan Soklaski
21
0
0
04 Feb 2022
Safe Deep RL in 3D Environments using Human Feedback
Matthew Rahtz
Vikrant Varma
Ramana Kumar
Zachary Kenton
Shane Legg
Jan Leike
32
4
0
20 Jan 2022
Inducing Structure in Reward Learning by Learning Features
Andreea Bobu
Marius Wiggert
Claire Tomlin
Anca Dragan
27
30
0
18 Jan 2022
The Effects of Reward Misspecification: Mapping and Mitigating Misaligned Models
Alexander Pan
Kush S. Bhatia
Jacob Steinhardt
53
172
0
10 Jan 2022
Creating Multimodal Interactive Agents with Imitation and Self-Supervised Learning
DeepMind Interactive Agents Team
Josh Abramson
Arun Ahuja
Arthur Brussee
Federico Carnevale
...
Tamara von Glehn
Greg Wayne
Nathaniel Wong
Chen Yan
Rui Zhu
LM&Ro
45
46
0
07 Dec 2021
Combining Learning from Human Feedback and Knowledge Engineering to Solve Hierarchical Tasks in Minecraft
Vinicius G. Goecks
Nicholas R. Waytowich
David Watkins
Bharat Prakash
13
7
0
07 Dec 2021
Dueling RL: Reinforcement Learning with Trajectory Preferences
Aldo Pacchiano
Aadirupa Saha
Jonathan Lee
33
82
0
08 Nov 2021
B-Pref: Benchmarking Preference-Based Reinforcement Learning
Kimin Lee
Laura M. Smith
Anca Dragan
Pieter Abbeel
OffRL
42
93
0
04 Nov 2021
RLDS: an Ecosystem to Generate, Share and Use Datasets in Reinforcement Learning
Sabela Ramos
Sertan Girgin
Léonard Hussenot
Damien Vincent
Hanna Yakubovich
...
Piotr Stańczyk
Raphaël Marinier
Jeremiah Harmsen
Olivier Pietquin
Nikola Momchev
OffRL
38
24
0
04 Nov 2021
On the Expressivity of Markov Reward
David Abel
Will Dabney
Anna Harutyunyan
Mark K. Ho
Michael L. Littman
Doina Precup
Satinder Singh
29
82
0
01 Nov 2021
Collaborating with Humans without Human Data
D. Strouse
Kevin R. McKee
M. Botvinick
Edward Hughes
Richard Everett
124
161
0
15 Oct 2021
A Framework for Learning to Request Rich and Contextually Useful Information from Humans
Khanh Nguyen
Yonatan Bisk
Hal Daumé
49
16
0
14 Oct 2021
Correct Me if I am Wrong: Interactive Learning for Robotic Manipulation
Eugenio Chisari
Tim Welschehold
Joschka Boedecker
Wolfram Burgard
Abhinav Valada
19
37
0
07 Oct 2021
Learning Multimodal Rewards from Rankings
Vivek Myers
Erdem Biyik
Nima Anari
Dorsa Sadigh
OffRL
32
49
0
27 Sep 2021
Recursively Summarizing Books with Human Feedback
Jeff Wu
Long Ouyang
Daniel M. Ziegler
Nissan Stiennon
Ryan J. Lowe
Jan Leike
Paul Christiano
ALM
37
296
0
22 Sep 2021
Symbols as a Lingua Franca for Bridging Human-AI Chasm for Explainable and Advisable AI Systems
Subbarao Kambhampati
S. Sreedharan
Mudit Verma
Yantian Zha
L. Guan
52
47
0
21 Sep 2021
ThriftyDAgger: Budget-Aware Novelty and Risk Gating for Interactive Imitation Learning
Ryan Hoque
Ashwin Balakrishna
Ellen R. Novoseller
Albert Wilcox
Daniel S. Brown
Ken Goldberg
35
84
0
17 Sep 2021
Conservative Data Sharing for Multi-Task Offline Reinforcement Learning
Tianhe Yu
Aviral Kumar
Yevgen Chebotar
Karol Hausman
Sergey Levine
Chelsea Finn
OffRL
35
77
0
16 Sep 2021
Hindsight Reward Tweaking via Conditional Deep Reinforcement Learning
Ning Wei
Jiahua Liang
Di Xie
Shiliang Pu
25
0
0
06 Sep 2021
APReL: A Library for Active Preference-based Reward Learning Algorithms
Erdem Biyik
Aditi Talati
Dorsa Sadigh
20
36
0
16 Aug 2021
Skill Preferences: Learning to Extract and Execute Robotic Skills from Human Feedback
Xiaofei Wang
Kimin Lee
Kourosh Hakhamaneshi
Pieter Abbeel
Michael Laskin
34
42
0
11 Aug 2021
Continual Learning for Grounded Instruction Generation by Observing Human Following Behavior
Noriyuki Kojima
Alane Suhr
Yoav Artzi
27
24
0
10 Aug 2021
Accelerating the Learning of TAMER with Counterfactual Explanations
Jakob Karalus
F. Lindner
OffRL
29
4
0
03 Aug 2021
Differential-Critic GAN: Generating What You Want by a Cue of Preferences
Yinghua Yao
Yuangang Pan
Ivor W. Tsang
Xin Yao
DiffM
28
0
0
14 Jul 2021
Recent Advances in Leveraging Human Guidance for Sequential Decision-Making Tasks
Ruohan Zhang
F. Torabi
Garrett A. Warnell
Peter Stone
83
28
0
13 Jul 2021