Deep Reinforcement Learning from Policy-Dependent Human Feedback

12 February 2019
Dilip Arumugam
Jun Ki Lee
S. Saskin
Michael L. Littman
arXiv: 1902.04257

Papers citing "Deep Reinforcement Learning from Policy-Dependent Human Feedback"

50 / 65 papers shown
Pref-GUIDE: Continual Policy Learning from Real-Time Human Feedback via Preference-Based Learning
Zhengran Ji
Boyuan Chen
208
1
0
10 Aug 2025
Cognitive Exoskeleton: Augmenting Human Cognition with an AI-Mediated Intelligent Visual Feedback
Songlin Xu
Xinyu Zhang
87
0
0
09 Jul 2025
CHARM: Considering Human Attributes for Reinforcement Modeling
Qidi Fang
Hang Yu
Shijie Fang
Jindan Huang
Qiuyu Chen
Reuben M. Aronson
E. Short
168
2
0
16 Jun 2025
Enhancing Rating-Based Reinforcement Learning to Effectively Leverage Feedback from Large Vision-Language Models
Tung M. Luu
Younghwan Lee
Donghoon Lee
Sunho Kim
Min Jun Kim
Chang D. Yoo
ALM, VLM
202
8
0
15 Jun 2025
Reinforcement Learning Optimization for Large-Scale Learning: An Efficient and User-Friendly Scaling Library
Weixun Wang
Shaopan Xiong
Gengru Chen
Wei Gao
Sheng Guo
...
Lin Qu
Yuchi Xu
Wei Wang
Jiamang Wang
Bo Zheng
OffRL
280
45
0
06 Jun 2025
The Latent Space Hypothesis: Toward Universal Medical Representation Learning
Salil Patel
424
16
0
04 Jun 2025
PEO: Improving Bi-Factorial Preference Alignment with Post-Training Policy Extrapolation
Yuxuan Liu
257
0
0
03 Mar 2025
A Comprehensive Survey of Foundation Models in Medicine
IEEE Reviews in Biomedical Engineering (RBME), 2024
Wasif Khan
Seowung Leem
Kyle B. See
Joshua K. Wong
Shaoting Zhang
R. Fang
AI4CE, LM&MA, VLM
779
75
0
17 Jan 2025
CREW: Facilitating Human-AI Teaming Research
Lingyu Zhang
Zhengran Ji
Boyuan Chen
482
7
0
03 Jan 2025
MAP: Multi-Human-Value Alignment Palette
International Conference on Learning Representations (ICLR), 2024
Xinran Wang
Qi Le
A. N. Ahmed
Enmao Diao
Yi Zhou
Nathalie Baracaldo
Jie Ding
Ali Anwar
259
12
0
24 Oct 2024
GUIDE: Real-Time Human-Shaped Agents
Neural Information Processing Systems (NeurIPS), 2024
Lingyu Zhang
Zhengran Ji
Nicholas R Waytowich
Boyuan Chen
217
7
0
19 Oct 2024
Text2Chart31: Instruction Tuning for Chart Generation with Automatic Feedback
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Fatemeh Pesaran Zadeh
Juyeon Kim
Jin-Hwa Kim
Gunhee Kim
ALM
353
13
0
05 Oct 2024
CANDERE-COACH: Reinforcement Learning from Noisy Feedback
Yuxuan Li
Srijita Das
Matthew E. Taylor
214
2
0
23 Sep 2024
Beyond Following: Mixing Active Initiative into Computational Creativity
Zhiyu Lin
Upol Ehsan
Rohan Agarwal
Samihan Dani
Vidushi Vashishth
Mark O. Riedl
236
0
0
06 Sep 2024
Preference-Guided Reinforcement Learning for Efficient Exploration
Guojian Wang
Faguo Wu
Xinyuan Li
Tianyuan Chen
Xiao Zhang
Xuyang Chen
278
0
0
09 Jul 2024
How Much Progress Did I Make? An Unexplored Human Feedback Signal for Teaching Robots
Hang Yu
Qidi Fang
Shijie Fang
Reuben M. Aronson
E. Short
269
4
0
08 Jul 2024
A Design Trajectory Map of Human-AI Collaborative Reinforcement Learning Systems: Survey and Taxonomy
Zhaoxing Li
208
2
0
16 May 2024
Sample-Efficient Preference-based Reinforcement Learning with Dynamics Aware Rewards
Katherine Metcalf
Miguel Sarabia
Natalie Mackraz
B. Theobald
207
10
0
28 Feb 2024
Uni-RLHF: Universal Platform and Benchmark Suite for Reinforcement Learning with Diverse Human Feedback
Yifu Yuan
Jianye Hao
Yi-An Ma
Zibin Dong
Hebin Liang
Jinyi Liu
Zhixin Feng
Kai-Wen Zhao
Yan Zheng
OffRL, ALM
333
19
0
04 Feb 2024
Human-AI Collaboration in Real-World Complex Environment with Reinforcement Learning
Md Saiful Islam
Srijita Das
S. Gottipati
William Duguay
Clodéric Mars
Jalal Arabneydi
Antoine Fagette
Matthew J. Guzdial
Matthew E. Taylor
206
3
0
23 Dec 2023
BEDD: The MineRL BASALT Evaluation and Demonstrations Dataset for Training and Benchmarking Agents that Solve Fuzzy Tasks
Neural Information Processing Systems (NeurIPS), 2023
Stephanie Milani
Anssi Kanervisto
Karolis Ramanauskas
Sander Schulhoff
Brandon Houghton
Rohin Shah
318
7
0
05 Dec 2023
Accelerating Reinforcement Learning of Robotic Manipulations via Feedback from Large Language Models
Kun-Mo Chu
Xufeng Zhao
C. Weber
Mengdi Li
Stefan Wermter
LLMAG, LM&Ro
269
18
0
04 Nov 2023
Motif: Intrinsic Motivation from Artificial Intelligence Feedback
International Conference on Learning Representations (ICLR), 2023
Martin Klissarov
P. D'Oro
Shagun Sodhani
Roberta Raileanu
Pierre-Luc Bacon
Pascal Vincent
Amy Zhang
Mikael Henaff
LRM, LLMAG
264
89
0
29 Sep 2023
Bootstrapping Adaptive Human-Machine Interfaces with Offline Reinforcement Learning
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023
Jensen Gao
S. Reddy
Glen Berseth
Anca Dragan
Sergey Levine
OffRL
237
1
0
07 Sep 2023
Primitive Skill-based Robot Learning from Human Evaluative Feedback
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023
Ayano Hiranaka
Minjune Hwang
Sharon Lee
Chen Wang
Li Fei-Fei
Jiajun Wu
Ruohan Zhang
OffRL
217
14
0
28 Jul 2023
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
Stephen Casper
Xander Davies
Claudia Shi
T. Gilbert
Jérémy Scheurer
...
Erdem Biyik
Anca Dragan
David M. Krueger
Dorsa Sadigh
Dylan Hadfield-Menell
ALM, OffRL
367
731
0
27 Jul 2023
STRAPPER: Preference-based Reinforcement Learning via Self-training Augmentation and Peer Regularization
Yachen Kang
Li He
Jinxin Liu
Zifeng Zhuang
Xuetao Zhang
356
1
0
19 Jul 2023
Beyond Reward: Offline Preference-guided Policy Optimization
International Conference on Machine Learning (ICML), 2023
Yachen Kang
Dingxu Shi
Jinxin Liu
Li He
Xuetao Zhang
OffRL
222
38
0
25 May 2023
Shattering the Agent-Environment Interface for Fine-Tuning Inclusive Language Models
Wanqiao Xu
Shi Dong
Dilip Arumugam
Benjamin Van Roy
175
8
0
19 May 2023
Vision-Language Models as Success Detectors
Yuqing Du
Ksenia Konyushkova
Misha Denil
A. Raju
Jessica Landon
Felix Hill
Nando de Freitas
Serkan Cabi
MLLM, LRM
283
116
0
13 Mar 2023
Active Reward Learning from Multiple Teachers
Peter Barnett
Rachel Freedman
Justin Svegliato
Stuart J. Russell
195
17
0
02 Mar 2023
Continual Learning for Instruction Following from Realtime Feedback
Neural Information Processing Systems (NeurIPS), 2022
Alane Suhr
Yoav Artzi
292
20
0
19 Dec 2022
Discovering Generalizable Spatial Goal Representations via Graph-based Active Reward Learning
International Conference on Machine Learning (ICML), 2022
Aviv Netanyahu
Tianmin Shu
J. Tenenbaum
Pulkit Agrawal
149
5
0
24 Nov 2022
Rewards Encoding Environment Dynamics Improves Preference-based Reinforcement Learning
Katherine Metcalf
Miguel Sarabia
B. Theobald
OffRL
163
5
0
12 Nov 2022
Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos
Neural Information Processing Systems (NeurIPS), 2022
Bowen Baker
Ilge Akkaya
Peter Zhokhov
Joost Huizinga
Jie Tang
Adrien Ecoffet
Brandon Houghton
Raul Sampedro
Jeff Clune
OffRL
499
368
0
23 Jun 2022
Incorporating Voice Instructions in Model-Based Reinforcement Learning for Self-Driving Cars
Mingze Wang
Ziyang Zhang
Grace Hui Yang
108
1
0
21 Jun 2022
Teachable Reinforcement Learning via Advice Distillation
Neural Information Processing Systems (NeurIPS), 2022
Olivia Watkins
Trevor Darrell
Pieter Abbeel
Jacob Andreas
Abhishek Gupta
OffRL
224
3
0
19 Mar 2022
X2T: Training an X-to-Text Typing Interface with Online Learning from User Feedback
International Conference on Learning Representations (ICLR), 2022
Jensen Gao
S. Reddy
Glen Berseth
Nicholas Hardy
N. Natraj
K. Ganguly
Anca Dragan
Sergey Levine
244
10
0
04 Mar 2022
Efficient Learning of Safe Driving Policy via Human-AI Copilot Optimization
International Conference on Learning Representations (ICLR), 2022
Quanyi Li
Zhenghao Peng
Bolei Zhou
274
76
0
17 Feb 2022
ASHA: Assistive Teleoperation via Human-in-the-Loop Reinforcement Learning
IEEE International Conference on Robotics and Automation (ICRA), 2022
S. Chen
Jensen Gao
S. Reddy
Glen Berseth
Anca Dragan
Sergey Levine
OffRL
229
19
0
05 Feb 2022
Towards Interactive Reinforcement Learning with Intrinsic Feedback
Ben Poole
Minwoo Lee
OffRL
282
2
0
02 Dec 2021
B-Pref: Benchmarking Preference-Based Reinforcement Learning
Kimin Lee
Laura M. Smith
Anca Dragan
Pieter Abbeel
OffRL
332
127
0
04 Nov 2021
Correct Me if I am Wrong: Interactive Learning for Robotic Manipulation
Eugenio Chisari
Tim Welschehold
Joschka Boedecker
Wolfram Burgard
Abhinav Valada
151
45
0
07 Oct 2021
Cognitive science as a source of forward and inverse models of human decisions for robotics and control
Mark K. Ho
Thomas Griffiths
279
48
0
01 Sep 2021
Skill Preferences: Learning to Extract and Execute Robotic Skills from Human Feedback
Conference on Robot Learning (CoRL), 2021
Xiaofei Wang
Kimin Lee
Kourosh Hakhamaneshi
Pieter Abbeel
Michael Laskin
225
48
0
11 Aug 2021
Continual Learning for Grounded Instruction Generation by Observing Human Following Behavior
Transactions of the Association for Computational Linguistics (TACL), 2021
Noriyuki Kojima
Alane Suhr
Yoav Artzi
184
28
0
10 Aug 2021
Recent Advances in Leveraging Human Guidance for Sequential Decision-Making Tasks
Ruohan Zhang
F. Torabi
Garrett A. Warnell
Peter Stone
330
31
0
13 Jul 2021
Imitation Learning: Progress, Taxonomies and Challenges
Boyuan Zheng
Sunny Verma
Jianlong Zhou
Ivor Tsang
Fang Chen
308
132
0
23 Jun 2021
PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training
International Conference on Machine Learning (ICML), 2021
Kimin Lee
Laura M. Smith
Pieter Abbeel
OffRL
413
355
0
09 Jun 2021
A Survey on Interactive Reinforcement Learning: Design Principles and Open Challenges
Christian Arzate Cruz
Takeo Igarashi
OffRL
228
103
0
27 May 2021