Deep Reinforcement Learning from Policy-Dependent Human Feedback

12 February 2019

Michael L. Littman

Papers citing "Deep Reinforcement Learning from Policy-Dependent Human Feedback"

Pref-GUIDE: Continual Policy Learning from Real-Time Human Feedback via Preference-Based Learning

Zhengran Ji

Boyuan Chen

208

10 Aug 2025

Cognitive Exoskeleton: Augmenting Human Cognition with an AI-Mediated Intelligent Visual Feedback

Songlin Xu

Xinyu Zhang

09 Jul 2025

CHARM: Considering Human Attributes for Reinforcement Modeling

168

16 Jun 2025

Enhancing Rating-Based Reinforcement Learning to Effectively Leverage Feedback from Large Vision-Language Models

202

15 Jun 2025

Reinforcement Learning Optimization for Large-Scale Learning: An Efficient and User-Friendly Scaling Library

...

280

06 Jun 2025

The Latent Space Hypothesis: Toward Universal Medical Representation Learning

Salil Patel

424

04 Jun 2025

PEO: Improving Bi-Factorial Preference Alignment with Post-Training Policy Extrapolation

Yuxuan Liu

257

03 Mar 2025

A Comprehensive Survey of Foundation Models in Medicine

IEEE Reviews in Biomedical Engineering (RBME), 2024

779

17 Jan 2025

CREW: Facilitating Human-AI Teaming Research

Lingyu Zhang

Zhengran Ji

Boyuan Chen

482

03 Jan 2025

MAP: Multi-Human-Value Alignment Palette

International Conference on Learning Representations (ICLR), 2024

Nathalie Baracaldo

259

24 Oct 2024

GUIDE: Real-Time Human-Shaped Agents

Neural Information Processing Systems (NeurIPS), 2024

217

19 Oct 2024

Text2Chart31: Instruction Tuning for Chart Generation with Automatic Feedback

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Fatemeh Pesaran Zadeh

353

05 Oct 2024

CANDERE-COACH: Reinforcement Learning from Noisy Feedback

Yuxuan Li

Srijita Das

Matthew E. Taylor

214

23 Sep 2024

Beyond Following: Mixing Active Initiative into Computational Creativity

236

06 Sep 2024

Preference-Guided Reinforcement Learning for Efficient Exploration

278

09 Jul 2024

How Much Progress Did I Make? An Unexplored Human Feedback Signal for Teaching Robots

269

08 Jul 2024

A Design Trajectory Map of Human-AI Collaborative Reinforcement Learning Systems: Survey and Taxonomy

Zhaoxing Li

208

16 May 2024

Sample-Efficient Preference-based Reinforcement Learning with Dynamics Aware Rewards

Katherine Metcalf

Miguel Sarabia

Natalie Mackraz

B. Theobald

207

28 Feb 2024

Uni-RLHF: Universal Platform and Benchmark Suite for Reinforcement Learning with Diverse Human Feedback

Jianye Hao

Zibin Dong

Yan Zheng

333

04 Feb 2024

Human-AI Collaboration in Real-World Complex Environment with Reinforcement Learning

206

23 Dec 2023

BEDD: The MineRL BASALT Evaluation and Demonstrations Dataset for Training and Benchmarking Agents that Solve Fuzzy Tasks

Neural Information Processing Systems (NeurIPS), 2023

318

05 Dec 2023

Accelerating Reinforcement Learning of Robotic Manipulations via Feedback from Large Language Models

269

04 Nov 2023

Motif: Intrinsic Motivation from Artificial Intelligence Feedback

International Conference on Learning Representations (ICLR), 2023

Pierre-Luc Bacon

Pascal Vincent

Amy Zhang

Mikael Henaff

LRM LLMAG

264

29 Sep 2023

Bootstrapping Adaptive Human-Machine Interfaces with Offline Reinforcement Learning

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023

237

07 Sep 2023

Primitive Skill-based Robot Learning from Human Evaluative Feedback

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023

Li Fei-Fei

Jiajun Wu

Ruohan Zhang

OffRL

217

28 Jul 2023

Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback

...

Dorsa Sadigh

Dylan Hadfield-Menell

ALM OffRL

367

731

27 Jul 2023

STRAPPER: Preference-based Reinforcement Learning via Self-training Augmentation and Peer Regularization

356

19 Jul 2023

Beyond Reward: Offline Preference-guided Policy Optimization

International Conference on Machine Learning (ICML), 2023

222

25 May 2023

Shattering the Agent-Environment Interface for Fine-Tuning Inclusive Language Models

175

19 May 2023

Vision-Language Models as Success Detectors

283

116

13 Mar 2023

Active Reward Learning from Multiple Teachers

195

02 Mar 2023

Continual Learning for Instruction Following from Realtime Feedback

Neural Information Processing Systems (NeurIPS), 2022

Alane Suhr

Yoav Artzi

292

19 Dec 2022

Discovering Generalizable Spatial Goal Representations via Graph-based Active Reward Learning

International Conference on Machine Learning (ICML), 2022

149

24 Nov 2022

Rewards Encoding Environment Dynamics Improves Preference-based Reinforcement Learning

Katherine Metcalf

Miguel Sarabia

B. Theobald

OffRL

163

12 Nov 2022

Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos

Neural Information Processing Systems (NeurIPS), 2022

Jeff Clune

499

368

23 Jun 2022

Incorporating Voice Instructions in Model-Based Reinforcement Learning for Self-Driving Cars

Mingze Wang

Ziyang Zhang

Grace Hui Yang

108

21 Jun 2022

Teachable Reinforcement Learning via Advice Distillation

Neural Information Processing Systems (NeurIPS), 2022

Pieter Abbeel

Abhishek Gupta

224

19 Mar 2022

X2T: Training an X-to-Text Typing Interface with Online Learning from User Feedback

International Conference on Learning Representations (ICLR), 2022

244

04 Mar 2022

Efficient Learning of Safe Driving Policy via Human-AI Copilot Optimization

International Conference on Learning Representations (ICLR), 2022

Quanyi Li

Zhenghao Peng

Bolei Zhou

274

17 Feb 2022

ASHA: Assistive Teleoperation via Human-in-the-Loop Reinforcement Learning

IEEE International Conference on Robotics and Automation (ICRA), 2022

229

05 Feb 2022

Towards Interactive Reinforcement Learning with Intrinsic Feedback

Ben Poole

Minwoo Lee

OffRL

282

02 Dec 2021

B-Pref: Benchmarking Preference-Based Reinforcement Learning

Pieter Abbeel

332

127

04 Nov 2021

Correct Me if I am Wrong: Interactive Learning for Robotic Manipulation

Eugenio Chisari

Tim Welschehold

Joschka Boedecker

Wolfram Burgard

Abhinav Valada

151

07 Oct 2021

Cognitive science as a source of forward and inverse models of human decisions for robotics and control

Mark K. Ho

Thomas Griffiths

279

01 Sep 2021

Skill Preferences: Learning to Extract and Execute Robotic Skills from Human Feedback

Conference on Robot Learning (CoRL), 2021

Pieter Abbeel

225

11 Aug 2021

Continual Learning for Grounded Instruction Generation by Observing Human Following Behavior

Transactions of the Association for Computational Linguistics (TACL), 2021

Noriyuki Kojima

Alane Suhr

Yoav Artzi

184

10 Aug 2021

Recent Advances in Leveraging Human Guidance for Sequential Decision-Making Tasks

Ruohan Zhang

F. Torabi

Garrett A. Warnell

Peter Stone

330

13 Jul 2021

Imitation Learning: Progress, Taxonomies and Challenges

Boyuan Zheng

Sunny Verma

Jianlong Zhou

Ivor Tsang

Fang Chen

308

132

23 Jun 2021

PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training

International Conference on Machine Learning (ICML), 2021

Kimin Lee

Laura M. Smith

Pieter Abbeel

OffRL

413

355

09 Jun 2021

A Survey on Interactive Reinforcement Learning: Design Principles and Open Challenges

Christian Arzate Cruz

Takeo Igarashi

OffRL

228

103

27 May 2021