arXiv:1701.06049
Interactive Learning from Policy-Dependent Human Feedback
International Conference on Machine Learning (ICML), 2017
21 January 2017
J. MacGlashan
Mark K. Ho
R. Loftin
Bei Peng
Guan Wang
David L. Roberts
Matthew E. Taylor
Michael L. Littman
Papers citing
"Interactive Learning from Policy-Dependent Human Feedback"
50 / 189 papers shown
QuickLAP: Quick Language-Action Preference Learning for Autonomous Driving Agents
Jordan Abi Nader
David H. Lee
N. Dennler
Andreea Bobu
177
0
0
22 Nov 2025
Reinforcement Learning from Implicit Neural Feedback for Human-Aligned Robot Control
Suzie Kim
OffRL
96
0
0
18 Nov 2025
Deployable Vision-driven UAV River Navigation via Human-in-the-loop Preference Alignment
Zihan Wang
J. Li
Li-Fan Wu
N. Mahmoudian
187
0
0
02 Nov 2025
Optimistic Task Inference for Behavior Foundation Models
Thomas Rupf
Marco Bagatella
Marin Vlastelica
Andreas Krause
OffRL
152
2
0
23 Oct 2025
TubeDAgger: Reducing the Number of Expert Interventions with Stochastic Reach-Tubes
Julian Lemmel
Manuel Kranzl
Adam Lamine
P. Neubauer
Radu Grosu
Sophie A. Neubauer
112
0
0
01 Oct 2025
Fine-grained Video Dubbing Duration Alignment with Segment Supervised Preference Optimization
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Chaoqun Cui
Liangbin Huang
Shijing Wang
Zhe Tong
Zhaolong Huang
Xiao Zeng
Xiaofeng Liu
153
7
0
12 Aug 2025
Beyond Ordinal Preferences: Why Alignment Needs Cardinal Human Feedback
Parker Whitfill
Stewy Slocum
ALM
102
0
0
11 Aug 2025
Pref-GUIDE: Continual Policy Learning from Real-Time Human Feedback via Preference-Based Learning
Zhengran Ji
Boyuan Chen
271
2
0
10 Aug 2025
Can you see how I learn? Human observers' inferences about Reinforcement Learning agents' learning processes
Adaptive Agents and Multi-Agent Systems (AAMAS), 2025
Bernhard Hilpert
Muhan Hou
Kim Baraka
Joost Broekens
206
0
0
16 Jun 2025
Enhancing Rating-Based Reinforcement Learning to Effectively Leverage Feedback from Large Vision-Language Models
Tung M. Luu
Younghwan Lee
Donghoon Lee
Sunho Kim
Min Jun Kim
Chang D. Yoo
ALM
VLM
236
10
0
15 Jun 2025
Think Twice, Act Once: A Co-Evolution Framework of LLM and RL for Large-Scale Decision Making
Xu Wan
Wenyue Xu
Chao Yang
Mingyang Sun
333
6
0
03 Jun 2025
Interactive Imitation Learning for Dexterous Robotic Manipulation: Challenges and Perspectives -- A Survey
Edgar Welte
Rania Rayyes
542
8
0
30 May 2025
Reinforcement Learning from Multi-level and Episodic Human Feedback
Conference on Learning for Dynamics & Control (L4DC), 2025
Muhammad Qasim Elahi
Somtochukwu Oguchienti
Maheed H. Ahmed
Mahsa Ghasemi
OffRL
591
0
0
20 Apr 2025
Safe Explicable Policy Search
Akkamahadevi Hanni
Jonathan Montaño
Yu Zhang
398
0
0
10 Mar 2025
Adversarial Policy Optimization for Offline Preference-based Reinforcement Learning
International Conference on Learning Representations (ICLR), 2025
Hyungkyu Kang
Min-hwan Oh
OffRL
442
3
0
07 Mar 2025
High-Precision Transformer-Based Visual Servoing for Humanoid Robots in Aligning Tiny Objects
Jialong Xue
Wei Gao
Y. Wang
Chao Ji
Dongdong Zhao
Shi Yan
Shiwu Zhang
416
6
0
06 Mar 2025
Uncertainty Comes for Free: Human-in-the-Loop Policies with Diffusion Models
Zhanpeng He
Yifeng Cao
M. Ciocarlie
763
2
0
26 Feb 2025
Extracting and Understanding the Superficial Knowledge in Alignment
North American Chapter of the Association for Computational Linguistics (NAACL), 2025
Runjin Chen
Gabriel Jacob Perin
Xuxi Chen
Xilun Chen
Y. Han
Nina S. T. Hirata
Junyuan Hong
B. Kailkhura
396
6
0
07 Feb 2025
CTR-Driven Advertising Image Generation with Multimodal Large Language Models
The Web Conference (WWW), 2025
Xingye Chen
Wei Feng
Zhenbang Du
Weizhen Wang
Yuxiao Chen
...
Jingping Shao
Yuanjie Shao
Xinge You
Changxin Gao
Nong Sang
OffRL
384
13
0
05 Feb 2025
Learning from Active Human Involvement through Proxy Value Propagation
Neural Information Processing Systems (NeurIPS), 2023
Zhenghao Peng
Wenjie Mo
Chenda Duan
Quanyi Li
Bolei Zhou
490
30
0
05 Feb 2025
A Comprehensive Survey of Foundation Models in Medicine
IEEE Reviews in Biomedical Engineering (RBME), 2024
Wasif Khan
Seowung Leem
Kyle B. See
Joshua K. Wong
Shaoting Zhang
R. Fang
AI4CE
LM&MA
VLM
855
98
0
17 Jan 2025
Guiding Reinforcement Learning Using Uncertainty-Aware Large Language Models
Maryam Shoaeinaeini
Brent Harrison
153
1
0
15 Nov 2024
Adaptive Alignment: Dynamic Preference Adjustments via Multi-Objective Reinforcement Learning for Pluralistic AI
Hadassah Harland
Richard Dazeley
Peter Vamplew
Hashini Senaratne
Bahareh Nakisa
Francisco Cruz
411
4
0
31 Oct 2024
Prosody as a Teaching Signal for Agent Learning: Exploratory Studies and Algorithmic Implications
International Conference on Multimodal Interaction (ICMI), 2024
Matilda Knierim
Sahil Jain
Murat Han Aydoğan
Kenneth Mitra
K. Desai
Akanksha Saran
Kim Baraka
179
1
0
31 Oct 2024
Towards Assurance of LLM Adversarial Robustness using Ontology-Driven Argumentation
Tomas Bueno Momcilovic
Beat Buesser
Giulio Zizzo
Mark Purcell
AAML
160
3
0
10 Oct 2024
Incremental Learning for Robot Shared Autonomy
Yiran Tao
Guixiu Qiao
Dan Ding
Zackory Erickson
CLL
489
2
0
08 Oct 2024
Dynamic Policy Fusion for User Alignment Without Re-Interaction
Ajsal Shereef Palattuparambil
Thommen George Karimpanal
Santu Rana
468
1
0
30 Sep 2024
Align²LLaVA: Cascaded Human and Large Language Model Preference Alignment for Multi-modal Instruction Curation
Hongzhe Huang
Zhewen Yu
Jiang Liu
Li Cai
Dian Jiao
...
Siliang Tang
Juncheng Li
Hao Jiang
Haoyuan Li
Yueting Zhuang
MLLM
ALM
133
0
0
27 Sep 2024
CANDERE-COACH: Reinforcement Learning from Noisy Feedback
Yuxuan Li
Srijita Das
Matthew E. Taylor
231
2
0
23 Sep 2024
Rater Cohesion and Quality from a Vicarious Perspective
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Deepak Pandita
Tharindu Cyril Weerasooriya
Sujan Dutta
Sarah K. K. Luger
Tharindu Ranasinghe
Ashiqur R. KhudaBukhsh
Marcos Zampieri
Christopher M. Homan
242
5
0
15 Aug 2024
An Introduction to Reinforcement Learning: Fundamental Concepts and Practical Applications
Majid Ghasemi
Amir Hossein Moosavi
Ibrahim Sorkhoh
Anjali Agrawal
Fadi Alzhouri
Dariush Ebrahimi
OffRL
286
1
0
13 Aug 2024
Is user feedback always informative? Retrieval Latent Defending for Semi-Supervised Domain Adaptation without Source Data
Junha Song
Tae Soo Kim
Junha Kim
Gunhee Nam
Thijs Kooi
Jaegul Choo
389
4
0
22 Jul 2024
Recent Advances in Generative AI and Large Language Models: Current Status, Challenges, and Perspectives
D. Hagos
Rick Battle
Danda B. Rawat
LM&MA
OffRL
570
102
0
20 Jul 2024
A Comparative Analysis of Interactive Reinforcement Learning Algorithms in Warehouse Robot Grid Based Environment
Arunabh Bora
OffRL
144
0
0
16 Jul 2024
Three Dogmas of Reinforcement Learning
David Abel
Mark K. Ho
Anna Harutyunyan
425
12
0
15 Jul 2024
MEReQ: Max-Ent Residual-Q Inverse RL for Sample-Efficient Alignment from Intervention
Yuxin Chen
Chen Tang
Jianglan Wei
Chenran Li
Ran Tian
Xiang Zhang
Wei Zhan
Peter Stone
Masayoshi Tomizuka
472
1
0
24 Jun 2024
Reinforcement Learning from Human Feedback without Reward Inference: Model-Free Algorithm and Instance-Dependent Analysis
Qining Zhang
Honghao Wei
Lei Ying
OffRL
461
3
0
11 Jun 2024
Boosting Robustness in Preference-Based Reinforcement Learning with Dynamic Sparsity
Adaptive Agents and Multi-Agent Systems (AAMAS), 2024
Calarina Muslimani
Bram Grooten
Deepak Ranganatha Sastry Mamillapalli
Mykola Pechenizkiy
Decebal Constantin Mocanu
Matthew E. Taylor
451
1
0
10 Jun 2024
Strengthened Symbol Binding Makes Large Language Models Reliable Multiple-Choice Selectors
Mengge Xue
Zhenyu Hu
Liqun Liu
Kuo Liao
Shuang Li
Honglin Han
Meng Zhao
Chengguo Yin
355
17
0
03 Jun 2024
Transfer Q Star: Principled Decoding for LLM Alignment
Souradip Chakraborty
Soumya Suvra Ghosal
Ming Yin
Dinesh Manocha
Mengdi Wang
Amrit Singh Bedi
Furong Huang
365
49
0
30 May 2024
A Unified Linear Programming Framework for Offline Reward Learning from Human Demonstrations and Feedback
Kihyun Kim
Jiawei Zhang
Asuman Ozdaglar
P. Parrilo
OffRL
358
2
0
20 May 2024
A Design Trajectory Map of Human-AI Collaborative Reinforcement Learning Systems: Survey and Taxonomy
Zhaoxing Li
230
3
0
16 May 2024
Enhancing Maritime Trajectory Forecasting via H3 Index and Causal Language Modelling (CLM)
Nicolas Drapier
Aladine Chetouani
A. Chateigner
167
6
0
15 May 2024
Leveraging Sub-Optimal Data for Human-in-the-Loop Reinforcement Learning
Calarina Muslimani
Matthew E. Taylor
OffRL
538
5
0
30 Apr 2024
An MRP Formulation for Supervised Learning: Generalized Temporal Difference Learning Models
Yangchen Pan
Junfeng Wen
Chenjun Xiao
Juil Sock
OffRL
MU
365
0
0
23 Apr 2024
Dataset Reset Policy Optimization for RLHF
Jonathan D. Chang
Wenhao Zhan
Owen Oertell
Kianté Brantley
Dipendra Kumar Misra
Jason D. Lee
Wen Sun
OffRL
540
35
0
12 Apr 2024
Explainability in JupyterLab and Beyond: Interactive XAI Systems for Integrated and Collaborative Workflows
G. Guo
Dustin L. Arendt
Alex Endert
300
3
0
02 Apr 2024
Learning to Watermark LLM-generated Text via Reinforcement Learning
Xiaojun Xu
Yuanshun Yao
Yang Liu
384
24
0
13 Mar 2024
CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenarios
European Conference on Computer Vision (ECCV), 2024
Qilang Ye
Zitong Yu
Rui Shao
Xinyu Xie
Juil Sock
Simeng Qin
MLLM
489
54
0
07 Mar 2024
Reward Model Learning vs. Direct Policy Optimization: A Comparative Analysis of Learning from Human Preferences
Andi Nika
Debmalya Mandal
Parameswaran Kamalaruban
Georgios Tzannetos
Goran Radanović
Adish Singla
222
20
0
04 Mar 2024