2212.03363
Cited By
Few-Shot Preference Learning for Human-in-the-Loop RL
Conference on Robot Learning (CoRL), 2022
6 December 2022
Joey Hejna
Dorsa Sadigh
OffRL
Papers citing
"Few-Shot Preference Learning for Human-in-the-Loop RL"
50 / 72 papers shown
Title
Safe and Optimal Learning from Preferences via Weighted Temporal Logic with Applications in Robotics and Formula 1
Ruya Karagulle
Cristian-Ioan Vasile
N. Ozay
60
0
0
11 Nov 2025
ARMADA: Autonomous Online Failure Detection and Human Shared Control Empower Scalable Real-world Deployment and Adaptation
Wenye Yu
Jun Lv
Zixi Ying
Yang Jin
Chuan Wen
Cewu Lu
96
0
0
02 Oct 2025
STAIR: Addressing Stage Misalignment through Temporal-Aligned Preference Reinforcement Learning
Yao Luan
Ni Mu
Yiqin Yang
Bo Xu
Qing-Shan Jia
73
0
0
28 Sep 2025
Preference-Guided Learning for Sparse-Reward Multi-Agent Reinforcement Learning
Viet The Bui
Tien Mai
Hong Thanh Nguyen
OffRL
112
1
0
26 Sep 2025
Mental Accounts for Actions: EWA-Inspired Attention in Decision Transformers
Zahra Aref
Narayan B. Mandayam
OffRL
64
0
0
19 Sep 2025
Interaction-Driven Browsing: A Human-in-the-Loop Conceptual Framework Informed by Human Web Browsing for Browser-Using Agents
Hyeonggeun Yun
Jinkyu Jang
105
0
0
15 Sep 2025
Learning Real-World Acrobatic Flight from Human Preferences
Colin Merk
Ismail Geles
Jiaxu Xing
Angel Romero
Giorgia Ramponi
Davide Scaramuzza
96
0
0
26 Aug 2025
In-situ Value-aligned Human-Robot Interactions with Physical Constraints
Hongtao Li
Ziyuan Jiao
Xiaofeng Liu
Hangxin Liu
Zilong Zheng
60
0
0
11 Aug 2025
Can you see how I learn? Human observers' inferences about Reinforcement Learning agents' learning processes
Adaptive Agents and Multi-Agent Systems (AAMAS), 2025
Bernhard Hilpert
Muhan Hou
Kim Baraka
Joost Broekens
104
0
0
16 Jun 2025
Enhancing Rating-Based Reinforcement Learning to Effectively Leverage Feedback from Large Vision-Language Models
Tung M. Luu
Younghwan Lee
Donghoon Lee
Sunho Kim
Min Jun Kim
Chang D. Yoo
ALM
VLM
140
6
0
15 Jun 2025
Similarity as Reward Alignment: Robust and Versatile Preference-based Reinforcement Learning
Sara Rajaram
R. J. Cotton
Fabian H. Sinz
133
0
0
14 Jun 2025
MisoDICE: Multi-Agent Imitation from Unlabeled Mixed-Quality Demonstrations
Viet The Bui
Tien Mai
Hong Thanh Nguyen
OffRL
159
2
0
24 May 2025
ReWiND: Language-Guided Rewards Teach Robot Policies without New Demonstrations
Jiahui Zhang
Yusen Luo
Abrar Anwar
Sumedh Anand Sontakke
Joseph J Lim
Jesse Thomason
Erdem Biyik
Jesse Zhang
OffRL
LM&Ro
356
15
0
16 May 2025
Skywork-VL Reward: An Effective Reward Model for Multimodal Understanding and Reasoning
Xiaokun Wang
Chris
Jiangbo Pei
Wei Shen
Yi Peng
...
Ai Jian
Tianyidan Xie
Xuchen Song
Yang Liu
Yahui Zhou
OffRL
LRM
377
11
0
12 May 2025
Adaptive 3D UI Placement in Mixed Reality Using Deep Reinforcement Learning
Feiyu Lu
Mengyu Chen
Hsiang Hsu
Pranav Deshpande
Cheng Yao Wang
Blair MacIntyre
214
6
0
30 Apr 2025
FLoRA: Sample-Efficient Preference-based RL via Low-Rank Style Adaptation of Reward Functions
IEEE International Conference on Robotics and Automation (ICRA), 2025
Daniel Marta
Simon Holk
Miguel Vasco
Jens Lundell
Timon Homberger
F. L. Busch
Olov Andersson
Iolanda Leite
333
2
0
14 Apr 2025
A Comprehensive Survey of Reward Models: Taxonomy, Applications, Challenges, and Future
Jialun Zhong
Wei Shen
Yanzeng Li
Songyang Gao
Hua Lu
Yicheng Chen
Yang Zhang
Wei Zhou
Jinjie Gu
Lei Zou
LRM
292
26
0
12 Apr 2025
Latent Embedding Adaptation for Human Preference Alignment in Diffusion Planners
IEEE International Conference on Robotics and Automation (ICRA), 2025
Wen Zheng Terence Ng
Jianda Chen
Yuan Xu
Tianwei Zhang
266
0
0
24 Mar 2025
OvercookedV2: Rethinking Overcooked for Zero-Shot Coordination
International Conference on Learning Representations (ICLR), 2025
Tobias Gessler
Tin Dizdarevic
Ani Calinescu
Benjamin Ellis
Andrei Lupu
Jakob Foerster
269
4
0
22 Mar 2025
Generating Robot Constitutions & Benchmarks for Semantic Safety
P. Sermanet
Anirudha Majumdar
A. Irpan
Dmitry Kalashnikov
Vikas Sindhwani
LM&Ro
323
9
0
11 Mar 2025
The Impact of VR and 2D Interfaces on Human Feedback in Preference-Based Robot Learning
Jorge de Heuvel
Daniel Marta
Simon Holk
Iolanda Leite
Maren Bennewitz
268
2
0
11 Mar 2025
Research on Superalignment Should Advance Now with Parallel Optimization of Competence and Conformity
HyunJin Kim
Xiaoyuan Yi
Jing Yao
Muhua Huang
Jinyeong Bak
James Evans
Xing Xie
259
0
0
08 Mar 2025
Adversarial Policy Optimization for Offline Preference-based Reinforcement Learning
International Conference on Learning Representations (ICLR), 2025
Hyungkyu Kang
Min-hwan Oh
OffRL
249
2
0
07 Mar 2025
Human Implicit Preference-Based Policy Fine-tuning for Multi-Agent Reinforcement Learning in USV Swarm
Haksub Kim
Kanghoon Lee
Minjun Kim
Jiachen Li
Jinkyoo Park
327
3
0
05 Mar 2025
Preference VLM: Leveraging VLMs for Scalable Preference-Based Reinforcement Learning
Udita Ghosh
Dripta S. Raychaudhuri
Jiachen Li
Konstantinos Karydis
Amit K. Roy-Chowdhury
VLM
231
1
0
03 Feb 2025
TIMRL: A Novel Meta-Reinforcement Learning Framework for Non-Stationary and Multi-Task Environments
Chenyang Qi
Huiping Li
Panfeng Huang
OffRL
169
0
0
13 Jan 2025
Effects of Robot Competency and Motion Legibility on Human Correction Feedback
IEEE/ACM International Conference on Human-Robot Interaction (HRI), 2025
Shuangge Wang
Anjiabei Wang
Sofiya Goncharova
Brian Scassellati
Tesca Fitzgerald
246
3
0
08 Jan 2025
Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model
Yueqin Yin
Shentao Yang
Yujia Xie
Ziyi Yang
Yuting Sun
Hany Awadalla
Weizhu Chen
Mingyuan Zhou
280
5
0
07 Jan 2025
Inferring Preferences from Demonstrations in Multi-objective Reinforcement Learning
Junlin Lu
Patrick Mannion
Karl Mason
187
1
0
30 Sep 2024
Generalizing Alignment Paradigm of Text-to-Image Generation with Preferences through f-divergence Minimization
AAAI Conference on Artificial Intelligence (AAAI), 2024
Haoyuan Sun
Bo Xia
Yongzhe Chang
Xueqian Wang
EGVM
192
17
0
15 Sep 2024
Multi-Type Preference Learning: Empowering Preference-Based Reinforcement Learning with Equal Preferences
IEEE International Conference on Robotics and Automation (ICRA), 2024
Z. Liu
Junjie Xu
Xingjiao Wu
J. Yang
Liang He
275
1
0
11 Sep 2024
Listwise Reward Estimation for Offline Preference-based Reinforcement Learning
International Conference on Machine Learning (ICML), 2024
Heewoong Choi
Sangwon Jung
Hongjoon Ahn
Taesup Moon
OffRL
228
10
0
08 Aug 2024
Offline Imitation Learning Through Graph Search and Retrieval
Zhao-Heng Yin
Pieter Abbeel
OffRL
188
10
0
22 Jul 2024
PECAN: Personalizing Robot Behaviors through a Learned Canonical Space
Heramb Nemlekar
Robert Ramirez Sanchez
Dylan P. Losey
307
4
0
22 Jul 2024
AI Safety in Generative AI Large Language Models: A Survey
Jaymari Chua
Yun Yvonna Li
Shiyi Yang
Chen Wang
Lina Yao
LM&MA
317
35
0
06 Jul 2024
Safe MPC Alignment with Human Directional Feedback
Zhixian Xie
Wenlong Zhang
Yi Ren
Zhaoran Wang
George J. Pappas
Wanxin Jin
223
2
0
05 Jul 2024
Sparse Diffusion Policy: A Sparse, Reusable, and Flexible Policy for Robot Learning
Yixiao Wang
Yifei Zhang
Mingxiao Huo
Ran Tian
Xiang Zhang
...
Chenfeng Xu
Pengliang Ji
Wei Zhan
Mingyu Ding
Masayoshi Tomizuka
MoE
270
42
0
01 Jul 2024
It Takes Two: On the Seamlessness between Reward and Policy Model in RLHF
Taiming Lu
Lingfeng Shen
Xinyu Yang
Weiting Tan
Beidi Chen
Huaxiu Yao
256
4
0
12 Jun 2024
Boosting Robustness in Preference-Based Reinforcement Learning with Dynamic Sparsity
Adaptive Agents and Multi-Agent Systems (AAMAS), 2024
Calarina Muslimani
Bram Grooten
Deepak Ranganatha Sastry Mamillapalli
Mykola Pechenizkiy
Decebal Constantin Mocanu
Matthew E. Taylor
337
0
0
10 Jun 2024
Efficient Preference-based Reinforcement Learning via Aligned Experience Estimation
Fengshuo Bai
Rui Zhao
Hongming Zhang
Sijia Cui
Ying Wen
Yaodong Yang
Bo Xu
Lei Han
OffRL
185
13
0
29 May 2024
Revision Matters: Generative Design Guided by Revision Edits
Tao Li
Chin-Yi Cheng
Amber Xie
Gang Li
Yang Li
171
3
0
27 May 2024
Leveraging Human Revisions for Improving Text-to-Layout Models
Amber Xie
Chin-Yi Cheng
Forrest Huang
Yang Li
198
1
0
16 May 2024
Leveraging Sub-Optimal Data for Human-in-the-Loop Reinforcement Learning
Calarina Muslimani
Matthew E. Taylor
OffRL
338
3
0
30 Apr 2024
Hindsight PRIORs for Reward Learning from Human Preferences
Mudit Verma
Katherine Metcalf
233
10
0
12 Apr 2024
Regularized Conditional Diffusion Model for Multi-Task Preference Alignment
Xudong Yu
Chenjia Bai
Haoran He
Changhong Wang
Xuelong Li
270
8
0
07 Apr 2024
Learning Human Preferences Over Robot Behavior as Soft Planning Constraints
Austin Narcomey
Deyuan Li
Ruta Desai
Nathan Tsoi
261
4
0
28 Mar 2024
LORD: Large Models based Opposite Reward Design for Autonomous Driving
Xin Ye
Feng Tao
Abhirup Mallik
Burhaneddin Yaman
Liu Ren
OffRL
254
6
0
27 Mar 2024
Sample Efficient Myopic Exploration Through Multitask Reinforcement Learning with Diverse Tasks
Ziping Xu
Zifan Xu
Runxuan Jiang
Peter Stone
Ambuj Tewari
272
2
0
03 Mar 2024
Learning with Language-Guided State Abstractions
Andi Peng
Ilia Sucholutsky
Belinda Z. Li
T. Sumers
Thomas Griffiths
Jacob Andreas
Julie A. Shah
LM&Ro
214
16
0
28 Feb 2024
RIME: Robust Preference-based Reinforcement Learning with Noisy Preferences
Jie Cheng
Gang Xiong
Xingyuan Dai
Qinghai Miao
Yisheng Lv
Fei-Yue Wang
236
30
0
27 Feb 2024