2212.03363
Cited By
Few-Shot Preference Learning for Human-in-the-Loop RL
Conference on Robot Learning (CoRL), 2022
6 December 2022
Joey Hejna
Dorsa Sadigh
OffRL
Papers citing
"Few-Shot Preference Learning for Human-in-the-Loop RL"
50 / 72 papers shown
Safe and Optimal Learning from Preferences via Weighted Temporal Logic with Applications in Robotics and Formula 1
Ruya Karagulle
Cristian-Ioan Vasile
N. Ozay
96
0
0
11 Nov 2025
ARMADA: Autonomous Online Failure Detection and Human Shared Control Empower Scalable Real-world Deployment and Adaptation
Wenye Yu
Jun Lv
Zixi Ying
Yang Jin
Chuan Wen
Cewu Lu
135
0
0
02 Oct 2025
STAIR: Addressing Stage Misalignment through Temporal-Aligned Preference Reinforcement Learning
Yao Luan
Ni Mu
Yiqin Yang
Bo Xu
Qing-Shan Jia
102
0
0
28 Sep 2025
Preference-Guided Learning for Sparse-Reward Multi-Agent Reinforcement Learning
Viet The Bui
Tien Mai
Hong Thanh Nguyen
OffRL
176
0
0
26 Sep 2025
Mental Accounts for Actions: EWA-Inspired Attention in Decision Transformers
Zahra Aref
Narayan B. Mandayam
OffRL
112
0
0
19 Sep 2025
Interaction-Driven Browsing: A Human-in-the-Loop Conceptual Framework Informed by Human Web Browsing for Browser-Using Agents
Hyeonggeun Yun
Jinkyu Jang
152
1
0
15 Sep 2025
Learning Real-World Acrobatic Flight from Human Preferences
Colin Merk
Ismail Geles
Jiaxu Xing
Angel Romero
Giorgia Ramponi
Davide Scaramuzza
124
0
0
26 Aug 2025
In-situ Value-aligned Human-Robot Interactions with Physical Constraints
Hongtao Li
Ziyuan Jiao
Xiaofeng Liu
Hangxin Liu
Zilong Zheng
125
0
0
11 Aug 2025
Can you see how I learn? Human observers' inferences about Reinforcement Learning agents' learning processes
Adaptive Agents and Multi-Agent Systems (AAMAS), 2025
Bernhard Hilpert
Muhan Hou
Kim Baraka
Joost Broekens
125
0
0
16 Jun 2025
Enhancing Rating-Based Reinforcement Learning to Effectively Leverage Feedback from Large Vision-Language Models
Tung M. Luu
Younghwan Lee
Donghoon Lee
Sunho Kim
Min Jun Kim
Chang D. Yoo
ALM
VLM
199
6
0
15 Jun 2025
Similarity as Reward Alignment: Robust and Versatile Preference-based Reinforcement Learning
Sara Rajaram
R. J. Cotton
Fabian H. Sinz
181
1
0
14 Jun 2025
MisoDICE: Multi-Agent Imitation from Unlabeled Mixed-Quality Demonstrations
Viet The Bui
Tien Mai
Hong Thanh Nguyen
OffRL
200
2
0
24 May 2025
ReWiND: Language-Guided Rewards Teach Robot Policies without New Demonstrations
Jiahui Zhang
Yusen Luo
Abrar Anwar
Sumedh Anand Sontakke
Joseph J Lim
Jesse Thomason
Erdem Biyik
Jesse Zhang
OffRL
LM&Ro
422
18
0
16 May 2025
Skywork-VL Reward: An Effective Reward Model for Multimodal Understanding and Reasoning
Xiaokun Wang
Chris
Jiangbo Pei
Wei Shen
Yi Peng
...
Ai Jian
Tianyidan Xie
Xuchen Song
Yang Liu
Yahui Zhou
OffRL
LRM
473
12
0
12 May 2025
Adaptive 3D UI Placement in Mixed Reality Using Deep Reinforcement Learning
Feiyu Lu
Mengyu Chen
Hsiang Hsu
Pranav Deshpande
Cheng Yao Wang
Blair MacIntyre
259
6
0
30 Apr 2025
FLoRA: Sample-Efficient Preference-based RL via Low-Rank Style Adaptation of Reward Functions
IEEE International Conference on Robotics and Automation (ICRA), 2025
Daniel Marta
Simon Holk
Miguel Vasco
Jens Lundell
Timon Homberger
F. L. Busch
Olov Andersson
Jens Lundell
Iolanda Leite
393
2
0
14 Apr 2025
A Comprehensive Survey of Reward Models: Taxonomy, Applications, Challenges, and Future
Jialun Zhong
Wei Shen
Yanzeng Li
Songyang Gao
Hua Lu
Yicheng Chen
Yang Zhang
Wei Zhou
Jinjie Gu
Lei Zou
LRM
353
29
0
12 Apr 2025
Latent Embedding Adaptation for Human Preference Alignment in Diffusion Planners
IEEE International Conference on Robotics and Automation (ICRA), 2025
Wen Zheng Terence Ng
Jianda Chen
Yuan Xu
Tianwei Zhang
374
0
0
24 Mar 2025
OvercookedV2: Rethinking Overcooked for Zero-Shot Coordination
International Conference on Learning Representations (ICLR), 2025
Tobias Gessler
Tin Dizdarevic
Ani Calinescu
Benjamin Ellis
Andrei Lupu
Jakob Foerster
344
4
0
22 Mar 2025
Generating Robot Constitutions & Benchmarks for Semantic Safety
P. Sermanet
Anirudha Majumdar
A. Irpan
Dmitry Kalashnikov
Vikas Sindhwani
LM&Ro
405
11
0
11 Mar 2025
The Impact of VR and 2D Interfaces on Human Feedback in Preference-Based Robot Learning
Jorge de Heuvel
Daniel Marta
Simon Holk
Iolanda Leite
Maren Bennewitz
348
2
0
11 Mar 2025
Research on Superalignment Should Advance Now with Parallel Optimization of Competence and Conformity
HyunJin Kim
Xiaoyuan Yi
Jing Yao
Muhua Huang
Jinyeong Bak
James Evans
Xing Xie
306
0
0
08 Mar 2025
Adversarial Policy Optimization for Offline Preference-based Reinforcement Learning
International Conference on Learning Representations (ICLR), 2025
Hyungkyu Kang
Min-hwan Oh
OffRL
332
2
0
07 Mar 2025
Human Implicit Preference-Based Policy Fine-tuning for Multi-Agent Reinforcement Learning in USV Swarm
Haksub Kim
Kanghoon Lee
Minjun Kim
Jiachen Li
Jinkyoo Park
425
4
0
05 Mar 2025
Preference VLM: Leveraging VLMs for Scalable Preference-Based Reinforcement Learning
Udita Ghosh
Dripta S. Raychaudhuri
Jiachen Li
Konstantinos Karydis
Amit K. Roy-Chowdhury
VLM
256
1
0
03 Feb 2025
TIMRL: A Novel Meta-Reinforcement Learning Framework for Non-Stationary and Multi-Task Environments
Chenyang Qi
Huiping Li
Panfeng Huang
OffRL
188
0
0
13 Jan 2025
Effects of Robot Competency and Motion Legibility on Human Correction Feedback
IEEE/ACM International Conference on Human-Robot Interaction (HRI), 2025
Shuangge Wang
Anjiabei Wang
Sofiya Goncharova
Brian Scassellati
Tesca Fitzgerald
268
3
0
08 Jan 2025
Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model
Yueqin Yin
Shentao Yang
Yujia Xie
Ziyi Yang
Yuting Sun
Hany Awadalla
Weizhu Chen
Mingyuan Zhou
327
5
0
07 Jan 2025
Inferring Preferences from Demonstrations in Multi-objective Reinforcement Learning
Junlin Lu
Patrick Mannion
Karl Mason
224
2
0
30 Sep 2024
Generalizing Alignment Paradigm of Text-to-Image Generation with Preferences through f-divergence Minimization
AAAI Conference on Artificial Intelligence (AAAI), 2024
Haoyuan Sun
Bo Xia
Yongzhe Chang
Xueqian Wang
EGVM
249
19
0
15 Sep 2024
Multi-Type Preference Learning: Empowering Preference-Based Reinforcement Learning with Equal Preferences
IEEE International Conference on Robotics and Automation (ICRA), 2024
Z. Liu
Junjie Xu
Xingjiao Wu
J. Yang
Liang He
309
1
0
11 Sep 2024
Listwise Reward Estimation for Offline Preference-based Reinforcement Learning
International Conference on Machine Learning (ICML), 2024
Heewoong Choi
Sangwon Jung
Hongjoon Ahn
Taesup Moon
OffRL
280
10
0
08 Aug 2024
Offline Imitation Learning Through Graph Search and Retrieval
Zhao-Heng Yin
Pieter Abbeel
OffRL
220
10
0
22 Jul 2024
PECAN: Personalizing Robot Behaviors through a Learned Canonical Space
Heramb Nemlekar
Robert Ramirez Sanchez
Dylan P. Losey
344
4
0
22 Jul 2024
AI Safety in Generative AI Large Language Models: A Survey
Jaymari Chua
Yun Yvonna Li
Shiyi Yang
Chen Wang
Lina Yao
LM&MA
352
37
0
06 Jul 2024
Safe MPC Alignment with Human Directional Feedback
Zhixian Xie
Wenlong Zhang
Yi Ren
Zhaoran Wang
George J. Pappas
Wanxin Jin
285
2
0
05 Jul 2024
Sparse Diffusion Policy: A Sparse, Reusable, and Flexible Policy for Robot Learning
Yixiao Wang
Yifei Zhang
Mingxiao Huo
Ran Tian
Xiang Zhang
...
Chenfeng Xu
Pengliang Ji
Wei Zhan
Mingyu Ding
Masayoshi Tomizuka
MoE
287
48
0
01 Jul 2024
It Takes Two: On the Seamlessness between Reward and Policy Model in RLHF
Taiming Lu
Lingfeng Shen
Xinyu Yang
Weiting Tan
Beidi Chen
Huaxiu Yao
329
4
0
12 Jun 2024
Boosting Robustness in Preference-Based Reinforcement Learning with Dynamic Sparsity
Adaptive Agents and Multi-Agent Systems (AAMAS), 2024
Calarina Muslimani
Bram Grooten
Deepak Ranganatha Sastry Mamillapalli
Mykola Pechenizkiy
Decebal Constantin Mocanu
Matthew E. Taylor
359
0
0
10 Jun 2024
Efficient Preference-based Reinforcement Learning via Aligned Experience Estimation
Fengshuo Bai
Rui Zhao
Hongming Zhang
Sijia Cui
Ying Wen
Yaodong Yang
Bo Xu
Lei Han
OffRL
228
13
0
29 May 2024
Revision Matters: Generative Design Guided by Revision Edits
Tao Li
Chin-Yi Cheng
Amber Xie
Gang Li
Yang Li
218
3
0
27 May 2024
Leveraging Human Revisions for Improving Text-to-Layout Models
Amber Xie
Chin-Yi Cheng
Forrest Huang
Yang Li
235
1
0
16 May 2024
Leveraging Sub-Optimal Data for Human-in-the-Loop Reinforcement Learning
Calarina Muslimani
Matthew E. Taylor
OffRL
408
3
0
30 Apr 2024
Hindsight PRIORs for Reward Learning from Human Preferences
Mudit Verma
Katherine Metcalf
263
10
0
12 Apr 2024
Regularized Conditional Diffusion Model for Multi-Task Preference Alignment
Xudong Yu
Chenjia Bai
Haoran He
Changhong Wang
Xuelong Li
341
8
0
07 Apr 2024
Learning Human Preferences Over Robot Behavior as Soft Planning Constraints
Austin Narcomey
Deyuan Li
Ruta Desai
Hao-Tien Lewis Chiang
305
4
0
28 Mar 2024
LORD: Large Models based Opposite Reward Design for Autonomous Driving
Xin Ye
Feng Tao
Abhirup Mallik
Burhaneddin Yaman
Liu Ren
OffRL
295
7
0
27 Mar 2024
Sample Efficient Myopic Exploration Through Multitask Reinforcement Learning with Diverse Tasks
Ziping Xu
Zifan Xu
Runxuan Jiang
Peter Stone
Ambuj Tewari
344
2
0
03 Mar 2024
Learning with Language-Guided State Abstractions
Andi Peng
Ilia Sucholutsky
Belinda Z. Li
T. Sumers
Thomas Griffiths
Jacob Andreas
Julie A. Shah
LM&Ro
273
16
0
28 Feb 2024
RIME: Robust Preference-based Reinforcement Learning with Noisy Preferences
Jie Cheng
Gang Xiong
Xingyuan Dai
Qinghai Miao
Yisheng Lv
Fei-Yue Wang
313
32
0
27 Feb 2024