Few-Shot Preference Learning for Human-in-the-Loop RL

Conference on Robot Learning (CoRL), 2022

6 December 2022

Joey Hejna

Dorsa Sadigh

OffRL

Papers citing "Few-Shot Preference Learning for Human-in-the-Loop RL"

Safe and Optimal Learning from Preferences via Weighted Temporal Logic with Applications in Robotics and Formula 1

Ruya Karagulle

Cristian-Ioan Vasile

N. Ozay

11 Nov 2025

ARMADA: Autonomous Online Failure Detection and Human Shared Control Empower Scalable Real-world Deployment and Adaptation

135

02 Oct 2025

STAIR: Addressing Stage Misalignment through Temporal-Aligned Preference Reinforcement Learning

102

28 Sep 2025

Preference-Guided Learning for Sparse-Reward Multi-Agent Reinforcement Learning

176

26 Sep 2025

Mental Accounts for Actions: EWA-Inspired Attention in Decision Transformers

Zahra Aref

Narayan B. Mandayam

OffRL

112

19 Sep 2025

Interaction-Driven Browsing: A Human-in-the-Loop Conceptual Framework Informed by Human Web Browsing for Browser-Using Agents

Hyeonggeun Yun

Jinkyu Jang

152

15 Sep 2025

Learning Real-World Acrobatic Flight from Human Preferences

124

26 Aug 2025

In-situ Value-aligned Human-Robot Interactions with Physical Constraints

125

11 Aug 2025

Can you see how I learn? Human observers' inferences about Reinforcement Learning agents' learning processes

Adaptive Agents and Multi-Agent Systems (AAMAS), 2025

125

16 Jun 2025

Enhancing Rating-Based Reinforcement Learning to Effectively Leverage Feedback from Large Vision-Language Models

199

15 Jun 2025

Similarity as Reward Alignment: Robust and Versatile Preference-based Reinforcement Learning

Sara Rajaram

R. J. Cotton

Fabian H. Sinz

181

14 Jun 2025

MisoDICE: Multi-Agent Imitation from Unlabeled Mixed-Quality Demonstrations

200

24 May 2025

ReWiND: Language-Guided Rewards Teach Robot Policies without New Demonstrations

Jiahui Zhang

Yusen Luo

Abrar Anwar

Sumedh Anand Sontakke

422

16 May 2025

Skywork-VL Reward: An Effective Reward Model for Multimodal Understanding and Reasoning

...

473

12 May 2025

Adaptive 3D UI Placement in Mixed Reality Using Deep Reinforcement Learning

259

30 Apr 2025

FLoRA: Sample-Efficient Preference-based RL via Low-Rank Style Adaptation of Reward Functions

IEEE International Conference on Robotics and Automation (ICRA), 2025

393

14 Apr 2025

A Comprehensive Survey of Reward Models: Taxonomy, Applications, Challenges, and Future

353

12 Apr 2025

Latent Embedding Adaptation for Human Preference Alignment in Diffusion Planners

IEEE International Conference on Robotics and Automation (ICRA), 2025

374

24 Mar 2025

OvercookedV2: Rethinking Overcooked for Zero-Shot Coordination

International Conference on Learning Representations (ICLR), 2025

344

22 Mar 2025

Generating Robot Constitutions & Benchmarks for Semantic Safety

405

11 Mar 2025

The Impact of VR and 2D Interfaces on Human Feedback in Preference-Based Robot Learning

348

11 Mar 2025

Research on Superalignment Should Advance Now with Parallel Optimization of Competence and Conformity

306

08 Mar 2025

Adversarial Policy Optimization for Offline Preference-based Reinforcement Learning

International Conference on Learning Representations (ICLR), 2025

Hyungkyu Kang

Min-hwan Oh

OffRL

332

07 Mar 2025

Human Implicit Preference-Based Policy Fine-tuning for Multi-Agent Reinforcement Learning in USV Swarm

425

05 Mar 2025

Preference VLM: Leveraging VLMs for Scalable Preference-Based Reinforcement Learning

Udita Ghosh

Dripta S. Raychaudhuri

Jiachen Li

Konstantinos Karydis

Amit K. Roy-Chowdhury

VLM

256

03 Feb 2025

TIMRL: A Novel Meta-Reinforcement Learning Framework for Non-Stationary and Multi-Task Environments

188

13 Jan 2025

Effects of Robot Competency and Motion Legibility on Human Correction Feedback

IEEE/ACM International Conference on Human-Robot Interaction (HRI), 2025

268

08 Jan 2025

Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model

327

07 Jan 2025

Inferring Preferences from Demonstrations in Multi-objective Reinforcement Learning

Junlin Lu

Patrick Mannion

Karl Mason

224

30 Sep 2024

Generalizing Alignment Paradigm of Text-to-Image Generation with Preferences through f-divergence Minimization

AAAI Conference on Artificial Intelligence (AAAI), 2024

Haoyuan Sun

Bo Xia

Yongzhe Chang

Xueqian Wang

EGVM

249

15 Sep 2024

Multi-Type Preference Learning: Empowering Preference-Based Reinforcement Learning with Equal Preferences

IEEE International Conference on Robotics and Automation (ICRA), 2024

Liang He

309

11 Sep 2024

Listwise Reward Estimation for Offline Preference-based Reinforcement Learning

International Conference on Machine Learning (ICML), 2024

Sangwon Jung

280

08 Aug 2024

Offline Imitation Learning Through Graph Search and Retrieval

Zhao-Heng Yin

Pieter Abbeel

OffRL

220

22 Jul 2024

PECAN: Personalizing Robot Behaviors through a Learned Canonical Space

Heramb Nemlekar

Robert Ramirez Sanchez

Dylan P. Losey

344

22 Jul 2024

AI Safety in Generative AI Large Language Models: A Survey

Lina Yao

352

06 Jul 2024

Safe MPC Alignment with Human Directional Feedback

Zhixian Xie

Wenlong Zhang

Yi Ren

Zhaoran Wang

George J. Pappas

Wanxin Jin

285

05 Jul 2024

Sparse Diffusion Policy: A Sparse, Reusable, and Flexible Policy for Robot Learning

Xiang Zhang

...

Wei Zhan

287

01 Jul 2024

It Takes Two: On the Seamlessness between Reward and Policy Model in RLHF

Xinyu Yang

329

12 Jun 2024

Boosting Robustness in Preference-Based Reinforcement Learning with Dynamic Sparsity

Adaptive Agents and Multi-Agent Systems (AAMAS), 2024

Calarina Muslimani

Bram Grooten

Deepak Ranganatha Sastry Mamillapalli

Mykola Pechenizkiy

Decebal Constantin Mocanu

Matthew E. Taylor

359

10 Jun 2024

Efficient Preference-based Reinforcement Learning via Aligned Experience Estimation

Bo Xu

228

29 May 2024

Revision Matters: Generative Design Guided by Revision Edits

218

27 May 2024

Leveraging Human Revisions for Improving Text-to-Layout Models

235

16 May 2024

Leveraging Sub-Optimal Data for Human-in-the-Loop Reinforcement Learning

Calarina Muslimani

Matthew E. Taylor

OffRL

408

30 Apr 2024

Hindsight PRIORs for Reward Learning from Human Preferences

Mudit Verma

Katherine Metcalf

263

12 Apr 2024

Regularized Conditional Diffusion Model for Multi-Task Preference Alignment

Xuelong Li

341

07 Apr 2024

Learning Human Preferences Over Robot Behavior as Soft Planning Constraints

Austin Narcomey

Deyuan Li

Ruta Desai

Hao-Tien Lewis Chiang

305

28 Mar 2024

LORD: Large Models based Opposite Reward Design for Autonomous Driving

295

27 Mar 2024

Sample Efficient Myopic Exploration Through Multitask Reinforcement Learning with Diverse Tasks

344

03 Mar 2024

Learning with Language-Guided State Abstractions

273

28 Feb 2024

RIME: Robust Preference-based Reinforcement Learning with Noisy Preferences

Jie Cheng

Gang Xiong

Xingyuan Dai

Qinghai Miao

Yisheng Lv

Fei-Yue Wang

313

27 Feb 2024