2111.03026
B-Pref: Benchmarking Preference-Based Reinforcement Learning
4 November 2021
Kimin Lee
Laura M. Smith
Anca Dragan
Pieter Abbeel
OffRL
Papers citing
"B-Pref: Benchmarking Preference-Based Reinforcement Learning"
50 / 69 papers shown
TREND: Tri-teaching for Robust Preference-based Reinforcement Learning with Demonstrations
Shuaiyi Huang
Mara Levy
Anubhav Gupta
Daniel Ekpo
Ruijie Zheng
Abhinav Shrivastava
23
0
0
09 May 2025
Online Preference-based Reinforcement Learning with Self-augmented Feedback from Large Language Model
Songjun Tu
Jingbo Sun
Qichao Zhang
Xiangyuan Lan
Dongbin Zhao
67
1
0
22 Dec 2024
Navigating Noisy Feedback: Enhancing Reinforcement Learning with Error-Prone Language Models
Muhan Lin
Shuyang Shi
Yue (Sophie) Guo
Behdad Chalaki
Vaishnav Tadiparthi
Ehsan Moradi-Pari
Simon Stepputtis
Joseph Campbell
Katia P. Sycara
33
1
0
22 Oct 2024
UNIQ: Offline Inverse Q-learning for Avoiding Undesirable Demonstrations
Huy Hoang
Tien Mai
Pradeep Varakantham
OffRL
22
0
0
10 Oct 2024
Reinforcement Learning From Imperfect Corrective Actions And Proxy Rewards
Zhaohui Jiang
Xuening Feng
Paul Weng
Yifei Zhu
Yan Song
Tianze Zhou
Yujing Hu
Tangjie Lv
Changjie Fan
31
0
0
08 Oct 2024
Human-Robot Cooperative Distribution Coupling for Hamiltonian-Constrained Social Navigation
Weizheng Wang
Chao Yu
Yu Wang
Byung-Cheol Min
57
2
0
20 Sep 2024
Multi-Type Preference Learning: Empowering Preference-Based Reinforcement Learning with Equal Preferences
Z. Liu
Junjie Xu
Xingjiao Wu
J. Yang
Liang He
26
0
0
11 Sep 2024
Forward KL Regularized Preference Optimization for Aligning Diffusion Policies
Zhao Shan
Chenyou Fan
Shuang Qiu
Jiyuan Shi
Chenjia Bai
33
4
0
09 Sep 2024
ELO-Rated Sequence Rewards: Advancing Reinforcement Learning Models
Qi Ju
Falin Hei
Zhemei Fang
Yunfeng Luo
14
0
0
05 Sep 2024
Advances in Preference-based Reinforcement Learning: A Review
Youssef Abdelkareem
Shady Shehata
Fakhri Karray
OffRL
35
9
0
21 Aug 2024
Listwise Reward Estimation for Offline Preference-based Reinforcement Learning
Heewoong Choi
Sangwon Jung
Hongjoon Ahn
Taesup Moon
OffRL
34
2
0
08 Aug 2024
Aligning Diffusion Behaviors with Q-functions for Efficient Continuous Control
Huayu Chen
Kaiwen Zheng
Hang Su
Jun Zhu
32
1
0
12 Jul 2024
Robust Reinforcement Learning from Corrupted Human Feedback
Alexander Bukharin
Ilgee Hong
Haoming Jiang
Zichong Li
Qingru Zhang
Zixuan Zhang
Tuo Zhao
31
4
0
21 Jun 2024
Pareto-Optimal Learning from Preferences with Hidden Context
Ryan Boldi
Li Ding
Lee Spector
S. Niekum
54
6
0
21 Jun 2024
RRLS : Robust Reinforcement Learning Suite
Adil Zouitine
David Bertoin
Pierre Clavier
Matthieu Geist
Emmanuel Rachelson
OffRL
22
1
0
12 Jun 2024
Boosting Robustness in Preference-Based Reinforcement Learning with Dynamic Sparsity
Calarina Muslimani
Bram Grooten
Deepak Ranganatha Sastry Mamillapalli
Mykola Pechenizkiy
D. Mocanu
M. E. Taylor
43
0
0
10 Jun 2024
Direct Preference Optimization With Unobserved Preference Heterogeneity
Keertana Chidambaram
Karthik Vinay Seetharaman
Vasilis Syrgkanis
36
7
0
23 May 2024
Leveraging Sub-Optimal Data for Human-in-the-Loop Reinforcement Learning
Calarina Muslimani
M. E. Taylor
OffRL
38
2
0
30 Apr 2024
Optimal Design for Human Feedback
Subhojyoti Mukherjee
Anusha Lalitha
Kousha Kalantari
Aniket Deshmukh
Ge Liu
Yifei Ma
B. Kveton
31
0
0
22 Apr 2024
Impact of Preference Noise on the Alignment Performance of Generative Language Models
Yang Gao
Dana Alon
Donald Metzler
21
15
0
15 Apr 2024
Hindsight PRIORs for Reward Learning from Human Preferences
Mudit Verma
Katherine Metcalf
35
5
0
12 Apr 2024
Learning Human Preferences Over Robot Behavior as Soft Planning Constraints
Austin Narcomey
Nathan Tsoi
Ruta Desai
Marynel Vázquez
36
3
0
28 Mar 2024
Online Policy Learning from Offline Preferences
Guoxi Zhang
Han Bao
Hisashi Kashima
OffRL
27
0
0
15 Mar 2024
Sample-Efficient Preference-based Reinforcement Learning with Dynamics Aware Rewards
Katherine Metcalf
Miguel Sarabia
Natalie Mackraz
B. Theobald
19
5
0
28 Feb 2024
RIME: Robust Preference-based Reinforcement Learning with Noisy Preferences
Jie Cheng
Gang Xiong
Xingyuan Dai
Q. Miao
Yisheng Lv
Fei-Yue Wang
26
14
0
27 Feb 2024
Batch Active Learning of Reward Functions from Human Preferences
Erdem Biyik
Nima Anari
Dorsa Sadigh
27
6
0
24 Feb 2024
MENTOR: Guiding Hierarchical Reinforcement Learning with Human Feedback and Dynamic Distance Constraint
Xinglin Zhou
Yifu Yuan
Shaofu Yang
Jianye Hao
21
1
0
22 Feb 2024
Corruption Robust Offline Reinforcement Learning with Human Feedback
Debmalya Mandal
Andi Nika
Parameswaran Kamalaruban
Adish Singla
Goran Radanović
OffRL
28
8
0
09 Feb 2024
RL-VLM-F: Reinforcement Learning from Vision Language Foundation Model Feedback
Yufei Wang
Zhanyi Sun
Jesse Zhang
Zhou Xian
Erdem Biyik
David Held
Zackory M. Erickson
VLM
53
48
0
06 Feb 2024
Uni-RLHF: Universal Platform and Benchmark Suite for Reinforcement Learning with Diverse Human Feedback
Yifu Yuan
Jianye Hao
Yi-An Ma
Zibin Dong
Hebin Liang
Jinyi Liu
Zhixin Feng
Kai-Wen Zhao
Yan Zheng
OffRL
ALM
11
14
0
04 Feb 2024
Crowd-PrefRL: Preference-Based Reward Learning from Crowds
David Chhan
Ellen R. Novoseller
Vernon J. Lawhern
27
5
0
17 Jan 2024
A Minimaximalist Approach to Reinforcement Learning from Human Feedback
Gokul Swamy
Christoph Dann
Rahul Kidambi
Zhiwei Steven Wu
Alekh Agarwal
OffRL
28
94
0
08 Jan 2024
Distributional Preference Learning: Understanding and Accounting for Hidden Context in RLHF
Anand Siththaranjan
Cassidy Laidlaw
Dylan Hadfield-Menell
13
54
0
13 Dec 2023
Agent-Aware Training for Agent-Agnostic Action Advising in Deep Reinforcement Learning
Yaoquan Wei
Shunyu Liu
Jie Song
Tongya Zheng
Kaixuan Chen
Yong Wang
Mingli Song
11
0
0
28 Nov 2023
Autonomous Robotic Reinforcement Learning with Asynchronous Human Feedback
Max Balsells
M. Torné
Zihan Wang
Samedh Desai
Pulkit Agrawal
Abhishek Gupta
37
10
0
31 Oct 2023
Learning Reward for Physical Skills using Large Language Model
Yuwei Zeng
Yiqing Xu
23
6
0
21 Oct 2023
Dynamic value alignment through preference aggregation of multiple objectives
Marcin Korecki
Damian Dailisan
Cesare Carissimo
20
0
0
09 Oct 2023
Learning Optimal Advantage from Preferences and Mistaking it for Reward
W. B. Knox
Stephane Hatgis-Kessell
Sigurdur O. Adalgeirsson
Serena Booth
Anca D. Dragan
Peter Stone
S. Niekum
14
12
0
03 Oct 2023
Motif: Intrinsic Motivation from Artificial Intelligence Feedback
Martin Klissarov
P. D'Oro
Shagun Sodhani
Roberta Raileanu
Pierre-Luc Bacon
Pascal Vincent
Amy Zhang
Mikael Henaff
LRM
LLMAG
16
54
0
29 Sep 2023
Rating-based Reinforcement Learning
Devin White
Mingkang Wu
Ellen R. Novoseller
Vernon J. Lawhern
Nicholas R. Waytowich
Yongcan Cao
ALM
11
6
0
30 Jul 2023
STRAPPER: Preference-based Reinforcement Learning via Self-training Augmentation and Peer Regularization
Yachen Kang
Li He
Jinxin Liu
Zifeng Zhuang
Donglin Wang
20
0
0
19 Jul 2023
Boosting Feedback Efficiency of Interactive Reinforcement Learning by Adaptive Learning from Scores
Shukai Liu
Chenming Wu
Ying Li
Liang Zhang
16
0
0
11 Jul 2023
PEARL: Zero-shot Cross-task Preference Alignment and Robust Reward Learning for Robotic Manipulation
Runze Liu
Yali Du
Fengshuo Bai
Jiafei Lyu
Xiu Li
11
6
0
06 Jun 2023
Query-Policy Misalignment in Preference-Based Reinforcement Learning
Xiao Hu
Jianxiong Li
Xianyuan Zhan
Qing-Shan Jia
Ya-Qin Zhang
11
8
0
27 May 2023
Learning Interpretable Models of Aircraft Handling Behaviour by Reinforcement Learning from Human Feedback
Tom Bewley
J. Lawry
Arthur G. Richards
25
1
0
26 May 2023
Inverse Preference Learning: Preference-based RL without a Reward Function
Joey Hejna
Dorsa Sadigh
OffRL
19
47
0
24 May 2023
Maximum Causal Entropy Inverse Constrained Reinforcement Learning
Mattijs Baert
Pietro Mazzaglia
Sam Leroux
Pieter Simoens
CML
27
10
0
04 May 2023
NaviSTAR: Socially Aware Robot Navigation with Hybrid Spatio-Temporal Graph Transformer and Preference Learning
Weizheng Wang
Ruiqi Wang
Le Mao
Byung-Cheol Min
32
13
0
12 Apr 2023
Preference Transformer: Modeling Human Preferences using Transformers for RL
Changyeon Kim
Jongjin Park
Jinwoo Shin
Honglak Lee
Pieter Abbeel
Kimin Lee
OffRL
23
60
0
02 Mar 2023
Complex QA and language models hybrid architectures, Survey
Xavier Daull
P. Bellot
Emmanuel Bruno
Vincent Martin
Elisabeth Murisasco
ELM
19
15
0
17 Feb 2023