ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

B-Pref: Benchmarking Preference-Based Reinforcement Learning
arXiv:2111.03026, 4 November 2021
Kimin Lee, Laura M. Smith, Anca Dragan, Pieter Abbeel
OffRL

Papers citing "B-Pref: Benchmarking Preference-Based Reinforcement Learning"

19 of 69 papers shown
Exploiting Unlabeled Data for Feedback Efficient Human Preference based Reinforcement Learning
Mudit Verma, Siddhant Bhambri, Subbarao Kambhampati
17 Feb 2023
Direct Preference-based Policy Optimization without Reward Modeling
Gaon An, Junhyeok Lee, Xingdong Zuo, Norio Kosaka, KyungHyun Kim, Hyun Oh Song
OffRL
30 Jan 2023
Reinforcement Learning from Diverse Human Preferences
Wanqi Xue, Bo An, Shuicheng Yan, Zhongwen Xu
27 Jan 2023
On The Fragility of Learned Reward Functions
Lev McKinney, Yawen Duan, David M. Krueger, Adam Gleave
09 Jan 2023
Few-Shot Preference Learning for Human-in-the-Loop RL
Joey Hejna, Dorsa Sadigh
OffRL
06 Dec 2022
Efficient Meta Reinforcement Learning for Preference-based Fast Adaptation
Zhizhou Ren, Anji Liu, Yitao Liang, Jian-wei Peng, Jianzhu Ma
20 Nov 2022
Rewards Encoding Environment Dynamics Improves Preference-based Reinforcement Learning
Katherine Metcalf, Miguel Sarabia, B. Theobald
OffRL
12 Nov 2022
The Expertise Problem: Learning from Specialized Feedback
Oliver Daniels-Koch, Rachel Freedman
OffRL
12 Nov 2022
Reward Learning with Trees: Methods and Evaluation
Tom Bewley, J. Lawry, Arthur G. Richards, R. Craddock, Ian Henderson
03 Oct 2022
Transformers are Adaptable Task Planners
Vidhi Jain, Yixin Lin, Eric Undersander, Yonatan Bisk, Akshara Rai
06 Jul 2022
Humans are not Boltzmann Distributions: Challenges and Opportunities for Modelling Human Feedback and Interaction in Reinforcement Learning
David Lindner, Mennatallah El-Assady
OffRL
27 Jun 2022
Models of human preference for learning reward functions
W. B. Knox, Stephane Hatgis-Kessell, Serena Booth, S. Niekum, Peter Stone, A. Allievi
05 Jun 2022
Non-Markovian Reward Modelling from Trajectory Labels via Interpretable Multiple Instance Learning
Joseph Early, Tom Bewley, C. Evers, Sarvapali Ramchurn
OffRL
30 May 2022
Reward Uncertainty for Exploration in Preference-based Reinforcement Learning
Xinran Liang, Katherine Shu, Kimin Lee, Pieter Abbeel
24 May 2022
Causal Confusion and Reward Misidentification in Preference-Based Reward Learning
J. Tien, Jerry Zhi-Yang He, Zackory M. Erickson, Anca Dragan, Daniel S. Brown
CML
13 Apr 2022
SURF: Semi-supervised Reward Learning with Data Augmentation for Feedback-efficient Preference-based Reinforcement Learning
Jongjin Park, Younggyo Seo, Jinwoo Shin, Honglak Lee, Pieter Abbeel, Kimin Lee
18 Mar 2022
URLB: Unsupervised Reinforcement Learning Benchmark
Michael Laskin, Denis Yarats, Hao Liu, Kimin Lee, Albert Zhan, Kevin Lu, Catherine Cang, Lerrel Pinto, Pieter Abbeel
SSL, OffRL
28 Oct 2021
Decoupling Representation Learning from Reinforcement Learning
Adam Stooke, Kimin Lee, Pieter Abbeel, Michael Laskin
SSL, DRL
14 Sep 2020
Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning
Y. Gal, Zoubin Ghahramani
UQCV, BDL
06 Jun 2015