ResearchTrend.AI
Reward Uncertainty for Exploration in Preference-based Reinforcement Learning

24 May 2022
Xinran Liang
Katherine Shu
Kimin Lee
Pieter Abbeel

Papers citing "Reward Uncertainty for Exploration in Preference-based Reinforcement Learning"

42 / 42 papers shown
Title
TREND: Tri-teaching for Robust Preference-based Reinforcement Learning with Demonstrations
Shuaiyi Huang
Mara Levy
Anubhav Gupta
Daniel Ekpo
Ruijie Zheng
Abhinav Shrivastava
19
0
0
09 May 2025
FLoRA: Sample-Efficient Preference-based RL via Low-Rank Style Adaptation of Reward Functions
Daniel Marta
Simon Holk
Miguel Vasco
Jens Lundell
Timon Homberger
F. L. Busch
Olov Andersson
Danica Kragic
Iolanda Leite
33
0
0
14 Apr 2025
Efficient Process Reward Model Training via Active Learning
Keyu Duan
Zichen Liu
Xin Mao
Tianyu Pang
Changyu Chen
Qiguang Chen
Michael Shieh
Longxu Dou
LRM
25
1
0
14 Apr 2025
Adversarial Policy Optimization for Offline Preference-based Reinforcement Learning
Hyungkyu Kang
Min-hwan Oh
OffRL
45
0
0
07 Mar 2025
Subtask-Aware Visual Reward Learning from Segmented Demonstrations
Changyeon Kim
Minho Heo
Doohyun Lee
Jinwoo Shin
Honglak Lee
Joseph J. Lim
Kimin Lee
32
0
0
28 Feb 2025
LEASE: Offline Preference-based Reinforcement Learning with High Sample Efficiency
Xiao-Yin Liu
Guotao Li
Xiao-Hu Zhou
Z. Hou
OffRL
34
0
0
31 Dec 2024
Comprehensive Overview of Reward Engineering and Shaping in Advancing Reinforcement Learning Applications
Sinan Ibrahim
Mostafa Mostafa
Ali Jnadi
Hadi Salloum
Pavel Osinenko
OffRL
32
12
0
31 Dec 2024
RAT: Adversarial Attacks on Deep Reinforcement Agents for Targeted Behaviors
Fengshuo Bai
Runze Liu
Yali Du
Ying Wen
Yaodong Yang
AAML
78
2
0
14 Dec 2024
Learning Hidden Subgoals under Temporal Ordering Constraints in Reinforcement Learning
Duo Xu
Faramarz Fekri
OffRL
26
0
0
03 Nov 2024
Dual Action Policy for Robust Sim-to-Real Reinforcement Learning
Ng Wen Zheng Terence
Chen Jianda
15
0
0
16 Oct 2024
Multi-Type Preference Learning: Empowering Preference-Based Reinforcement Learning with Equal Preferences
Z. Liu
Junjie Xu
Xingjiao Wu
J. Yang
Liang He
23
0
0
11 Sep 2024
Listwise Reward Estimation for Offline Preference-based Reinforcement Learning
Heewoong Choi
Sangwon Jung
Hongjoon Ahn
Taesup Moon
OffRL
34
2
0
08 Aug 2024
Preference-Guided Reinforcement Learning for Efficient Exploration
Guojian Wang
Faguo Wu
Xiao Zhang
Tianyuan Chen
Xuyang Chen
Lin Zhao
25
0
0
09 Jul 2024
Hindsight Preference Learning for Offline Preference-based Reinforcement Learning
Chen-Xiao Gao
Shengjun Fang
Chenjun Xiao
Yang Yu
Zongzhang Zhang
OffRL
25
0
0
05 Jul 2024
Safety through feedback in Constrained RL
Shashank Reddy Chirra
Pradeep Varakantham
P. Paruchuri
OffRL
35
1
0
28 Jun 2024
Boosting Robustness in Preference-Based Reinforcement Learning with Dynamic Sparsity
Calarina Muslimani
Bram Grooten
Deepak Ranganatha Sastry Mamillapalli
Mykola Pechenizkiy
D. Mocanu
M. E. Taylor
43
0
0
10 Jun 2024
Efficient Preference-based Reinforcement Learning via Aligned Experience Estimation
Fengshuo Bai
Rui Zhao
Hongming Zhang
Sijia Cui
Ying Wen
Yaodong Yang
Bo Xu
Lei Han
OffRL
16
6
0
29 May 2024
Improving Instruction Following in Language Models through Proxy-Based Uncertainty Estimation
JoonHo Lee
Jae Oh Woo
Juree Seok
Parisa Hassanzadeh
Wooseok Jang
...
Hankyu Moon
Wenjun Hu
Yeong-Dae Kwon
Taehee Lee
Seungjai Min
40
2
0
10 May 2024
Leveraging Sub-Optimal Data for Human-in-the-Loop Reinforcement Learning
Calarina Muslimani
M. E. Taylor
OffRL
38
2
0
30 Apr 2024
Impact of Preference Noise on the Alignment Performance of Generative Language Models
Yang Gao
Dana Alon
Donald Metzler
21
15
0
15 Apr 2024
Hindsight PRIORs for Reward Learning from Human Preferences
Mudit Verma
Katherine Metcalf
35
5
0
12 Apr 2024
Sample-Efficient Preference-based Reinforcement Learning with Dynamics Aware Rewards
Katherine Metcalf
Miguel Sarabia
Natalie Mackraz
B. Theobald
17
5
0
28 Feb 2024
RIME: Robust Preference-based Reinforcement Learning with Noisy Preferences
Jie Cheng
Gang Xiong
Xingyuan Dai
Q. Miao
Yisheng Lv
Fei-Yue Wang
26
14
0
27 Feb 2024
Uni-RLHF: Universal Platform and Benchmark Suite for Reinforcement Learning with Diverse Human Feedback
Yifu Yuan
Jianye Hao
Yi-An Ma
Zibin Dong
Hebin Liang
Jinyi Liu
Zhixin Feng
Kai-Wen Zhao
Yan Zheng
OffRL
ALM
11
14
0
04 Feb 2024
Resilient Constrained Reinforcement Learning
Dongsheng Ding
Zhengyan Huan
Alejandro Ribeiro
16
1
0
28 Dec 2023
Human-AI Collaboration in Real-World Complex Environment with Reinforcement Learning
Md Saiful Islam
Srijita Das
S. Gottipati
William Duguay
Clodéric Mars
Jalal Arabneydi
Antoine Fagette
Matthew J. Guzdial
Matthew E. Taylor
23
1
0
23 Dec 2023
Promptable Behaviors: Personalizing Multi-Objective Rewards from Human Preferences
Minyoung Hwang
Luca Weihs
Chanwoo Park
Kimin Lee
Aniruddha Kembhavi
Kiana Ehsani
22
18
0
14 Dec 2023
Reinforcement Learning from Statistical Feedback: the Journey from AB Testing to ANT Testing
Feiyang Han
Yimin Wei
Zhaofeng Liu
Yanxing Qi
25
1
0
24 Nov 2023
Rating-based Reinforcement Learning
Devin White
Mingkang Wu
Ellen R. Novoseller
Vernon J. Lawhern
Nicholas R. Waytowich
Yongcan Cao
ALM
11
6
0
30 Jul 2023
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
Stephen Casper
Xander Davies
Claudia Shi
T. Gilbert
Jérémy Scheurer
...
Erdem Biyik
Anca Dragan
David M. Krueger
Dorsa Sadigh
Dylan Hadfield-Menell
ALM
OffRL
34
468
0
27 Jul 2023
STRAPPER: Preference-based Reinforcement Learning via Self-training Augmentation and Peer Regularization
Yachen Kang
Li He
Jinxin Liu
Zifeng Zhuang
Donglin Wang
15
0
0
19 Jul 2023
Boosting Feedback Efficiency of Interactive Reinforcement Learning by Adaptive Learning from Scores
Shukai Liu
Chenming Wu
Ying Li
Liang Zhang
16
0
0
11 Jul 2023
Proportional Aggregation of Preferences for Sequential Decision Making
Nikhil Chandak
Shashwat Goel
Dominik Peters
14
9
0
26 Jun 2023
PEARL: Zero-shot Cross-task Preference Alignment and Robust Reward Learning for Robotic Manipulation
Runze Liu
Yali Du
Fengshuo Bai
Jiafei Lyu
Xiu Li
9
6
0
06 Jun 2023
Query-Policy Misalignment in Preference-Based Reinforcement Learning
Xiao Hu
Jianxiong Li
Xianyuan Zhan
Qing-Shan Jia
Ya-Qin Zhang
11
8
0
27 May 2023
Preference Transformer: Modeling Human Preferences using Transformers for RL
Changyeon Kim
Jongjin Park
Jinwoo Shin
Honglak Lee
Pieter Abbeel
Kimin Lee
OffRL
20
60
0
02 Mar 2023
Active Reward Learning from Multiple Teachers
Peter Barnett
Rachel Freedman
Justin Svegliato
Stuart J. Russell
17
14
0
02 Mar 2023
Reinforcement Learning from Diverse Human Preferences
Wanqi Xue
Bo An
Shuicheng Yan
Zhongwen Xu
14
21
0
27 Jan 2023
Few-Shot Preference Learning for Human-in-the-Loop RL
Joey Hejna
Dorsa Sadigh
OffRL
13
88
0
06 Dec 2022
Efficient Meta Reinforcement Learning for Preference-based Fast Adaptation
Zhizhou Ren
Anji Liu
Yitao Liang
Jian-wei Peng
Jianzhu Ma
27
9
0
20 Nov 2022
Skill-Based Reinforcement Learning with Intrinsic Reward Matching
Ademi Adeniji
Amber Xie
Pieter Abbeel
OffRL
17
5
0
14 Oct 2022
Transformers are Adaptable Task Planners
Vidhi Jain
Yixin Lin
Eric Undersander
Yonatan Bisk
Akshara Rai
13
24
0
06 Jul 2022