1908.01289
Cited By
Dueling Posterior Sampling for Preference-Based Reinforcement Learning
4 August 2019
Ellen R. Novoseller
Yibing Wei
Yanan Sui
Yisong Yue
J. W. Burdick
Papers citing
"Dueling Posterior Sampling for Preference-Based Reinforcement Learning"
13 / 13 papers shown
Contextual Online Uncertainty-Aware Preference Learning for Human Feedback
Nan Lu
Ethan X. Fang
Junwei Lu
60
0
0
27 Apr 2025
Reinforcement Learning from Multi-level and Episodic Human Feedback
Muhammad Qasim Elahi
Somtochukwu Oguchienti
Maheed H. Ahmed
Mahsa Ghasemi
OffRL
44
0
0
20 Apr 2025
Can RLHF be More Efficient with Imperfect Reward Models? A Policy Coverage Perspective
Jiawei Huang
Bingcong Li
Christoph Dann
Niao He
OffRL
74
0
0
26 Feb 2025
Preference-Guided Reinforcement Learning for Efficient Exploration
Guojian Wang
Faguo Wu
Xiao Zhang
Tianyuan Chen
Xuyang Chen
Lin Zhao
25
0
0
09 Jul 2024
Reinforcement Learning from Human Feedback without Reward Inference: Model-Free Algorithm and Instance-Dependent Analysis
Qining Zhang
Honghao Wei
Lei Ying
OffRL
38
1
0
11 Jun 2024
Comparisons Are All You Need for Optimizing Smooth Functions
Chenyi Zhang
Tongyang Li
AAML
21
1
0
19 May 2024
Iterative Data Smoothing: Mitigating Reward Overfitting and Overoptimization in RLHF
Banghua Zhu
Michael I. Jordan
Jiantao Jiao
21
22
0
29 Jan 2024
A Minimaximalist Approach to Reinforcement Learning from Human Feedback
Gokul Swamy
Christoph Dann
Rahul Kidambi
Zhiwei Steven Wu
Alekh Agarwal
OffRL
22
94
0
08 Jan 2024
A Learning-Based Framework for Safe Human-Robot Collaboration with Multiple Backup Control Barrier Functions
Neil C. Janwani
Ersin Daş
Thomas Touma
Skylar X. Wei
Tamas G. Molnar
J. W. Burdick
16
2
0
09 Oct 2023
Reinforcement Learning with Human Feedback: Learning Dynamic Choices via Pessimism
Zihao Li
Zhuoran Yang
Mengdi Wang
OffRL
16
51
0
29 May 2023
Learning from Imperfect Demonstrations via Adversarial Confidence Transfer
Zhangjie Cao
Zihan Wang
Dorsa Sadigh
AAML
19
7
0
07 Feb 2022
Early Detection of Combustion Instabilities using Deep Convolutional Selective Autoencoders on Hi-speed Flame Video
Chandrayee Basu
Qian Yang
M. Singhal
Anca Dragan
49
174
0
25 Mar 2016
Online Structured Prediction via Coactive Learning
Pannagadatta K. Shivaswamy
Thorsten Joachims
HAI
63
66
0
18 May 2012