2111.04850
Cited By
Dueling RL: Reinforcement Learning with Trajectory Preferences
8 November 2021
Aldo Pacchiano
Aadirupa Saha
Jonathan Lee
Papers citing
"Dueling RL: Reinforcement Learning with Trajectory Preferences"
11 / 61 papers shown
Title
Provable Benefits of Policy Learning from Human Preferences in Contextual Bandit Problems
Xiang Ji
Huazheng Wang
Minshuo Chen
Tuo Zhao
Mengdi Wang
OffRL
32
6
0
24 Jul 2023
Hindsight-DICE: Stable Credit Assignment for Deep Reinforcement Learning
Akash Velu
Skanda Vaidyanath
Dilip Arumugam
OffRL
20
1
0
21 Jul 2023
Is RLHF More Difficult than Standard RL?
Yuanhao Wang
Qinghua Liu
Chi Jin
OffRL
17
57
0
25 Jun 2023
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Rafael Rafailov
Archit Sharma
E. Mitchell
Stefano Ermon
Christopher D. Manning
Chelsea Finn
ALM
58
3,319
0
29 May 2023
Provable Reward-Agnostic Preference-Based Reinforcement Learning
Wenhao Zhan
Masatoshi Uehara
Wen Sun
Jason D. Lee
19
7
0
29 May 2023
Reinforcement Learning with Human Feedback: Learning Dynamic Choices via Pessimism
Zihao Li
Zhuoran Yang
Mengdi Wang
OffRL
29
54
0
29 May 2023
Provable Offline Preference-Based Reinforcement Learning
Wenhao Zhan
Masatoshi Uehara
Nathan Kallus
Jason D. Lee
Wen Sun
OffRL
35
24
0
24 May 2023
Eliciting User Preferences for Personalized Multi-Objective Decision Making through Comparative Feedback
Han Shao
Lee Cohen
Avrim Blum
Yishay Mansour
Aadirupa Saha
Matthew R. Walter
OffRL
14
4
0
07 Feb 2023
Principled Reinforcement Learning with Human Feedback from Pairwise or K-wise Comparisons
Banghua Zhu
Jiantao Jiao
Michael I. Jordan
OffRL
28
180
0
26 Jan 2023
Provably Efficient Offline Reinforcement Learning with Trajectory-Wise Reward
Tengyu Xu
Yue Wang
Shaofeng Zou
Yingbin Liang
OffRL
25
12
0
13 Jun 2022
Human-in-the-loop: Provably Efficient Preference-based Reinforcement Learning with General Function Approximation
Xiaoyu Chen
Han Zhong
Zhuoran Yang
Zhaoran Wang
Liwei Wang
118
60
0
23 May 2022