arXiv: 2211.06527
Rewards Encoding Environment Dynamics Improves Preference-based Reinforcement Learning
12 November 2022
Katherine Metcalf
Miguel Sarabia
B. Theobald
OffRL
Papers citing "Rewards Encoding Environment Dynamics Improves Preference-based Reinforcement Learning"
2 / 2 papers shown
Title
DAPPER: Discriminability-Aware Policy-to-Policy Preference-Based Reinforcement Learning for Query-Efficient Robot Skill Acquisition
Yuki Kadokawa
Jonas Frey
Takahiro Miki
Takamitsu Matsubara
Marco Hutter
26
0
0
09 May 2025
PARL: A Unified Framework for Policy Alignment in Reinforcement Learning from Human Feedback
Souradip Chakraborty
Amrit Singh Bedi
Alec Koppel
Dinesh Manocha
Huazheng Wang
Mengdi Wang
Furong Huang
23
25
0
03 Aug 2023