Cited By (arXiv:2408.10075)
Personalizing Reinforcement Learning from Human Feedback with Variational Preference Learning
19 August 2024
S. Poddar, Yanming Wan, Hamish Ivison, Abhishek Gupta, Natasha Jaques
Papers citing "Personalizing Reinforcement Learning from Human Feedback with Variational Preference Learning" (10 papers shown):
- Cross-environment Cooperation Enables Zero-shot Multi-agent Coordination
  Kunal Jha, Wilka Carvalho, Yancheng Liang, S. Du, Max Kleiman-Weiner, Natasha Jaques (17 Apr 2025)
- A Survey on Personalized Alignment -- The Missing Piece for Large Language Models in Real-World Applications
  Jian-Yu Guan, J. Wu, J. Li, Chuanqi Cheng, Wei Yu Wu (21 Mar 2025) [LM&MA]
- ConsistencyTrack: A Robust Multi-Object Tracker with a Generation Strategy of Consistency Model
  Lifan Jiang, Zhihui Wang, Siqi Yin, Guangxiao Ma, Peng Zhang, Boxi Wu (28 Aug 2024) [DiffM]
- Pareto-Optimal Learning from Preferences with Hidden Context
  Ryan Boldi, Li Ding, Lee Spector, S. Niekum (21 Jun 2024)
- A Roadmap to Pluralistic Alignment
  Taylor Sorensen, Jared Moore, Jillian R. Fisher, Mitchell L. Gordon, Niloofar Mireshghallah, ..., Liwei Jiang, Ximing Lu, Nouha Dziri, Tim Althoff, Yejin Choi (07 Feb 2024)
- Personalized Language Modeling from Personalized Human Feedback
  Xinyu Li, Zachary C. Lipton, Liu Leqi (06 Feb 2024) [ALM]
- Crowd-PrefRL: Preference-Based Reward Learning from Crowds
  David Chhan, Ellen R. Novoseller, Vernon J. Lawhern (17 Jan 2024)
- vec2text with Round-Trip Translations
  Geoffrey Cideron, Sertan Girgin, Anton Raichuk, Olivier Pietquin, Olivier Bachem, Léonard Hussenot (14 Sep 2022)
- Offline Reinforcement Learning with Implicit Q-Learning
  Ilya Kostrikov, Ashvin Nair, Sergey Levine (12 Oct 2021) [OffRL]
- Fine-Tuning Language Models from Human Preferences
  Daniel M. Ziegler, Nisan Stiennon, Jeff Wu, Tom B. Brown, Alec Radford, Dario Amodei, Paul Christiano, G. Irving (18 Sep 2019) [ALM]