2310.15288
Cited By
Active teacher selection for reinforcement learning from human feedback
23 October 2023
Rachel Freedman
Justin Svegliato
K. H. Wray
Stuart J. Russell
Papers citing
"Active teacher selection for reinforcement learning from human feedback"
6 / 6 papers shown
Title
Robust Reinforcement Learning from Human Feedback for Large Language Models Fine-Tuning
Kai Ye
Hongyi Zhou
Jin Zhu
Francesco Quinzan
C. Shi
23
0
0
03 Apr 2025
When Can Proxies Improve the Sample Complexity of Preference Learning?
Yuchen Zhu
Daniel Augusto de Souza
Zhengyan Shi
Mengyue Yang
Pasquale Minervini
Alexander D'Amour
Matt J. Kusner
69
0
0
21 Dec 2024
Social Choice Should Guide AI Alignment in Dealing with Diverse Human Feedback
Vincent Conitzer
Rachel Freedman
J. Heitzig
Wesley H. Holliday
Bob M. Jacobs
...
Eric Pacuit
Stuart Russell
Hailey Schoelkopf
Emanuel Tewolde
W. Zwicker
31
28
0
16 Apr 2024
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
Stephen Casper
Xander Davies
Claudia Shi
T. Gilbert
Jérémy Scheurer
...
Erdem Biyik
Anca Dragan
David M. Krueger
Dorsa Sadigh
Dylan Hadfield-Menell
ALM
OffRL
34
468
0
27 Jul 2023
Defining and Characterizing Reward Hacking
Joar Skalse
Nikolaus H. R. Howe
Dmitrii Krasheninnikov
David M. Krueger
57
53
0
27 Sep 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
303
11,881
0
04 Mar 2022