Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2211.06519
Cited By
The Expertise Problem: Learning from Specialized Feedback
12 November 2022
Oliver Daniels-Koch
Rachel Freedman
OffRL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"The Expertise Problem: Learning from Specialized Feedback"
5 / 5 papers shown
Title
Pareto-Optimal Learning from Preferences with Hidden Context
Ryan Boldi
Li Ding
Lee Spector
S. Niekum
59
6
0
21 Jun 2024
Leveraging Sub-Optimal Data for Human-in-the-Loop Reinforcement Learning
Calarina Muslimani
M. E. Taylor
OffRL
38
2
0
30 Apr 2024
Active teacher selection for reinforcement learning from human feedback
Rachel Freedman
Justin Svegliato
K. H. Wray
Stuart J. Russell
26
6
0
23 Oct 2023
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
303
11,881
0
04 Mar 2022
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
275
1,583
0
18 Sep 2019
1