2403.10704
Cited By
PERL: Parameter Efficient Reinforcement Learning from Human Feedback
15 March 2024
Hakim Sidahmed
Samrat Phatale
Alex Hutcheson
Zhuonan Lin
Zhan Chen
Zac Yu
Jarvis Jin
Roman Komarytsia
Christiane Ahlheim
Yonghao Zhu
Simral Chaudhary
Bowen Li
Saravanan Ganesh
Bill Byrne
Jessica Hoffmann
Hassan Mansoor
Wei Li
Abhinav Rastogi
Lucas Dixon
Papers citing "PERL: Parameter Efficient Reinforcement Learning from Human Feedback"
6 / 6 papers shown
Title
WARM: On the Benefits of Weight Averaged Reward Models
Alexandre Ramé
Nino Vieillard
Léonard Hussenot
Robert Dadashi
Geoffrey Cideron
Olivier Bachem
Johan Ferret
97
92
0
22 Jan 2024
Aligning Large Language Models through Synthetic Feedback
Sungdong Kim
Sanghwan Bae
Jamin Shin
Soyoung Kang
Donghyun Kwak
Kang Min Yoo
Minjoon Seo
ALM
SyDa
73
67
0
23 May 2023
Teaching language models to support answers with verified quotes
Jacob Menick
Maja Trebacz
Vladimir Mikulik
John Aslanides
Francis Song
...
Mia Glaese
Susannah Young
Lucy Campbell-Gillingham
G. Irving
Nat McAleese
ELM
RALM
226
255
0
21 Mar 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
Understanding Dataset Difficulty with $\mathcal{V}$-Usable Information
Kawin Ethayarajh
Yejin Choi
Swabha Swayamdipta
154
157
0
16 Oct 2021
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
278
3,784
0
18 Apr 2021