2401.07382
Cited By
Beyond Sparse Rewards: Enhancing Reinforcement Learning with Language Model Critique in Text Generation
14 January 2024
Meng Cao
Lei Shu
Lei Yu
Yun Zhu
Nevan Wichers
Yinxiao Liu
Lei Meng
OffRL
ALM
Papers citing "Beyond Sparse Rewards: Enhancing Reinforcement Learning with Language Model Critique in Text Generation"
5 / 5 papers shown
Title
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
Hallucinated but Factual! Inspecting the Factuality of Hallucinations in Abstractive Summarization
Meng Cao
Yue Dong
Jackie C.K. Cheung
HILM
170
144
0
30 Aug 2021
Hierarchical Reinforcement Learning By Discovering Intrinsic Options
Jesse Zhang
Haonan Yu
W. Xu
BDL
120
81
0
16 Jan 2021
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
275
1,561
0
18 Sep 2019
Deep Reinforcement Learning for Dialogue Generation
Jiwei Li
Will Monroe
Alan Ritter
Michel Galley
Jianfeng Gao
Dan Jurafsky
192
1,325
0
05 Jun 2016