ResearchTrend.AI

arXiv:2401.07382 · Cited By
Beyond Sparse Rewards: Enhancing Reinforcement Learning with Language Model Critique in Text Generation

14 January 2024
Meng Cao, Lei Shu, Lei Yu, Yun Zhu, Nevan Wichers, Yinxiao Liu, Lei Meng
OffRL, ALM

Papers citing "Beyond Sparse Rewards: Enhancing Reinforcement Learning with Language Model Critique in Text Generation"

5 / 5 papers shown
Training language models to follow instructions with human feedback
Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe
OSLM, ALM · 301 · 11,730 · 0 · 04 Mar 2022
Hallucinated but Factual! Inspecting the Factuality of Hallucinations in Abstractive Summarization
Mengyao Cao, Yue Dong, Jackie C.K. Cheung
HILM · 170 · 144 · 0 · 30 Aug 2021
Hierarchical Reinforcement Learning By Discovering Intrinsic Options
Jesse Zhang, Haonan Yu, W. Xu
BDL · 118 · 81 · 0 · 16 Jan 2021
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler, Nisan Stiennon, Jeff Wu, Tom B. Brown, Alec Radford, Dario Amodei, Paul Christiano, G. Irving
ALM · 275 · 1,561 · 0 · 18 Sep 2019
Deep Reinforcement Learning for Dialogue Generation
Jiwei Li, Will Monroe, Alan Ritter, Michel Galley, Jianfeng Gao, Dan Jurafsky
192 · 1,325 · 0 · 05 Jun 2016