Offline Reinforcement Learning from Human Feedback in Real-World Sequence-to-Sequence Tasks

4 November 2020
Julia Kreutzer
Stefan Riezler
Carolin (Haas) Lawrence
    RALM
    OffRL

Papers citing "Offline Reinforcement Learning from Human Feedback in Real-World Sequence-to-Sequence Tasks"

8 / 8 papers shown
Participation in the age of foundation models
Harini Suresh
Emily Tseng
Meg Young
Mary L. Gray
Emma Pierson
Karen Levy
46
20
0
29 May 2024
Reinforcement Learning for Generative AI: A Survey
Yuanjiang Cao
Quan.Z Sheng
Julian McAuley
Lina Yao
SyDa
53
10
0
28 Aug 2023
Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization
Rajkumar Ramamurthy
Prithviraj Ammanabrolu
Kianté Brantley
Jack Hessel
R. Sifa
Christian Bauckhage
Hannaneh Hajishirzi
Yejin Choi
OffRL
31
240
0
03 Oct 2022
Learning Non-Autoregressive Models from Search for Unsupervised Sentence Summarization
Puyuan Liu
Chenyang Huang
Lili Mou
35
20
0
28 May 2022
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
301
1,620
0
18 Sep 2019
A Reinforcement Learning Approach to Interactive-Predictive Neural Machine Translation
Tsz Kin Lam
Julia Kreutzer
Stefan Riezler
24
31
0
03 May 2018
Improving a Neural Semantic Parser by Counterfactual Learning from Human Bandit Feedback
Carolin (Haas) Lawrence
Stefan Riezler
OffRL
173
57
0
03 May 2018
Deep Reinforcement Learning for Dialogue Generation
Jiwei Li
Will Monroe
Alan Ritter
Michel Galley
Jianfeng Gao
Dan Jurafsky
220
1,328
0
05 Jun 2016