Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2011.02511
Cited By
Offline Reinforcement Learning from Human Feedback in Real-World Sequence-to-Sequence Tasks
4 November 2020
Julia Kreutzer
Stefan Riezler
Carolin (Haas) Lawrence
RALM
OffRL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Offline Reinforcement Learning from Human Feedback in Real-World Sequence-to-Sequence Tasks"
8 / 8 papers shown
Title
Participation in the age of foundation models
Harini Suresh
Emily Tseng
Meg Young
Mary L. Gray
Emma Pierson
Karen Levy
46
20
0
29 May 2024
Reinforcement Learning for Generative AI: A Survey
Yuanjiang Cao
Quan.Z Sheng
Julian McAuley
Lina Yao
SyDa
53
10
0
28 Aug 2023
Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization
Rajkumar Ramamurthy
Prithviraj Ammanabrolu
Kianté Brantley
Jack Hessel
R. Sifa
Christian Bauckhage
Hannaneh Hajishirzi
Yejin Choi
OffRL
31
240
0
03 Oct 2022
Learning Non-Autoregressive Models from Search for Unsupervised Sentence Summarization
Puyuan Liu
Chenyang Huang
Lili Mou
35
20
0
28 May 2022
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
301
1,620
0
18 Sep 2019
A Reinforcement Learning Approach to Interactive-Predictive Neural Machine Translation
Tsz Kin Lam
Julia Kreutzer
Stefan Riezler
24
31
0
03 May 2018
Improving a Neural Semantic Parser by Counterfactual Learning from Human Bandit Feedback
Carolin (Haas) Lawrence
Stefan Riezler
OffRL
173
57
0
03 May 2018
Deep Reinforcement Learning for Dialogue Generation
Jiwei Li
Will Monroe
Alan Ritter
Michel Galley
Jianfeng Gao
Dan Jurafsky
220
1,328
0
05 Jun 2016
1