Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2305.08844
Cited By
RL4F: Generating Natural Language Feedback with Reinforcement Learning for Repairing Model Outputs
15 May 2023
Afra Feyza Akyürek
Ekin Akyürek
Aman Madaan
A. Kalyan
Peter Clark
Derry Wijaya
Niket Tandon
ALM
KELM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"RL4F: Generating Natural Language Feedback with Reinforcement Learning for Repairing Model Outputs"
19 / 19 papers shown
Title
Sailing AI by the Stars: A Survey of Learning from Rewards in Post-Training and Test-Time Scaling of Large Language Models
Xiaobao Wu
LRM
60
0
0
05 May 2025
A Survey on Progress in LLM Alignment from the Perspective of Reward Design
Miaomiao Ji
Yanqiu Wu
Zhibin Wu
Shoujin Wang
Jian Yang
Mark Dras
Usman Naseem
31
0
0
05 May 2025
CoLa -- Learning to Interactively Collaborate with Large LMs
Abhishek Sharma
Dan Goldwasser
LLMAG
SyDa
58
0
0
03 Apr 2025
LLMs Can Generate a Better Answer by Aggregating Their Own Responses
Zichong Li
Xinyu Feng
Yuheng Cai
Zixuan Zhang
Tianyi Liu
Chen Liang
Weizhu Chen
Haoyu Wang
T. Zhao
LRM
50
1
0
06 Mar 2025
LLM-Personalize: Aligning LLM Planners with Human Preferences via Reinforced Self-Training for Housekeeping Robots
Dongge Han
Trevor A. McInroe
Adam Jelley
Stefano V. Albrecht
Peter Bell
Amos Storkey
46
9
0
31 Dec 2024
RevisEval: Improving LLM-as-a-Judge via Response-Adapted References
Qiyuan Zhang
Yufei Wang
Tiezheng YU
Yuxin Jiang
Chuhan Wu
...
Xin Jiang
Lifeng Shang
Ruiming Tang
Fuyuan Lyu
Chen Ma
26
4
0
07 Oct 2024
TypedThinker: Diversify Large Language Model Reasoning with Typed Thinking
Danqing Wang
Jianxin Ma
Fei Fang
Lei Li
LLMAG
LRM
53
0
0
02 Oct 2024
Assistive Large Language Model Agents for Socially-Aware Negotiation Dialogues
Yuncheng Hua
Lizhen Qu
Gholamreza Haffari
85
4
0
29 Jan 2024
The Critique of Critique
Shichao Sun
Junlong Li
Weizhe Yuan
Ruifeng Yuan
Wenjie Li
Pengfei Liu
ELM
24
0
0
09 Jan 2024
Can LLMs Patch Security Issues?
Kamel Alrashedy
Abdullah Aljasser
Pradyumna Tambwekar
Matthew Gombolay
AAML
16
6
0
13 Nov 2023
Constructive Large Language Models Alignment with Diverse Feedback
Tianshu Yu
Ting-En Lin
Yuchuan Wu
Min Yang
Fei Huang
Yongbin Li
ALM
30
8
0
10 Oct 2023
Improving Summarization with Human Edits
Zonghai Yao
Benjamin J Schloss
Sai P. Selvaraj
22
2
0
09 Oct 2023
Bridging the Gap: A Survey on Integrating (Human) Feedback for Natural Language Generation
Patrick Fernandes
Aman Madaan
Emmy Liu
António Farinhas
Pedro Henrique Martins
...
José G. C. de Souza
Shuyan Zhou
Tongshuang Wu
Graham Neubig
André F. T. Martins
ALM
106
56
0
01 May 2023
When Life Gives You Lemons, Make Cherryade: Converting Feedback from Bad Responses into Good Labels
Weiyan Shi
Emily Dinan
Kurt Shuster
Jason Weston
Jing Xu
44
19
0
28 Oct 2022
Learning to Model Editing Processes
Machel Reid
Graham Neubig
KELM
BDL
101
34
0
24 May 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
Fast Model Editing at Scale
E. Mitchell
Charles Lin
Antoine Bosselut
Chelsea Finn
Christopher D. Manning
KELM
219
341
0
21 Oct 2021
Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP
Timo Schick
Sahana Udupa
Hinrich Schütze
254
374
0
28 Feb 2021
Text Editing by Command
Felix Faltings
Michel Galley
Gerold Hintz
Chris Brockett
Chris Quirk
Jianfeng Gao
Bill Dolan
KELM
132
36
0
24 Oct 2020
1