arXiv: 2305.12473
Continually Improving Extractive QA via Human Feedback
21 May 2023
Ge Gao, Hung-Ting Chen, Yoav Artzi, Eunsol Choi
Papers citing "Continually Improving Extractive QA via Human Feedback" (12 of 12 papers shown)

Title | Authors | Tags | Counts | Date
DRS: Deep Question Reformulation With Structured Output | Zhecheng Li, Y. Wang, Bryan Hooi, Yujun Cai, Nanyun Peng, Kai-Wei Chang | KELM | 71 / 0 / 0 | 27 Nov 2024
Retrospective Learning from Interactions | Zizhao Chen, Mustafa Omer Gul, Yiwei Chen, Gloria Geng, Anne Wu, Yoav Artzi | LRM | 21 / 1 / 0 | 17 Oct 2024
CoGen: Learning from Feedback with Coupled Comprehension and Generation | Mustafa Omer Gul, Yoav Artzi | - | 23 / 3 / 0 | 28 Aug 2024
An Empirical Study on Self-correcting Large Language Models for Data Science Code Generation | Thai Tang Quoc, Duc Ha Minh, Tho Quan Thanh, Anh Nguyen-Duc | LRM | 13 / 1 / 0 | 28 Aug 2024
I Could've Asked That: Reformulating Unanswerable Questions | Wenting Zhao, Ge Gao, Claire Cardie, Alexander M. Rush | ELM | 17 / 1 / 0 | 24 Jul 2024
RLHF Deciphered: A Critical Analysis of Reinforcement Learning from Human Feedback for LLMs | Shreyas Chaudhari, Pranjal Aggarwal, Vishvak Murahari, Tanmay Rajpurohit, A. Kalyan, Karthik Narasimhan, A. Deshpande, Bruno Castro da Silva | - | 21 / 34 / 0 | 12 Apr 2024
Automatically Correcting Large Language Models: Surveying the landscape of diverse self-correction strategies | Liangming Pan, Michael Stephen Saxon, Wenda Xu, Deepak Nathani, Xinyi Wang, William Yang Wang | KELM, LRM | 31 / 201 / 0 | 06 Aug 2023
Investigating Table-to-Text Generation Capabilities of LLMs in Real-World Information Seeking Scenarios | Yilun Zhao, Haowei Zhang, Shengyun Si, Linyong Nan, Xiangru Tang, Arman Cohan | LMTD | 17 / 12 / 0 | 24 May 2023
Training language models to follow instructions with human feedback | Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe | OSLM, ALM | 303 / 11,881 / 0 | 04 Mar 2022
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems | Sergey Levine, Aviral Kumar, George Tucker, Justin Fu | OffRL, GP | 329 / 1,944 / 0 | 04 May 2020
Fine-Tuning Language Models from Human Preferences | Daniel M. Ziegler, Nisan Stiennon, Jeff Wu, Tom B. Brown, Alec Radford, Dario Amodei, Paul Christiano, G. Irving | ALM | 275 / 1,583 / 0 | 18 Sep 2019
Improving a Neural Semantic Parser by Counterfactual Learning from Human Bandit Feedback | Carolin (Haas) Lawrence, Stefan Riezler | OffRL | 171 / 56 / 0 | 03 May 2018