2310.13385
Tuna: Instruction Tuning using Feedback from Large Language Models
20 October 2023
Haoran Li
Yiran Liu
Xingxing Zhang
Wei Lu
Furu Wei
ALM
Papers citing
"Tuna: Instruction Tuning using Feedback from Large Language Models"
4 / 4 papers shown
Title
A Survey on Human Preference Learning for Large Language Models
Ruili Jiang
Kehai Chen
Xuefeng Bai
Zhixuan He
Juntao Li
Muyun Yang
Tiejun Zhao
Liqiang Nie
Min Zhang
39
8
0
17 Jun 2024
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Sharan Narang
Aakanksha Chowdhery
Denny Zhou
ReLM
BDL
LRM
AI4CE
297
3,163
0
21 Mar 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
Classical Structured Prediction Losses for Sequence to Sequence Learning
Sergey Edunov
Myle Ott
Michael Auli
David Grangier
Marc'Aurelio Ranzato
AIMat
40
185
0
14 Nov 2017