ResearchTrend.AI

Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models

2 January 2024
Zixiang Chen
Yihe Deng
Huizhuo Yuan
Kaixuan Ji
Quanquan Gu
    SyDa

Papers citing "Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models"

6 / 56 papers shown
Title
Exploiting Asymmetry for Synthetic Training Data Generation: SynthIE and the Case of Information Extraction
Martin Josifoski
Marija Sakota
Maxime Peyrard
Robert West
SyDa
54
76
0
07 Mar 2023
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,261
0
28 Jan 2022
Multitask Prompted Training Enables Zero-Shot Task Generalization
Victor Sanh
Albert Webson
Colin Raffel
Stephen H. Bach
Lintang Sutawika
...
T. Bers
Stella Biderman
Leo Gao
Thomas Wolf
Alexander M. Rush
LRM
203
1,651
0
15 Oct 2021
Curriculum Learning: A Survey
Petru Soviany
Radu Tudor Ionescu
Paolo Rota
N. Sebe
ODL
63
251
0
25 Jan 2021
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
273
1,561
0
18 Sep 2019