2311.09807
The Curious Decline of Linguistic Diversity: Training Language Models on Synthetic Text
16 November 2023
Yanzhu Guo
Guokan Shang
Michalis Vazirgiannis
Chloé Clavel
Papers citing
"The Curious Decline of Linguistic Diversity: Training Language Models on Synthetic Text"
10 / 10 papers shown
Title
Aligning Instruction Tuning with Pre-training
Yiming Liang
Tianyu Zheng
Xinrun Du
Ge Zhang
J. Liu
...
Zhaoxiang Zhang
Wenhao Huang
Jiajun Zhang
Xiang Yue
81
1
0
16 Jan 2025
Collapse or Thrive? Perils and Promises of Synthetic Data in a Self-Generating World
Joshua Kazdan
Rylan Schaeffer
Apratim Dey
Matthias Gerstgrasser
Rafael Rafailov
D. Donoho
Sanmi Koyejo
45
11
0
22 Oct 2024
Expanding Chatbot Knowledge in Customer Service: Context-Aware Similar Question Generation Using Large Language Models
Mengze Hong
Yuanfeng Song
Di Jiang
Lu Wang
Zichang Guo
Yuanqin He
Zhiyang Su
Qing Li
35
1
0
16 Oct 2024
A Survey on Self-Evolution of Large Language Models
Zhengwei Tao
Ting-En Lin
Xiancai Chen
Hangyu Li
Yuchuan Wu
Yongbin Li
Zhi Jin
Fei Huang
Dacheng Tao
Jingren Zhou
LRM
LM&Ro
43
21
0
22 Apr 2024
Standardizing the Measurement of Text Diversity: A Tool and a Comparative Analysis of Scores
Chantal Shaib
Joe Barrow
Jiuding Sun
Alexa F. Siu
Byron C. Wallace
A. Nenkova
66
31
0
01 Mar 2024
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Sharan Narang
Aakanksha Chowdhery
Denny Zhou
ReLM
BDL
LRM
AI4CE
297
3,163
0
21 Mar 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
Data Augmentation Approaches in Natural Language Processing: A Survey
Bohan Li
Yutai Hou
Wanxiang Che
113
269
0
05 Oct 2021
Stanza: A Python Natural Language Processing Toolkit for Many Human Languages
Peng Qi
Yuhao Zhang
Yuhui Zhang
Jason Bolton
Christopher D. Manning
AI4TS
184
1,638
0
16 Mar 2020
Teaching Machines to Read and Comprehend
Karl Moritz Hermann
Tomás Kociský
Edward Grefenstette
L. Espeholt
W. Kay
Mustafa Suleyman
Phil Blunsom
170
3,504
0
10 Jun 2015