2203.12788
Cited By
Evaluating Distributional Distortion in Neural Language Modeling
International Conference on Learning Representations (ICLR), 2022
24 March 2022
Benjamin LeBrun
Alessandro Sordoni
Timothy J. O'Donnell
Papers citing
"Evaluating Distributional Distortion in Neural Language Modeling"
16 / 16 papers shown
Why Less is More (Sometimes): A Theory of Data Curation
Elvis Dohmatob
Mohammad Pezeshki
Reyhane Askari Hemmat
157
1
0
05 Nov 2025
FLAMES: Improving LLM Math Reasoning via a Fine-Grained Analysis of the Data Synthesis Pipeline
Parker Seegmiller
Kartik Mehta
Soumya Saha
Chenyang Tao
Shereen Oraby
Arpit Gupta
Tagyoung Chung
Mohit Bansal
Nanyun Peng
SyDa
LRM
104
0
0
22 Aug 2025
LLM as a Broken Telephone: Iterative Generation Distorts Information
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Amr Mohamed
Mingmeng Geng
Michalis Vazirgiannis
Guokan Shang
422
3
0
27 Feb 2025
The Best Instruction-Tuning Data are Those That Fit
Dylan Zhang
Qirun Dai
Hao Peng
ALM
575
22
0
06 Feb 2025
Maximizing the Potential of Synthetic Data: Insights from Random Matrix Theory
International Conference on Learning Representations (ICLR), 2024
Aymane El Firdoussi
Abdalgader Abubaker
Soufiane Hayou
Réda Alami
Ahmed Alzubaidi
Hakim Hacid
349
6
0
11 Oct 2024
Beyond Model Collapse: Scaling Up with Synthesized Data Requires Reinforcement
Yunzhen Feng
Elvis Dohmatob
Pu Yang
Francois Charton
Julia Kempe
256
17
0
11 Jun 2024
ModelShield: Adaptive and Robust Watermark against Model Extraction Attack
IEEE Transactions on Information Forensics and Security (IEEE TIFS), 2024
Kaiyi Pang
Tao Qi
Chuhan Wu
Minhao Bai
Minghu Jiang
Yongfeng Huang
AAML
WaLM
569
9
0
03 May 2024
Predict the Next Word: Humans exhibit uncertainty in this task and language models _____
Evgenia Ilia
Wilker Aziz
280
3
0
27 Feb 2024
A Tale of Tails: Model Collapse as a Change of Scaling Laws
International Conference on Machine Learning (ICML), 2024
Elvis Dohmatob
Yunzhen Feng
Pu Yang
Francois Charton
Julia Kempe
321
107
0
10 Feb 2024
On Using Distribution-Based Compositionality Assessment to Evaluate Compositional Generalisation in Machine Translation
Anssi Moisio
Mathias Creutz
M. Kurimo
CoGe
234
1
0
14 Nov 2023
EMO: Earth Mover Distance Optimization for Auto-Regressive Language Modeling
International Conference on Learning Representations (ICLR), 2023
Siyu Ren
Zhiyong Wu
Kenny Q. Zhu
360
8
0
07 Oct 2023
Tailoring Language Generation Models under Total Variation Distance
International Conference on Learning Representations (ICLR), 2023
Haozhe Ji
Pei Ke
Zhipeng Hu
Rongsheng Zhang
Shiyu Huang
247
27
0
26 Feb 2023
Large Language Models Are Latent Variable Models: Explaining and Finding Good Demonstrations for In-Context Learning
Neural Information Processing Systems (NeurIPS), 2023
Xinyi Wang
Wanrong Zhu
Michael Stephen Saxon
Mark Steyvers
William Yang Wang
BDL
543
163
0
27 Jan 2023
Neural-Symbolic Inference for Robust Autoregressive Graph Parsing via Compositional Uncertainty Quantification
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Zi Lin
J. Liu
Jingbo Shang
UQLM
218
6
0
26 Jan 2023
Metadata Archaeology: Unearthing Data Subsets by Leveraging Training Dynamics
International Conference on Learning Representations (ICLR), 2022
Shoaib Ahmed Siddiqui
Nitarshan Rajkumar
Tegan Maharaj
David M. Krueger
Sara Hooker
267
33
0
20 Sep 2022
How much do language models copy from their training data? Evaluating linguistic novelty in text generation using RAVEN
R. Thomas McCoy
P. Smolensky
Tal Linzen
Jianfeng Gao
Asli Celikyilmaz
SyDa
235
161
0
18 Nov 2021