Scaling laws for learning with real and surrogate data

6 February 2024

Papers citing "Scaling laws for learning with real and surrogate data"

Data Value in the Age of Scaling: Understanding LLM Scaling Dynamics Under Real-Synthetic Data Mixtures

...

154

17 Nov 2025

Optimal Regularization for Performative Learning

Edwige Cyffers

Alireza Mirrokni

Marco Mondelli

105

14 Oct 2025

Beyond Real Data: Synthetic Data through the Lens of Regularization

219

09 Oct 2025

High-dimensional Analysis of Synthetic Data Selection

168

09 Oct 2025

Filtering with Confidence: When Data Augmentation Meets Conformal Prediction

149

25 Sep 2025

When Models Don't Collapse: On the Consistency of Iterative MLE

Daniel Barzilai

Ohad Shamir

SyDa

186

25 May 2025

A Multi-Power Law for Loss Curve Prediction Across Learning Rate Schedules

International Conference on Learning Representations (ICLR), 2025

289

17 Mar 2025

MixMin: Finding Data Mixtures via Convex Minimization

314

14 Feb 2025

Rate of Model Collapse in Recursive Training

International Conference on Artificial Intelligence and Statistics (AISTATS), 2024

A. Suresh

A. Thangaraj

Aditya Nanda Kishore Khandavally

SyDa

207

23 Dec 2024

Loss-to-Loss Prediction: Scaling Laws for All Datasets

289

19 Nov 2024

Universality of the π^2/6 Pathway in Avoiding Model Collapse

Apratim Dey

D. Donoho

300

30 Oct 2024

High-dimensional Analysis of Knowledge Distillation: Weak-to-Strong Generalization and Scaling Laws

International Conference on Learning Representations (ICLR), 2024

M. E. Ildiz

Halil Alperen Gozeten

Ege Onur Taga

Marco Mondelli

Samet Oymak

497

24 Oct 2024

Collapse or Thrive? Perils and Promises of Synthetic Data in a Self-Generating World

Joshua Kazdan

Rylan Schaeffer

Apratim Dey

Matthias Gerstgrasser

Rafael Rafailov

D. Donoho

Sanmi Koyejo

612

22 Oct 2024

Analyzing Neural Scaling Laws in Two-Layer Networks with Power-Law Data Spectra

International Conference on Learning Representations (ICLR), 2024

Roman Worschech

B. Rosenow

355

11 Oct 2024

Strong Model Collapse

International Conference on Learning Representations (ICLR), 2024

Elvis Dohmatob

Yunzhen Feng

Arjun Subramonian

Julia Kempe

277

07 Oct 2024

Scaling Laws in Linear Regression: Compute, Parameters, and Data

471

12 Jun 2024

Data Mixing Laws: Optimizing Data Mixtures by Predicting Language Modeling Performance

Jiasheng Ye

Peiju Liu

Tianxiang Sun

Yunhua Zhou

Jun Zhan

Xipeng Qiu

379

107

25 Mar 2024