Improved Scaling Laws in Linear Regression via Data Reuse

10 June 2025
Licong Lin
Jingfeng Wu
Peter Bartlett
arXiv:2506.08415
Abstract

Neural scaling laws suggest that the test error of large language models trained online decreases polynomially as the model size and data size increase. However, such scaling can be unsustainable when running out of new data. In this work, we show that data reuse can improve existing scaling laws in linear regression. Specifically, we derive sharp test error bounds on $M$-dimensional linear models trained by multi-pass stochastic gradient descent (multi-pass SGD) on $N$ data with sketched features. Assuming that the data covariance has a power-law spectrum of degree $a$, and that the true parameter follows a prior with an aligned power-law spectrum of degree $b-a$ (with $a > b > 1$), we show that multi-pass SGD achieves a test error of $\Theta(M^{1-b} + L^{(1-b)/a})$, where $L \lesssim N^{a/b}$ is the number of iterations. In the same setting, one-pass SGD only attains a test error of $\Theta(M^{1-b} + N^{(1-b)/a})$ (see e.g., Lin et al., 2024). This suggests an improved scaling law via data reuse (i.e., choosing $L > N$) in data-constrained regimes. Numerical simulations are also provided to verify our theoretical findings.
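
To make the comparison concrete, here is a minimal simulation sketch of the setting described in the abstract. All numerical choices below (the ambient dimension d, sketch dimension M, sample size N, exponents a and b, noise level, Gaussian sketch, and learning rate) are illustrative assumptions rather than the paper's experimental settings, and the spectra are instantiated as $\lambda_i = i^{-a}$ with aligned prior variances $i^{-(b-a)}$, which is one reading of the setup. The snippet runs cyclic multi-pass SGD on sketched features and reports the population test error for $L = N$ (one pass) and $L > N$ (data reuse).

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical small-scale parameters (not the paper's settings).
d = 2000         # ambient feature dimension
M = 100          # sketched model dimension
N = 500          # number of training samples
a, b = 2.0, 1.5  # spectral exponents, a > b > 1
sigma = 0.1      # label-noise level (assumption)

# Data covariance with power-law spectrum: lambda_i = i^{-a}.
idx = np.arange(1, d + 1)
lam = idx ** (-a)

# True parameter drawn from an aligned power-law prior of degree b - a,
# i.e. E[theta_i^2] = i^{-(b-a)} (one reading of the abstract's setup).
theta_star = rng.normal(size=d) * np.sqrt(idx ** (-(b - a)))

# Training data: x ~ N(0, diag(lam)), y = <theta*, x> + noise.
X = rng.normal(size=(N, d)) * np.sqrt(lam)
y = X @ theta_star + sigma * rng.normal(size=N)

# Random Gaussian sketch S in R^{M x d}; the model predicts <w, S x>.
S = rng.normal(size=(M, d)) / np.sqrt(d)
Z = X @ S.T  # sketched features, shape (N, M)

def multi_pass_sgd(Z, y, L, lr=1.0):
    """Cyclic multi-pass SGD on the squared loss over sketched features.

    L is the total number of iterations; L > N means the data are reused.
    """
    n, m = Z.shape
    w = np.zeros(m)
    for t in range(L):
        i = t % n                      # cycle through the N samples
        g = (Z[i] @ w - y[i]) * Z[i]   # gradient of 0.5 * (z·w - y)^2
        w -= lr * g
    return w

def test_error(w):
    """Population excess risk E[(<w, S x> - <theta*, x>)^2] in closed form."""
    diff = S.T @ w - theta_star        # effective parameter error in R^d
    return np.sum(lam * diff ** 2)

# One pass (L = N) versus data reuse (L > N).
for L in (N, 5 * N, 20 * N):
    w = multi_pass_sgd(Z, y, L)
    print(f"L = {L:6d}  test error = {test_error(w):.4f}")

Under the stated bounds, increasing $L$ beyond $N$ (up to roughly $N^{a/b}$) shrinks the iteration-dependent term from $N^{(1-b)/a}$ toward $N^{(1-b)/b}$, which is the gain from data reuse; a toy run like this can only loosely suggest that trend.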

@article{lin2025_2506.08415,
  title={Improved Scaling Laws in Linear Regression via Data Reuse},
  author={Licong Lin and Jingfeng Wu and Peter L. Bartlett},
  journal={arXiv preprint arXiv:2506.08415},
  year={2025}
}