A Multi-Power Law for Loss Curve Prediction Across Learning Rate Schedules. International Conference on Learning Representations (ICLR), 2025
Rate of Model Collapse in Recursive Training. International Conference on Artificial Intelligence and Statistics (AISTATS), 2024
High-dimensional Analysis of Knowledge Distillation: Weak-to-Strong Generalization and Scaling Laws. International Conference on Learning Representations (ICLR), 2024
Analyzing Neural Scaling Laws in Two-Layer Networks with Power-Law Data Spectra. International Conference on Learning Representations (ICLR), 2024
Strong Model Collapse. International Conference on Learning Representations (ICLR), 2024