Optimizing ML Training with Metagradient Descent

17 March 2025
Logan Engstrom
Andrew Ilyas
Benjamin Chen
Axel Feldmann
William Moses
Aleksander Madry
Abstract

A major challenge in training large-scale machine learning models is configuring the training process to maximize model performance, i.e., finding the best training setup from a vast design space. In this work, we unlock a gradient-based approach to this problem. We first introduce an algorithm for efficiently calculating metagradients -- gradients through model training -- at scale. We then introduce a "smooth model training" framework that enables effective optimization using metagradients. With metagradient descent (MGD), we greatly improve on existing dataset selection methods, outperform accuracy-degrading data poisoning attacks by an order of magnitude, and automatically find competitive learning rate schedules.

@article{engstrom2025_2503.13751,
  title={Optimizing ML Training with Metagradient Descent},
  author={Logan Engstrom and Andrew Ilyas and Benjamin Chen and Axel Feldmann and William Moses and Aleksander Madry},
  journal={arXiv preprint arXiv:2503.13751},
  year={2025}
}