A multilevel approach to accelerate the training of Transformers

24 April 2025
Guillaume Lauga
Maël Chaumette
Edgar Desainte-Maréville
Étienne Lasalle
Arthur Lebeurrier
    AI4CE
Abstract

In this article, we investigate the potential of multilevel approaches to accelerate the training of transformer architectures. Using an ordinary differential equation (ODE) interpretation of these architectures, we propose an appropriate way of varying the discretization of these ODE Transformers to accelerate training. We validate our approach experimentally by comparing it with the standard training procedure.
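
The multilevel idea described above lends itself to a short sketch. Below is a minimal, hypothetical PyTorch illustration, assuming a residual Transformer read as a forward Euler discretization of an ODE dx/dt = f(x): each block applies x <- x + h*f(x), a coarse (shallow, large-h) model is trained first, then prolonged to a finer discretization (deeper, smaller h). The names ODEBlock, ODETransformer, and prolong, and the duplication-based prolongation operator, are illustrative assumptions, not the authors' exact scheme.

# Minimal sketch of a multilevel training schedule for an "ODE Transformer".
# All design choices here are assumptions for illustration, not the paper's method.
import copy
import torch
import torch.nn as nn

class ODEBlock(nn.Module):
    """One Euler step x <- x + h * f(x), with f a small attention + MLP field."""
    def __init__(self, dim, h):
        super().__init__()
        self.h = h  # step size of the Euler discretization
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                                 nn.Linear(4 * dim, dim))
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, x):
        q = self.norm1(x)
        a, _ = self.attn(q, q, q)
        x = x + self.h * a                         # Euler step on the attention field
        x = x + self.h * self.mlp(self.norm2(x))   # Euler step on the MLP field
        return x

class ODETransformer(nn.Module):
    """Stack of Euler steps discretizing t in [0, 1] with `depth` steps."""
    def __init__(self, dim, depth):
        super().__init__()
        h = 1.0 / depth
        self.blocks = nn.ModuleList(ODEBlock(dim, h) for _ in range(depth))

    def forward(self, x):
        for blk in self.blocks:
            x = blk(x)
        return x

def prolong(model):
    """Refine the discretization: double the number of Euler steps by
    duplicating each block and halving its step size (one hypothetical
    choice of prolongation operator)."""
    new_blocks = []
    for blk in model.blocks:
        for _ in range(2):
            nb = copy.deepcopy(blk)
            nb.h = blk.h / 2.0
            new_blocks.append(nb)
    out = copy.deepcopy(model)
    out.blocks = nn.ModuleList(new_blocks)
    return out

# Multilevel schedule: train cheaply on a coarse (shallow) model,
# then prolong to finer levels and continue training.
model = ODETransformer(dim=64, depth=2)
data = torch.randn(8, 16, 64)               # (batch, tokens, dim) dummy data
for level in range(3):                      # depths 2 -> 4 -> 8
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(10):                     # a few steps per level
        loss = model(data).pow(2).mean()    # placeholder objective
        opt.zero_grad()
        loss.backward()
        opt.step()
    if level < 2:
        model = prolong(model)
print("final depth:", len(model.blocks))

The intended speedup in such a scheme comes from spending most optimization steps at coarse depth, so only the final level pays the full per-step cost.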

View on arXiv: https://arxiv.org/abs/2504.18590
@article{lauga2025_2504.18590,
  title={A multilevel approach to accelerate the training of Transformers},
  author={Guillaume Lauga and Maël Chaumette and Edgar Desainte-Maréville and Étienne Lasalle and Arthur Lebeurrier},
  journal={arXiv preprint arXiv:2504.18590},
  year={2025}
}