FlexiDiT: Your Diffusion Transformer Can Easily Generate High-Quality Samples with Less Compute

27 February 2025
Sotiris Anagnostidis
Gregor Bachmann
Yeongmin Kim
Jonas Kohler
Markos Georgopoulos
A. Sanakoyeu
Yuming Du
Albert Pumarola
Ali K. Thabet
Edgar Schönfeld
Abstract

Despite their remarkable performance, modern Diffusion Transformers are hindered by substantial resource requirements during inference, stemming from the fixed and large amount of compute needed for each denoising step. In this work, we revisit the conventional static paradigm that allocates a fixed compute budget per denoising iteration and propose a dynamic strategy instead. Our simple and sample-efficient framework enables pre-trained DiT models to be converted into flexible ones, dubbed FlexiDiT, allowing them to process inputs at varying compute budgets. We demonstrate how a single flexible model can generate images without any drop in quality, while reducing the required FLOPs by more than 40% compared to their static counterparts, for both class-conditioned and text-conditioned image generation. Our method is general and agnostic to input and conditioning modalities. We show how our approach can be readily extended for video generation, where FlexiDiT models generate samples with up to 75% less compute without compromising performance.
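
To make the idea of a per-step compute budget concrete, the following is a minimal accounting sketch, not the authors' implementation. It assumes a hypothetical two-level schedule in which some denoising steps run in a cheaper mode that costs a fixed fraction of a full step. The function name `relative_flops`, its parameters, and the numbers in the usage line are illustrative assumptions and are not taken from the paper.

```python
def relative_flops(num_steps: int, num_cheap_steps: int, cheap_cost_ratio: float) -> float:
    """Fraction of the static baseline's compute used by a two-level schedule.

    Assumption (hypothetical, for illustration only): every denoising step runs
    either at full cost (1.0) or in a cheaper mode costing `cheap_cost_ratio`
    times a full step. The static baseline runs all steps at full cost.
    """
    if not 0 <= num_cheap_steps <= num_steps:
        raise ValueError("num_cheap_steps must lie between 0 and num_steps")
    if not 0.0 < cheap_cost_ratio <= 1.0:
        raise ValueError("cheap_cost_ratio must lie in (0, 1]")
    full_steps = num_steps - num_cheap_steps
    # Total cost in units of one full step, divided by the baseline cost.
    return (full_steps + num_cheap_steps * cheap_cost_ratio) / num_steps


# Illustrative numbers only: 50 steps, 30 of them at a quarter of the full cost.
saving = 1.0 - relative_flops(num_steps=50, num_cheap_steps=30, cheap_cost_ratio=0.25)
print(f"compute saved relative to the static baseline: {saving:.0%}")
```

Under these assumed numbers the schedule uses 55% of the baseline compute. Which steps can run cheaply without hurting sample quality is the empirical question the paper addresses; this sketch only shows how the savings add up.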

@article{anagnostidis2025_2502.20126,
  title={FlexiDiT: Your Diffusion Transformer Can Easily Generate High-Quality Samples with Less Compute},
  author={Sotiris Anagnostidis and Gregor Bachmann and Yeongmin Kim and Jonas Kohler and Markos Georgopoulos and Artsiom Sanakoyeu and Yuming Du and Albert Pumarola and Ali Thabet and Edgar Schönfeld},
  journal={arXiv preprint arXiv:2502.20126},
  year={2025}
}