Expert Race: A Flexible Routing Strategy for Scaling Diffusion Transformer with Mixture of Experts

20 March 2025
Yike Yuan
Ziyu Wang
Zihao Huang
Defa Zhu
Xun Zhou
Jingyi Yu
Qiyang Min
    DiffM
    MoE
ArXiv · PDF · HTML
Abstract

Diffusion models have emerged as a mainstream framework in visual generation. Building upon this success, the integration of Mixture of Experts (MoE) methods has shown promise in enhancing model scalability and performance. In this paper, we introduce Race-DiT, a novel MoE model for diffusion transformers with a flexible routing strategy, Expert Race. By allowing tokens and experts to compete together and select the top candidates, the model learns to dynamically assign experts to critical tokens. Additionally, we propose per-layer regularization to address challenges in shallow layer learning, and router similarity loss to prevent mode collapse, ensuring better expert utilization. Extensive experiments on ImageNet validate the effectiveness of our approach, showcasing significant performance gains along with promising scaling properties.
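
The flexibility of Expert Race lies in where the top-k selection happens: a conventional MoE router takes a per-token top-k over expert scores, so every token activates the same number of experts, whereas letting tokens and experts "compete together" points to a single top-k taken over the whole pool of token-expert scores, so more of the compute budget can flow to critical tokens. The sketch below illustrates that global-competition reading in PyTorch; the function name, tensor shapes, and budget definition are illustrative assumptions, not the authors' implementation.

import torch

def expert_race_routing(router_logits: torch.Tensor, k: int) -> torch.Tensor:
    """Select token-expert assignments by one global top-k over all pairs.

    router_logits: [num_tokens, num_experts] affinity scores from the router.
    k: average number of experts activated per token (illustrative budget).
    Returns a boolean dispatch mask of shape [num_tokens, num_experts].
    """
    num_tokens, num_experts = router_logits.shape
    budget = num_tokens * k  # total number of (token, expert) slots to fill

    # Flatten so every token-expert pair "races" against every other pair.
    flat_scores = router_logits.reshape(-1)
    top_idx = flat_scores.topk(budget).indices

    # Winning pairs get routed; tokens the router deems critical may receive
    # several experts, while easy tokens may receive none.
    mask = torch.zeros(num_tokens * num_experts, dtype=torch.bool,
                       device=router_logits.device)
    mask[top_idx] = True
    return mask.reshape(num_tokens, num_experts)

# Example: 8 tokens, 4 experts, an average of 2 experts per token.
logits = torch.randn(8, 4)
dispatch = expert_race_routing(logits, k=2)
print(dispatch.sum(dim=1))  # per-token expert counts; they need not be equal

With a per-token top-k, every row of the dispatch mask would sum to exactly k; here only the average is k, which is what lets the router spend extra experts on hard tokens and fewer on easy ones.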

View on arXiv
@article{yuan2025_2503.16057,
  title={Expert Race: A Flexible Routing Strategy for Scaling Diffusion Transformer with Mixture of Experts},
  author={Yike Yuan and Ziyu Wang and Zihao Huang and Defa Zhu and Xun Zhou and Jingyi Yu and Qiyang Min},
  journal={arXiv preprint arXiv:2503.16057},
  year={2025}
}