
Title |
|---|
CartesianMoE: Boosting Knowledge Sharing among Experts via Cartesian Product Routing in Mixture-of-Experts. North American Chapter of the Association for Computational Linguistics (NAACL), 2024 |
Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models. International Conference on Learning Representations (ICLR), 2024 |