CompeteSMoE -- Effective Training of Sparse Mixture of Experts via Competition
arXiv:2402.02526 · 4 February 2024
Quang-Cuong Pham, Giang Do, Huy Nguyen, TrungTin Nguyen, Chenghao Liu, Mina Sartipi, Binh T. Nguyen, Savitha Ramasamy, Xiaoli Li, Steven C. H. Hoi, Nhat Ho
Papers citing "CompeteSMoE -- Effective Training of Sparse Mixture of Experts via Competition" (9 of 9 papers shown):
Tight Clusters Make Specialized Experts. Stefan K. Nielsen, R. Teo, Laziz U. Abdullaev, Tan M. Nguyen. Tags: MoE. Citations: 2. 21 Feb 2025.
Quadratic Gating Functions in Mixture of Experts: A Statistical Insight. Pedram Akbarian, Huy Le Nguyen, Xing Han, Nhat Ho. Tags: MoE. Citations: 0. 15 Oct 2024.
Layerwise Recurrent Router for Mixture-of-Experts. Zihan Qiu, Zeyu Huang, Shuang Cheng, Yizhi Zhou, Zili Wang, Ivan Titov, Jie Fu. Tags: MoE. Citations: 2. 13 Aug 2024.
FuseMoE: Mixture-of-Experts Transformers for Fleximodal Fusion. Xing Han, Huy Nguyen, Carl Harris, Nhat Ho, S. Saria. Tags: MoE. Citations: 16. 05 Feb 2024.
Sparse Mixers: Combining MoE and Mixing to build a more efficient BERT. James Lee-Thorp, Joshua Ainslie. Tags: MoE. Citations: 11. 24 May 2022.
Mixture-of-Experts with Expert Choice Routing. Yan-Quan Zhou, Tao Lei, Han-Chu Liu, Nan Du, Yanping Huang, Vincent Zhao, Andrew M. Dai, Zhifeng Chen, Quoc V. Le, James Laudon. Tags: MoE. Citations: 327. 18 Feb 2022.
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation. Junnan Li, Dongxu Li, Caiming Xiong, S. Hoi. Tags: MLLM, BDL, VLM, CLIP. Citations: 4,124. 28 Jan 2022.
CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation. Yue Wang, Weishi Wang, Shafiq R. Joty, S. Hoi. Citations: 1,489. 02 Sep 2021.
Efficient Intent Detection with Dual Sentence Encoders. I. Casanueva, Tadas Temčinas, D. Gerz, Matthew Henderson, Ivan Vulić. Tags: VLM. Citations: 451. 10 Mar 2020.