On the Representation Collapse of Sparse Mixture of Experts
Zewen Chi, Li Dong, Shaohan Huang, Damai Dai, Shuming Ma, Barun Patra, Saksham Singhal, Payal Bajaj, Xia Song, Xian-Ling Mao, Heyan Huang, Furu Wei
Neural Information Processing Systems (NeurIPS), 2022 · 20 April 2022
arXiv: 2204.09179 (v3, latest)
Tags: MoMe, MoE
Links: ArXiv (abs) · PDF · HTML · HuggingFace
Papers citing "On the Representation Collapse of Sparse Mixture of Experts" (50 of 88 papers shown)
Mosaic Pruning: A Hierarchical Framework for Generalizable Pruning of Mixture-of-Experts Models. Wentao Hu, Mingkuan Zhao, Shuangyong Song, Xiaoyan Zhu, Xin Lai, Jiayin Wang. 25 Nov 2025.
Selective Sinkhorn Routing for Improved Sparse Mixture of Experts. Duc Nguyen, Huu Binh Ta, Nhuan Le Duc, T. Nguyen, T. Tran. [MoE] 12 Nov 2025.
Input Domain Aware MoE: Decoupling Routing Decisions from Task Optimization in Mixture of Experts. Yongxiang Hua, H. Cao, Zhou Tao, Bocheng Li, Zihao Wu, Chaohu Liu, Linli Xu. [MoE] 18 Oct 2025.
Modeling Expert Interactions in Sparse Mixture of Experts via Graph Structures. Minh Khoi Nguyen Nhat, R. Teo, Laziz U. Abdullaev, Maurice Mok, Viet-Hoang Tran, T. Nguyen. [MoE] 18 Oct 2025.
MC#: Mixture Compressor for Mixture-of-Experts Large Models. Wei Huang, Yue Liao, Yukang Chen, Jianhui Liu, Haoru Tan, Si Liu, Shiming Zhang, Shuicheng Yan, Xiaojuan Qi. [MoE, MQ] 13 Oct 2025.
DiEP: Adaptive Mixture-of-Experts Compression through Differentiable Expert Pruning. Sikai Bai, Haoxi Li, Jie Zhang, Zicong Hong, Song Guo. [MoE] 19 Sep 2025.
Dropping Experts, Recombining Neurons: Retraining-Free Pruning for Sparse Mixture-of-Experts LLMs. Yixiao Zhou, Ziyu Zhao, Dongzhou Cheng, Zhiliang Wu, Jie Gui, Yi-feng Yang, Fei Wu, Yu Cheng, Hehe Fan. [MoMe, MoE] 12 Sep 2025.
Router Upcycling: Leveraging Mixture-of-Routers in Mixture-of-Experts Upcycling. Junfeng Ran, Guangxiang Zhao, Yuhan Wu, Dawei Zhu, Longyun Wu, Yikai Zhao, Tong Yang, Lin Sun, Xiangzheng Zhang, Sujian Li. [MoE, MoMe] 31 Aug 2025.
HierMoE: Accelerating MoE Training with Hierarchical Token Deduplication and Expert Swap. Wenxiang Lin, Xinglin Pan, Lin Zhang, Shaohuai Shi, Xuan Wang, Xiaowen Chu. [MoE] 13 Aug 2025.
Unveiling Super Experts in Mixture-of-Experts Large Language Models. Zunhai Su, Qingyuan Li, Hao Zhang, Weihao Ye, Qibo Xue, YuLei Qian, Yuchen Xie, Ngai Wong, Kehong Yuan. [MoE] 31 Jul 2025.
MoA: Heterogeneous Mixture of Adapters for Parameter-Efficient Fine-Tuning of Large Language Models. Jie Cao, Tianwei Lin, Hongyang He, Rolan Yan, Wenqiao Zhang, Juncheng Billy Li, D. Zhang, Siliang Tang, Yueting Zhuang. [MoE] 06 Jun 2025.
Towards Interpretability Without Sacrifice: Faithful Dense Layer Decomposition with Mixture of Decoders. James Oldfield, Shawn Im, Yixuan Li, M. Nicolaou, Ioannis Patras, Grigorios G. Chrysos. [MoE] 27 May 2025.
FLAME-MoE: A Transparent End-to-End Research Platform for Mixture-of-Experts Language Models. Hao Kang, Zichun Yu, Chenyan Xiong. [MoE] 26 May 2025.
EvidenceMoE: A Physics-Guided Mixture-of-Experts with Evidential Critics for Advancing Fluorescence Light Detection and Ranging in Scattering Media. Ismail Erbas, Ferhat Demirkiran, Karthik Swaminathan, Naigang Wang, Navid Ibtehaj Nizam, Stefan T. Radev, Kaoutar El Maghraoui, Xavier Intes, Vikas Pandey. [MoE] 23 May 2025.
Generalizable Multispectral Land Cover Classification via Frequency-Aware Mixture of Low-Rank Token Experts. Xi Chen, Shen Yan, Juelin Zhu, Chen Chen, Yu Liu, Maojun Zhang. 20 May 2025.
Improving Routing in Sparse Mixture of Experts with Graph of Tokens. Tam Minh Nguyen, Ngoc N. Tran, Khai Nguyen, Richard G. Baraniuk. [MoE] 01 May 2025.
Mixture-of-Experts for Distributed Edge Computing with Channel-Aware Gating Function. Qiuchen Song, Shusen Jing, Shuai Zhang, Songyang Zhang, Chuan Huang. [MoE] 01 Apr 2025.
A Comprehensive Survey of Mixture-of-Experts: Algorithms, Theory, and Applications. Siyuan Mu, Sen Lin. [MoE] 10 Mar 2025.
CAMEx: Curvature-aware Merging of Experts. Dung V. Nguyen, Minh H. Nguyen, Luc Q. Nguyen, R. Teo, T. Nguyen, Linh Duy Tran. International Conference on Learning Representations (ICLR), 2025. [MoMe] 26 Feb 2025.
Tight Clusters Make Specialized Experts. Stefan K. Nielsen, R. Teo, Laziz U. Abdullaev, Tan M. Nguyen. International Conference on Learning Representations (ICLR), 2025. [MoE] 21 Feb 2025.
Theory on Mixture-of-Experts in Continual Learning. Hongbo Li, Sen-Fon Lin, Lingjie Duan, Yingbin Liang, Ness B. Shroff. International Conference on Learning Representations (ICLR), 2024. [MoE, MoMe, CLL] 20 Feb 2025.
Importance Sampling via Score-based Generative Models. Heasung Kim, Taekyun Lee, Hyeji Kim, Gustavo de Veciana. [MedIm, DiffM] 07 Feb 2025.
Efficient Diffusion Transformer Policies with Mixture of Expert Denoisers for Multitask Learning. Moritz Reuss, Jyothish Pari, Pulkit Agrawal, Rudolf Lioutikov. International Conference on Learning Representations (ICLR), 2024. [DiffM, MoE] 17 Dec 2024.
MoSLD: An Extremely Parameter-Efficient Mixture-of-Shared LoRAs for Multi-Task Learning. Lulu Zhao, Weihao Zeng, Xiaofeng Shi, Hua Zhou. International Conference on Computational Linguistics (COLING), 2024. [MoMe, MoE] 12 Dec 2024.
MH-MoE: Multi-Head Mixture-of-Experts. Shaohan Huang, Xun Wu, Shuming Ma, Furu Wei. [MoE] 25 Nov 2024.
Facet-Aware Multi-Head Mixture-of-Experts Model for Sequential Recommendation. Mingrui Liu, Sixiao Zhang, Cheng Long. Web Search and Data Mining (WSDM), 2024. 03 Nov 2024.
MoE-I²: Compressing Mixture of Experts Models through Inter-Expert Pruning and Intra-Expert Low-Rank Decomposition. Cheng Yang, Yang Sui, Jinqi Xiao, Lingyi Huang, Yu Gong, Yuanlin Duan, Wenqi Jia, Miao Yin, Yu Cheng, Bo Yuan. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024. [MoE] 01 Nov 2024.
LIBMoE: A Library for comprehensive benchmarking Mixture of Experts in Large Language Models. Nam V. Nguyen, Thong T. Doan, Luong Tran, Van Nguyen, Quang Pham. [MoE] 01 Nov 2024.
Unveiling and Consulting Core Experts in Retrieval-Augmented MoE-based LLMs. Xin Zhou, Ping Nie, Yiwen Guo, Haojie Wei, Zhanqiu Zhang, Pasquale Minervini, Ruotian Ma, Tao Gui, Xuanjing Huang. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024. [MoE] 20 Oct 2024.
MomentumSMoE: Integrating Momentum into Sparse Mixture of Experts. R. Teo, Tan M. Nguyen. Neural Information Processing Systems (NeurIPS), 2024. [MoE] 18 Oct 2024.
MoE-Pruner: Pruning Mixture-of-Experts Large Language Model using the Hints from Its Router. Yanyue Xie, Zhi Zhang, Ding Zhou, Cong Xie, Ziang Song, Xin Liu, Yanzhi Wang, Xue Lin, An Xu. [LLMAG] 15 Oct 2024.
Upcycling Large Language Models into Mixture of Experts. Ethan He, Syeda Nahida Akter, R. Prenger, V. Korthikanti, Zijie Yan, Tong Liu, Shiqing Fan, Ashwath Aithal, Mohammad Shoeybi, Bryan Catanzaro. [MoE] 10 Oct 2024.
Mixture Compressor for Mixture-of-Experts LLMs Gains More. Wei Huang, Yue Liao, Jianhui Liu, Ruifei He, Haoru Tan, Shiming Zhang, Hongsheng Li, Si Liu, Xiaojuan Qi. International Conference on Learning Representations (ICLR), 2024. [MoE] 08 Oct 2024.
Exploring the Benefit of Activation Sparsity in Pre-training. Zhengyan Zhang, Chaojun Xiao, Qiujieli Qin, Yankai Lin, Zhiyuan Zeng, Xu Han, Zhiyuan Liu, Ruobing Xie, Maosong Sun, Jie Zhou. International Conference on Machine Learning (ICML), 2024. [MoE] 04 Oct 2024.
DA-MoE: Towards Dynamic Expert Allocation for Mixture-of-Experts Models. Maryam Akhavan Aghdam, Hongpeng Jin, Yanzhao Wu. [MoE] 10 Sep 2024.
HMoE: Heterogeneous Mixture of Experts for Language Modeling. An Wang, Xingwu Sun, Ruobing Xie, Shuaipeng Li, Jiaqi Zhu, ..., J. N. Han, Zhanhui Kang, Di Wang, Naoaki Okazaki, Cheng-zhong Xu. [MoE] 20 Aug 2024.
Layerwise Recurrent Router for Mixture-of-Experts. Zihan Qiu, Zeyu Huang, Shuang Cheng, Yizhi Zhou, Zili Wang, Ivan Titov, Jie Fu. International Conference on Learning Representations (ICLR), 2024. [MoE] 13 Aug 2024.
Recent Advances in Generative AI and Large Language Models: Current Status, Challenges, and Perspectives. D. Hagos, Rick Battle, Danda B. Rawat. [LM&MA, OffRL] 20 Jul 2024.
Lazarus: Resilient and Elastic Training of Mixture-of-Experts Models. Yongji Wu, Wenjie Qu, Xueshen Liu, Tianyang Tao, Wei Bai, ..., Jiaheng Zhang, Z. Morley Mao, Matthew Lentz, Danyang Zhuo, Ion Stoica. 05 Jul 2024.
Sparse Diffusion Policy: A Sparse, Reusable, and Flexible Policy for Robot Learning. Yixiao Wang, Yifei Zhang, Mingxiao Huo, Ran Tian, Xiang Zhang, ..., Chenfeng Xu, Pengliang Ji, Wei Zhan, Mingyu Ding, Masayoshi Tomizuka. [MoE] 01 Jul 2024.
Efficient Expert Pruning for Sparse Mixture-of-Experts Language Models: Enhancing Performance and Reducing Inference Costs. Enshu Liu, Junyi Zhu, Zinan Lin, Xuefei Ning, Matthew B. Blaschko, Shengen Yan, Guohao Dai, Huazhong Yang, Yu Wang. [MoE] 01 Jul 2024.
A Closer Look into Mixture-of-Experts in Large Language Models. Ka Man Lo, Zeyu Huang, Zihan Qiu, Zili Wang, Jie Fu. [MoE] 26 Jun 2024.
A Survey on Mixture of Experts in Large Language Models. Weilin Cai, Juyong Jiang, Fan Wang, Jing Tang, Sunghun Kim, Jiayi Huang. [MoE] 26 Jun 2024.
SimSMoE: Solving Representational Collapse via Similarity Measure. Giang Do, Hung Le, T. Tran. [MoE] 22 Jun 2024.
GW-MoE: Resolving Uncertainty in MoE Router with Global Workspace Theory. Haoze Wu, Zihan Qiu, Zili Wang, Hang Zhao, Jie Fu. [MoE] 18 Jun 2024.
Graph Knowledge Distillation to Mixture of Experts. Pavel Rumiantsev, Mark Coates. 17 Jun 2024.
Flextron: Many-in-One Flexible Large Language Model. Ruisi Cai, Saurav Muralidharan, Greg Heinrich, Hongxu Yin, Zhangyang Wang, Jan Kautz, Pavlo Molchanov. 11 Jun 2024.
Adapter-X: A Novel General Parameter-Efficient Fine-Tuning Framework for Vision. Minglei Li, Peng Ye, Yongqi Huang, Lin Zhang, Tao Chen, Tong He, Jiayuan Fan, Wanli Ouyang. [MoE] 05 Jun 2024.
Training-efficient density quantum machine learning. Brian Coyle, El Amine Cherrat, Nishant Jain, Natansh Mathur, Snehal Raj, Skander Kazdaghli, Iordanis Kerenidis. 30 May 2024.
Ensembling Diffusion Models via Adaptive Feature Aggregation. Cong Wang, Kuan Tian, Yonghang Guan, Jun Zhang, Zhiwei Jiang, Fei Shen, Xiao Han. 27 May 2024.