arXiv: 2411.01016
MoE-I$^2$: Compressing Mixture of Experts Models through Inter-Expert Pruning and Intra-Expert Low-Rank Decomposition
1 November 2024
Cheng Yang
Yang Sui
Jinqi Xiao
Lingyi Huang
Yu Gong
Yuanlin Duan
Wenqi Jia
Miao Yin
Yu Cheng
Bo Yuan
Papers citing "MoE-I$^2$: Compressing Mixture of Experts Models through Inter-Expert Pruning and Intra-Expert Low-Rank Decomposition"
Faster MoE LLM Inference for Extremely Large Models
Haoqi Yang
Luohe Shi
Qiwei Li
Zuchao Li
Ping Wang
Bo Du
Mengjia Shen
Hai Zhao
06 May 2025