LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training
24 November 2024 · arXiv:2411.15708
Authors: Xiaoye Qu, Daize Dong, Xuyang Hu, Tong Zhu, Weigao Sun, Yu Cheng
Tags: MoE
Links: ArXiv · PDF · HTML
Papers citing "LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training" (5 of 5 papers shown)

A Comprehensive Survey of Mixture-of-Experts: Algorithms, Theory, and Applications
Siyuan Mu, Sen Lin · MoE · 10 Mar 2025

Linear-MoE: Linear Sequence Modeling Meets Mixture-of-Experts
Weigao Sun, Disen Lan, Tong Zhu, Xiaoye Qu, Yu Cheng · MoE · 07 Mar 2025

Liger: Linearizing Large Language Models to Gated Recurrent Structures
Disen Lan, Weigao Sun, Jiaxi Hu, Jusen Du, Yu Cheng · 03 Mar 2025

Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment
Chenghao Fan, Zhenyi Lu, Sichen Liu, Xiaoye Qu, Wei Wei, Chengfeng Gu, Yu Cheng · MoE · 24 Feb 2025

MoM: Linear Sequence Modeling with Mixture-of-Memories
Jusen Du, Weigao Sun, Disen Lan, Jiaxi Hu, Yu Cheng · KELM · 19 Feb 2025