LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training

24 November 2024
Xiaoye Qu
Daize Dong
Xuyang Hu
Tong Zhu
Weigao Sun
Yu Cheng
    MoE
arXiv 2411.15708 (abs) · PDF · HTML · GitHub (86★)

Papers citing "LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training"

16 / 16 papers shown
Sparse Mixture-of-Experts for Multi-Channel Imaging: Are All Channel Interactions Required?
Sukwon Yun
Heming Yao
Burkhard Hoeckendorf
David Richmond
Aviv Regev
Russell Littman
MoE
191
0
0
21 Nov 2025
MoE-DP: An MoE-Enhanced Diffusion Policy for Robust Long-Horizon Robotic Manipulation with Skill Decomposition and Failure Recovery
Baiye Cheng
Tianhai Liang
Suning Huang
Maanping Shao
Feihong Zhang
Botian Xu
Zhengrong Xue
Huazhe Xu
295
3
0
07 Nov 2025
MoE-Prism: Disentangling Monolithic Experts for Elastic MoE Services via Model-System Co-Designs
Xinfeng Xia
Jiacheng Liu
Xiaofeng Hou
Peng Tang
Mingxuan Zhang
Wenfeng Wang
Chao Li
MoE
188
0
0
22 Oct 2025
Native Hybrid Attention for Efficient Sequence Modeling
Jusen Du
Jiaxi Hu
Tao Zhang
Weigao Sun
Yu Cheng
216
5
0
08 Oct 2025
Not All Models Suit Expert Offloading: On Local Routing Consistency of Mixture-of-Expert Models
Jingcong Liang
Siyuan Wang
Miren Tian
Yitong Li
Duyu Tang
Zhongyu Wei
MoE
349
1
0
21 May 2025
UMoE: Unifying Attention and FFN with Shared Experts
Yuanhang Yang
Chaozheng Wang
Jing Li
MoE
305
1
0
12 May 2025
SEE: Continual Fine-tuning with Sequential Ensemble of Experts
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Zhilin Wang
Yafu Li
Xiaoye Qu
Yu Cheng
CLL, KELM
319
2
0
09 Apr 2025
A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond
Xiaoye Qu
Yafu Li
Zhaochen Su
Weigao Sun
Jianhao Yan
...
Chaochao Lu
Yue Zhang
Xian-Sheng Hua
Bowen Zhou
Yu Cheng
ReLM, OffRL, LRM
728
117
0
27 Mar 2025
A Comprehensive Survey of Mixture-of-Experts: Algorithms, Theory, and Applications
Siyuan Mu
Sen Lin
MoE
1.2K
64
0
10 Mar 2025
Linear-MoE: Linear Sequence Modeling Meets Mixture-of-Experts
Weigao Sun
Disen Lan
Tong Zhu
Xiaoye Qu
Yu Cheng
MoE
556
7
0
07 Mar 2025
Liger: Linearizing Large Language Models to Gated Recurrent Structures
Disen Lan
Weigao Sun
Jiaxi Hu
Jusen Du
Yu Cheng
462
13
0
03 Mar 2025
Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment
Chenghao Fan
Zhenyi Lu
Sichen Liu
Xiaoye Qu
Wei Wei
Yu Cheng
MoE
1.1K
12
0
24 Feb 2025
MoM: Linear Sequence Modeling with Mixture-of-Memories
Jusen Du
Weigao Sun
Disen Lan
Jiaxi Hu
Yu Cheng
KELM
609
21
0
19 Feb 2025
CLIP-MoE: Towards Building Mixture of Experts for CLIP with Diversified Multiplet Upcycling
Jihai Zhang
Xiaoye Qu
Tong Zhu
Yu Cheng
680
18
0
28 Sep 2024
Dynamic Data Mixing Maximizes Instruction Tuning for Mixture-of-Experts
Tong Zhu
Daize Dong
Xiaoye Qu
Jiacheng Ruan
Wenliang Chen
Yu Cheng
MoE
304
19
0
17 Jun 2024
Mixture of insighTful Experts (MoTE): The Synergy of Thought Chains and Expert Mixtures in Self-Alignment
Zhili Liu
Yunhao Gou
Kai Chen
Lanqing Hong
Lei Li
...
Yu Zhang
Zhenguo Li
Xin Jiang
Qiang Liu
James T. Kwok
MoE
706
13
0
01 May 2024