Dynamic Data Mixing Maximizes Instruction Tuning for Mixture-of-Experts (arXiv:2406.11256)
17 June 2024
Tong Zhu, Daize Dong, Xiaoye Qu, Jiacheng Ruan, Wenliang Chen, Yu Cheng
MoE

Papers citing "Dynamic Data Mixing Maximizes Instruction Tuning for Mixture-of-Experts" (4 of 4 papers shown)

Linear-MoE: Linear Sequence Modeling Meets Mixture-of-Experts
Weigao Sun, Disen Lan, Tong Zhu, Xiaoye Qu, Yu-Xi Cheng
MoE · 07 Mar 2025

Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment
Chenghao Fan, Zhenyi Lu, Sichen Liu, Xiaoye Qu, Wei Wei, Chengfeng Gu, Yu-Xi Cheng
MoE · 24 Feb 2025

On Giant's Shoulders: Effortless Weak to Strong by Dynamic Logits Fusion
Chenghao Fan, Zhenyi Lu, Wei Wei, Jie Tian, Xiaoye Qu, Dangyang Chen, Yu Cheng
MoMe · 17 Jun 2024

Training language models to follow instructions with human feedback
Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe
OSLM · ALM · 04 Mar 2022