Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2406.13233
Cited By
AdaMoE: Token-Adaptive Routing with Null Experts for Mixture-of-Experts Language Models
19 June 2024
Zihao Zeng
Yibo Miao
Hongcheng Gao
Hao Zhang
Zhijie Deng
MoE
Re-assign community
ArXiv
PDF
HTML
Papers citing
"AdaMoE: Token-Adaptive Routing with Null Experts for Mixture-of-Experts Language Models"
6 / 6 papers shown
Title
Rank Also Matters: Hierarchical Configuration for Mixture of Adapter Experts in LLM Fine-Tuning
Peizhuang Cong
Wenpu Liu
Wenhan Yu
Haochen Zhao
Tong Yang
ALM
MoE
74
0
0
06 Feb 2025
DLP-LoRA: Efficient Task-Specific LoRA Fusion with a Dynamic, Lightweight Plugin for Large Language Models
Yuxuan Zhang
Ruizhe Li
MoMe
53
0
0
02 Oct 2024
Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering
Pan Lu
Swaroop Mishra
Tony Xia
Liang Qiu
Kai-Wei Chang
Song-Chun Zhu
Oyvind Tafjord
Peter Clark
A. Kalyan
ELM
ReLM
LRM
209
1,105
0
20 Sep 2022
Mixture-of-Experts with Expert Choice Routing
Yan-Quan Zhou
Tao Lei
Han-Chu Liu
Nan Du
Yanping Huang
Vincent Zhao
Andrew M. Dai
Zhifeng Chen
Quoc V. Le
James Laudon
MoE
149
326
0
18 Feb 2022
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
280
3,843
0
18 Apr 2021
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,943
0
20 Apr 2018
1