arXiv: 2309.05444
Cited By
Pushing Mixture of Experts to the Limit: Extremely Parameter Efficient MoE for Instruction Tuning
11 September 2023
Ted Zadouri
A. Ustun
Arash Ahmadian
Beyza Ermiş
Acyr F. Locatelli
Sara Hooker
MoE
Papers citing
"Pushing Mixture of Experts to the Limit: Extremely Parameter Efficient MoE for Instruction Tuning"
Showing 50 of 80 citing papers
A Sensitivity-Driven Expert Allocation Method in LoRA-MoE for Efficient Fine-Tuning
Junzhou Xu
Boyu Diao
MoE
37
0
0
06 May 2025
PointLoRA: Low-Rank Adaptation with Token Selection for Point Cloud Learning
Song Wang
Xiaolu Liu
Lingdong Kong
Jianyun Xu
Chunyong Hu
Gongfan Fang
Wentong Li
Jianke Zhu
Xinchao Wang
22
0
0
22 Apr 2025
LEO-MINI: An Efficient Multimodal Large Language Model using Conditional Token Reduction and Mixture of Multi-Modal Experts
Yimu Wang
Mozhgan Nasr Azadani
Sean Sedwards
Krzysztof Czarnecki
MLLM
MoE
52
0
0
07 Apr 2025
Investigating and Mitigating Stereotype-aware Unfairness in LLM-based Recommendations
Zihuai Zhao
Wenqi Fan
Yao Wu
Qing Li
75
1
0
05 Apr 2025
MetaLoRA: Tensor-Enhanced Adaptive Low-Rank Fine-tuning
Maolin Wang
Xiangyu Zhao
AI4CE
41
0
0
01 Apr 2025
Mixture of Routers
Jia-Chen Zhang
Yu-Jie Xiong
Xi-He Qiu
Chun-Ming Xia
Fei Dai
MoE
59
0
0
30 Mar 2025
Efficient Model Development through Fine-tuning Transfer
Pin-Jie Lin
Rishab Balasubramanian
Fengyuan Liu
Nikhil Kandpal
Tu Vu
59
0
0
25 Mar 2025
Merge then Realign: Simple and Effective Modality-Incremental Continual Learning for Multimodal LLMs
Dingkun Zhang
Shuhan Qi
Xinyu Xiao
Kehai Chen
Xuan Wang
CLL
MoMe
59
0
0
08 Mar 2025
Multi-Level Collaboration in Model Merging
Qi Li
Runpeng Yu
Xinchao Wang
MoMe
FedML
91
0
0
03 Mar 2025
Sample Selection via Contrastive Fragmentation for Noisy Label Regression
C. Kim
Sangwoo Moon
Jihwan Moon
Dongyeon Woo
Gunhee Kim
NoLa
52
0
0
25 Feb 2025
Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment
Chenghao Fan
Zhenyi Lu
Sichen Liu
Xiaoye Qu
Wei Wei
Chengfeng Gu
Yu-Xi Cheng
MoE
91
0
0
24 Feb 2025
Mixture of insighTful Experts (MoTE): The Synergy of Thought Chains and Expert Mixtures in Self-Alignment
Zhili Liu
Yunhao Gou
Kai Chen
Lanqing Hong
Jiahui Gao
...
Yu Zhang
Zhenguo Li
Xin Jiang
Q. Liu
James T. Kwok
MoE
96
9
0
20 Feb 2025
A Stronger Mixture of Low-Rank Experts for Fine-Tuning Foundation Models
Mengyang Sun
Yihao Wang
Tao Feng
Dan Zhang
Yifan Zhu
J. Tang
MoE
29
0
0
20 Feb 2025
Theory on Mixture-of-Experts in Continual Learning
Hongbo Li
Sen-Fon Lin
Lingjie Duan
Yingbin Liang
Ness B. Shroff
MoE
MoMe
CLL
151
14
0
20 Feb 2025
Ensembles of Low-Rank Expert Adapters
Yinghao Li
Vianne Gao
Chao Zhang
MohamadAli Torkamani
60
0
0
31 Jan 2025
Each Rank Could be an Expert: Single-Ranked Mixture of Experts LoRA for Multi-Task Learning
Ziyu Zhao
Yixiao Zhou
Didi Zhu
Tao Shen
X. Wang
Jing Su
Kun Kuang
Zhongyu Wei
Fei Wu
Yu Cheng
MoE
26
1
0
28 Jan 2025
GraphLoRA: Empowering LLMs Fine-Tuning via Graph Collaboration of MoE
Ting Bai
Yue Yu
Le Huang
Zenan Xu
Zhe Zhao
Chuan Shi
MoE
123
0
0
18 Dec 2024
Investigating Mixture of Experts in Dense Retrieval
Effrosyni Sokli
Pranav Kasela
Georgios Peikos
G. Pasi
MoE
67
1
0
16 Dec 2024
MoSLD: An Extremely Parameter-Efficient Mixture-of-Shared LoRAs for Multi-Task Learning
Lulu Zhao
Weihao Zeng
Xiaofeng Shi
Hua Zhou
MoMe
MoE
67
0
0
12 Dec 2024
MoDULA: Mixture of Domain-Specific and Universal LoRA for Multi-Task Learning
Yufei Ma
Zihan Liang
Huangyu Dai
B. Chen
D. Gao
...
Linbo Jin
Wen Jiang
Guannan Zhang
Xiaoyan Cai
Libin Yang
MoE
MoMe
94
1
0
10 Dec 2024
PMoL: Parameter Efficient MoE for Preference Mixing of LLM Alignment
Dongxu Liu
Bing Xu
Yinzhuo Chen
Bufan Xu
Wenpeng Lu
Muyun Yang
T. Zhao
MoE
39
1
0
02 Nov 2024
MALoRA: Mixture of Asymmetric Low-Rank Adaptation for Enhanced Multi-Task Learning
Xujia Wang
Haiyan Zhao
Shuo Wang
Hanqing Wang
Zhiyuan Liu
MoMe
MoE
30
0
0
30 Oct 2024
Parameter-Efficient Fine-Tuning in Large Models: A Survey of Methodologies
L. Wang
Sheng Chen
Linnan Jiang
Shu Pan
Runze Cai
Sen Yang
Fei Yang
44
3
0
24 Oct 2024
Glider: Global and Local Instruction-Driven Expert Router
Pingzhi Li
Prateek Yadav
Jaehong Yoon
Jie Peng
Yi-Lin Sung
Mohit Bansal
Tianlong Chen
MoMe
MoE
25
1
0
09 Oct 2024
MoS: Unleashing Parameter Efficiency of Low-Rank Adaptation with Mixture of Shards
Sheng Wang
Liheng Chen
Pengan Chen
Jingwei Dong
Boyang Xue
Jiyue Jiang
Lingpeng Kong
Chuan Wu
MoE
29
7
0
01 Oct 2024
Uni-Med: A Unified Medical Generalist Foundation Model For Multi-Task Learning Via Connector-MoE
Xun Zhu
Ying Hu
Fanbin Mo
Miao Li
Ji Wu
44
8
0
26 Sep 2024
On-Device Collaborative Language Modeling via a Mixture of Generalists and Specialists
Dongyang Fan
Bettina Messmer
N. Doikov
Martin Jaggi
MoMe
MoE
42
1
0
20 Sep 2024
Nexus: Specialization meets Adaptability for Efficiently Training Mixture of Experts
Nikolas Gritsch
Qizhen Zhang
Acyr F. Locatelli
Sara Hooker
A. Ustun
MoE
50
1
0
28 Aug 2024
Parameter-Efficient Quantized Mixture-of-Experts Meets Vision-Language Instruction Tuning for Semiconductor Electron Micrograph Analysis
Sakhinana Sagar Srinivas
Chidaksh Ravuru
Geethan Sannidhi
Venkataramana Runkana
35
0
0
27 Aug 2024
Advancing Enterprise Spatio-Temporal Forecasting Applications: Data Mining Meets Instruction Tuning of Language Models For Multi-modal Time Series Analysis in Low-Resource Settings
Sagar Srinivas Sakhinana
Geethan Sannidhi
Chidaksh Ravuru
Venkataramana Runkana
AI4TS
20
0
0
24 Aug 2024
MoE-LPR: Multilingual Extension of Large Language Models through Mixture-of-Experts with Language Priors Routing
Hao Zhou
Zhijun Wang
Shujian Huang
Xin Huang
Xue Han
Junlan Feng
Chao Deng
Weihua Luo
Jiajun Chen
CLL
MoE
49
5
0
21 Aug 2024
A Survey on Model MoErging: Recycling and Routing Among Specialized Experts for Collaborative Learning
Prateek Yadav
Colin Raffel
Mohammed Muqeeth
Lucas Page-Caccia
Haokun Liu
Tianlong Chen
Mohit Bansal
Leshem Choshen
Alessandro Sordoni
MoMe
38
21
0
13 Aug 2024
Advancing Multimodal Large Language Models with Quantization-Aware Scale Learning for Efficient Adaptation
Jingjing Xie
Yuxin Zhang
Mingbao Lin
Liujuan Cao
Rongrong Ji
MQ
25
3
0
07 Aug 2024
MoDE: Effective Multi-task Parameter Efficient Fine-Tuning with a Mixture of Dyadic Experts
Lin Ning
Harsh Lara
Meiqi Guo
Abhinav Rastogi
MoMe
MoE
24
1
0
02 Aug 2024
Low-Rank Interconnected Adaptation Across Layers
Yibo Zhong
Yao Zhou
OffRL
MoE
38
1
0
13 Jul 2024
Foundation Model Engineering: Engineering Foundation Models Just as Engineering Software
Dezhi Ran
Mengzhou Wu
Wei Yang
Tao Xie
AI4CE
27
1
0
11 Jul 2024
On the Limitations of Compute Thresholds as a Governance Strategy
Sara Hooker
45
14
0
08 Jul 2024
Mixture of A Million Experts
Xu Owen He
MoE
31
25
0
04 Jul 2024
Lateralization LoRA: Interleaved Instruction Tuning with Modality-Specialized Adaptations
Zhiyang Xu
Minqian Liu
Ying Shen
Joy Rimchala
Jiaxin Zhang
Qifan Wang
Yu Cheng
Lifu Huang
VLM
37
2
0
04 Jul 2024
LEMoE: Advanced Mixture of Experts Adaptor for Lifelong Model Editing of Large Language Models
Renzhi Wang
Piji Li
KELM
CLL
37
7
0
28 Jun 2024
Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model
Longrong Yang
Dong Shen
Chaoxiang Cai
Fan Yang
Size Li
Di Zhang
Xi Li
MoE
41
2
0
28 Jun 2024
Retrieval-Augmented Mixture of LoRA Experts for Uploadable Machine Learning
Ziyu Zhao
Leilei Gan
Guoyin Wang
Yuwei Hu
Tao Shen
Hongxia Yang
Kun Kuang
Fei Wu
MoE
MoMe
32
11
0
24 Jun 2024
AdaMoE: Token-Adaptive Routing with Null Experts for Mixture-of-Experts Language Models
Zihao Zeng
Yibo Miao
Hongcheng Gao
Hao Zhang
Zhijie Deng
MoE
31
7
0
19 Jun 2024
MoE-RBench: Towards Building Reliable Language Models with Sparse Mixture-of-Experts
Guanjie Chen
Xinyu Zhao
Tianlong Chen
Yu Cheng
MoE
62
5
0
17 Jun 2024
Crayon: Customized On-Device LLM via Instant Adapter Blending and Edge-Server Hybrid Inference
Jihwan Bang
Juntae Lee
Kyuhong Shim
Seunghan Yang
Simyung Chang
29
5
0
11 Jun 2024
MEMoE: Enhancing Model Editing with Mixture of Experts Adaptors
Renzhi Wang
Piji Li
KELM
32
3
0
29 May 2024
Towards Modular LLMs by Building and Reusing a Library of LoRAs
O. Ostapenko
Zhan Su
E. Ponti
Laurent Charlin
Nicolas Le Roux
Matheus Pereira
Lucas Page-Caccia
Alessandro Sordoni
MoMe
32
30
0
18 May 2024
Simple Drop-in LoRA Conditioning on Attention Layers Will Improve Your Diffusion Model
Joo Young Choi
Jaesung R. Park
Inkyu Park
Jaewoong Cho
Albert No
Ernest K. Ryu
AI4CE
35
4
0
07 May 2024
MiniGPT-3D: Efficiently Aligning 3D Point Clouds with Large Language Models using 2D Priors
Yuan Tang
Xu Han
Xianzhi Li
Qiao Yu
Yixue Hao
Long Hu
Min Chen
29
14
0
02 May 2024
AdaMoLE: Fine-Tuning Large Language Models with Adaptive Mixture of Low-Rank Adaptation Experts
Zefang Liu
Jiahua Luo
MoE
KELM
33
11
0
01 May 2024