Pushing Mixture of Experts to the Limit: Extremely Parameter Efficient MoE for Instruction Tuning

11 September 2023
Ted Zadouri, A. Ustun, Arash Ahmadian, Beyza Ermiş, Acyr F. Locatelli, Sara Hooker
MoE
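For context on the method the citing works below build on: the paper keeps the dense pretrained model frozen and mixes a small set of lightweight experts (LoRA adapters or (IA)^3 vectors) through a soft, token-level router, so that only the experts and the router are updated during instruction tuning. The following is a minimal PyTorch sketch of such a mixture-of-LoRA-experts layer; the class name MoLoRALinear, the expert count, rank, scaling, and initialization are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a soft-routed mixture of LoRA experts around a frozen linear
# projection, in the spirit of the paper. Names and hyperparameters are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoLoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, num_experts: int = 4, rank: int = 4, alpha: float = 8.0):
        super().__init__()
        self.base = base                              # frozen pretrained projection
        for p in self.base.parameters():
            p.requires_grad = False
        d_in, d_out = base.in_features, base.out_features
        self.scaling = alpha / rank
        # One low-rank (A, B) pair per expert; only these and the router are trained.
        self.A = nn.Parameter(torch.randn(num_experts, d_in, rank) * 0.01)
        self.B = nn.Parameter(torch.zeros(num_experts, rank, d_out))  # zero-init: no change at start
        self.router = nn.Linear(d_in, num_experts)    # token-level soft router

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        gates = F.softmax(self.router(x), dim=-1)                     # (..., num_experts)
        expert_out = torch.einsum("...i,eir,ero->...eo", x, self.A, self.B)
        mixed = torch.einsum("...e,...eo->...o", gates, expert_out)   # gate-weighted sum of experts
        return self.base(x) + self.scaling * mixed

# Example: wrap a projection and run a (batch, seq, hidden) tensor through it.
layer = MoLoRALinear(nn.Linear(768, 768), num_experts=4, rank=4)
out = layer(torch.randn(2, 16, 768))
```

In this sketch only the expert matrices and the router carry gradients, which is what makes the adaptation extremely parameter-efficient relative to full fine-tuning.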

Papers citing "Pushing Mixture of Experts to the Limit: Extremely Parameter Efficient MoE for Instruction Tuning"

50 of 80 citing papers shown.
A Sensitivity-Driven Expert Allocation Method in LoRA-MoE for Efficient Fine-Tuning
Junzhou Xu, Boyu Diao · MoE · 06 May 2025

PointLoRA: Low-Rank Adaptation with Token Selection for Point Cloud Learning
Song Wang, Xiaolu Liu, Lingdong Kong, Jianyun Xu, Chunyong Hu, Gongfan Fang, Wentong Li, Jianke Zhu, Xinchao Wang · 22 Apr 2025

LEO-MINI: An Efficient Multimodal Large Language Model using Conditional Token Reduction and Mixture of Multi-Modal Experts
Yimu Wang, Mozhgan Nasr Azadani, Sean Sedwards, Krzysztof Czarnecki · MLLM, MoE · 07 Apr 2025

Investigating and Mitigating Stereotype-aware Unfairness in LLM-based Recommendations
Zihuai Zhao, Wenqi Fan, Yao Wu, Qing Li · 05 Apr 2025

MetaLoRA: Tensor-Enhanced Adaptive Low-Rank Fine-tuning
Maolin Wang, Xiangyu Zhao · AI4CE · 01 Apr 2025

Mixture of Routers
Jia-Chen Zhang, Yu-Jie Xiong, Xi-He Qiu, Chun-Ming Xia, Fei Dai · MoE · 30 Mar 2025

Efficient Model Development through Fine-tuning Transfer
Pin-Jie Lin, Rishab Balasubramanian, Fengyuan Liu, Nikhil Kandpal, Tu Vu · 25 Mar 2025

Merge then Realign: Simple and Effective Modality-Incremental Continual Learning for Multimodal LLMs
Dingkun Zhang, Shuhan Qi, Xinyu Xiao, Kehai Chen, Xuan Wang · CLL, MoMe · 08 Mar 2025

Multi-Level Collaboration in Model Merging
Qi Li, Runpeng Yu, Xinchao Wang · MoMe, FedML · 03 Mar 2025

Sample Selection via Contrastive Fragmentation for Noisy Label Regression
C. Kim, Sangwoo Moon, Jihwan Moon, Dongyeon Woo, Gunhee Kim · NoLa · 25 Feb 2025

Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment
Chenghao Fan, Zhenyi Lu, Sichen Liu, Xiaoye Qu, Wei Wei, Chengfeng Gu, Yu-Xi Cheng · MoE · 24 Feb 2025

Mixture of insighTful Experts (MoTE): The Synergy of Thought Chains and Expert Mixtures in Self-Alignment
Zhili Liu, Yunhao Gou, Kai Chen, Lanqing Hong, Jiahui Gao, ..., Yu Zhang, Zhenguo Li, Xin Jiang, Q. Liu, James T. Kwok · MoE · 20 Feb 2025

A Stronger Mixture of Low-Rank Experts for Fine-Tuning Foundation Models
Mengyang Sun, Yihao Wang, Tao Feng, Dan Zhang, Yifan Zhu, J. Tang · MoE · 20 Feb 2025

Theory on Mixture-of-Experts in Continual Learning
Hongbo Li, Sen-Fon Lin, Lingjie Duan, Yingbin Liang, Ness B. Shroff · MoE, MoMe, CLL · 20 Feb 2025

Ensembles of Low-Rank Expert Adapters
Yinghao Li, Vianne Gao, Chao Zhang, MohamadAli Torkamani · 31 Jan 2025

Each Rank Could be an Expert: Single-Ranked Mixture of Experts LoRA for Multi-Task Learning
Ziyu Zhao, Yixiao Zhou, Didi Zhu, Tao Shen, X. Wang, Jing Su, Kun Kuang, Zhongyu Wei, Fei Wu, Yu Cheng · MoE · 28 Jan 2025

GraphLoRA: Empowering LLMs Fine-Tuning via Graph Collaboration of MoE
Ting Bai, Yue Yu, Le Huang, Zenan Xu, Zhe Zhao, Chuan Shi · MoE · 18 Dec 2024

Investigating Mixture of Experts in Dense Retrieval
Effrosyni Sokli, Pranav Kasela, Georgios Peikos, G. Pasi · MoE · 16 Dec 2024

MoSLD: An Extremely Parameter-Efficient Mixture-of-Shared LoRAs for Multi-Task Learning
Lulu Zhao, Weihao Zeng, Xiaofeng Shi, Hua Zhou · MoMe, MoE · 12 Dec 2024

MoDULA: Mixture of Domain-Specific and Universal LoRA for Multi-Task Learning
Yufei Ma, Zihan Liang, Huangyu Dai, B. Chen, D. Gao, ..., Linbo Jin, Wen Jiang, Guannan Zhang, Xiaoyan Cai, Libin Yang · MoE, MoMe · 10 Dec 2024

PMoL: Parameter Efficient MoE for Preference Mixing of LLM Alignment
Dongxu Liu, Bing Xu, Yinzhuo Chen, Bufan Xu, Wenpeng Lu, Muyun Yang, T. Zhao · MoE · 02 Nov 2024

MALoRA: Mixture of Asymmetric Low-Rank Adaptation for Enhanced Multi-Task Learning
Xujia Wang, Haiyan Zhao, Shuo Wang, Hanqing Wang, Zhiyuan Liu · MoMe, MoE · 30 Oct 2024

Parameter-Efficient Fine-Tuning in Large Models: A Survey of Methodologies
L. Wang, Sheng Chen, Linnan Jiang, Shu Pan, Runze Cai, Sen Yang, Fei Yang · 24 Oct 2024

Glider: Global and Local Instruction-Driven Expert Router
Pingzhi Li, Prateek Yadav, Jaehong Yoon, Jie Peng, Yi-Lin Sung, Mohit Bansal, Tianlong Chen · MoMe, MoE · 09 Oct 2024

MoS: Unleashing Parameter Efficiency of Low-Rank Adaptation with Mixture of Shards
Sheng Wang, Liheng Chen, Pengan Chen, Jingwei Dong, Boyang Xue, Jiyue Jiang, Lingpeng Kong, Chuan Wu · MoE · 01 Oct 2024

Uni-Med: A Unified Medical Generalist Foundation Model For Multi-Task Learning Via Connector-MoE
Xun Zhu, Ying Hu, Fanbin Mo, Miao Li, Ji Wu · 26 Sep 2024

On-Device Collaborative Language Modeling via a Mixture of Generalists and Specialists
Dongyang Fan, Bettina Messmer, N. Doikov, Martin Jaggi · MoMe, MoE · 20 Sep 2024

Nexus: Specialization meets Adaptability for Efficiently Training Mixture of Experts
Nikolas Gritsch, Qizhen Zhang, Acyr F. Locatelli, Sara Hooker, A. Ustun · MoE · 28 Aug 2024

Parameter-Efficient Quantized Mixture-of-Experts Meets Vision-Language Instruction Tuning for Semiconductor Electron Micrograph Analysis
Sakhinana Sagar Srinivas, Chidaksh Ravuru, Geethan Sannidhi, Venkataramana Runkana · 27 Aug 2024

Advancing Enterprise Spatio-Temporal Forecasting Applications: Data Mining Meets Instruction Tuning of Language Models For Multi-modal Time Series Analysis in Low-Resource Settings
Sagar Srinivas Sakhinana, Geethan Sannidhi, Chidaksh Ravuru, Venkataramana Runkana · AI4TS · 24 Aug 2024

MoE-LPR: Multilingual Extension of Large Language Models through Mixture-of-Experts with Language Priors Routing
Hao Zhou, Zhijun Wang, Shujian Huang, Xin Huang, Xue Han, Junlan Feng, Chao Deng, Weihua Luo, Jiajun Chen · CLL, MoE · 21 Aug 2024

A Survey on Model MoErging: Recycling and Routing Among Specialized Experts for Collaborative Learning
Prateek Yadav, Colin Raffel, Mohammed Muqeeth, Lucas Page-Caccia, Haokun Liu, Tianlong Chen, Mohit Bansal, Leshem Choshen, Alessandro Sordoni · MoMe · 13 Aug 2024

Advancing Multimodal Large Language Models with Quantization-Aware Scale Learning for Efficient Adaptation
Jingjing Xie, Yuxin Zhang, Mingbao Lin, Liujuan Cao, Rongrong Ji · MQ · 07 Aug 2024

MoDE: Effective Multi-task Parameter Efficient Fine-Tuning with a Mixture of Dyadic Experts
Lin Ning, Harsh Lara, Meiqi Guo, Abhinav Rastogi · MoMe, MoE · 02 Aug 2024

Low-Rank Interconnected Adaptation Across Layers
Yibo Zhong, Yao Zhou · OffRL, MoE · 13 Jul 2024

Foundation Model Engineering: Engineering Foundation Models Just as Engineering Software
Dezhi Ran, Mengzhou Wu, Wei Yang, Tao Xie · AI4CE · 11 Jul 2024

On the Limitations of Compute Thresholds as a Governance Strategy
Sara Hooker · 08 Jul 2024

Mixture of A Million Experts
Xu Owen He · MoE · 04 Jul 2024

Lateralization LoRA: Interleaved Instruction Tuning with Modality-Specialized Adaptations
Zhiyang Xu, Minqian Liu, Ying Shen, Joy Rimchala, Jiaxin Zhang, Qifan Wang, Yu Cheng, Lifu Huang · VLM · 04 Jul 2024

LEMoE: Advanced Mixture of Experts Adaptor for Lifelong Model Editing of Large Language Models
Renzhi Wang, Piji Li · KELM, CLL · 28 Jun 2024

Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model
Longrong Yang, Dong Shen, Chaoxiang Cai, Fan Yang, Size Li, Di Zhang, Xi Li · MoE · 28 Jun 2024

Retrieval-Augmented Mixture of LoRA Experts for Uploadable Machine Learning
Ziyu Zhao, Leilei Gan, Guoyin Wang, Yuwei Hu, Tao Shen, Hongxia Yang, Kun Kuang, Fei Wu · MoE, MoMe · 24 Jun 2024

AdaMoE: Token-Adaptive Routing with Null Experts for Mixture-of-Experts Language Models
Zihao Zeng, Yibo Miao, Hongcheng Gao, Hao Zhang, Zhijie Deng · MoE · 19 Jun 2024

MoE-RBench: Towards Building Reliable Language Models with Sparse Mixture-of-Experts
Guanjie Chen, Xinyu Zhao, Tianlong Chen, Yu Cheng · MoE · 17 Jun 2024

Crayon: Customized On-Device LLM via Instant Adapter Blending and Edge-Server Hybrid Inference
Jihwan Bang, Juntae Lee, Kyuhong Shim, Seunghan Yang, Simyung Chang · 11 Jun 2024

MEMoE: Enhancing Model Editing with Mixture of Experts Adaptors
Renzhi Wang, Piji Li · KELM · 29 May 2024

Towards Modular LLMs by Building and Reusing a Library of LoRAs
O. Ostapenko, Zhan Su, E. Ponti, Laurent Charlin, Nicolas Le Roux, Matheus Pereira, Lucas Page-Caccia, Alessandro Sordoni · MoMe · 18 May 2024

Simple Drop-in LoRA Conditioning on Attention Layers Will Improve Your Diffusion Model
Joo Young Choi, Jaesung R. Park, Inkyu Park, Jaewoong Cho, Albert No, Ernest K. Ryu · AI4CE · 07 May 2024

MiniGPT-3D: Efficiently Aligning 3D Point Clouds with Large Language Models using 2D Priors
Yuan Tang, Xu Han, Xianzhi Li, Qiao Yu, Yixue Hao, Long Hu, Min Chen · 02 May 2024

AdaMoLE: Fine-Tuning Large Language Models with Adaptive Mixture of Low-Rank Adaptation Experts
Zefang Liu, Jiahua Luo · MoE, KELM · 01 May 2024