ResearchTrend.AI
Soft Merging of Experts with Adaptive Routing

6 June 2023
Mohammed Muqeeth
Haokun Liu
Colin Raffel
    MoMe
    MoE

Papers citing "Soft Merging of Experts with Adaptive Routing"

15 / 15 papers shown
FT-MoE: Sustainable-learning Mixture of Experts Model for Fault-Tolerant Computing with Multiple Tasks
Wenjing Xiao
Wenhao Song
Miaojiang Chen
Ruikun Luo
Min Chen
MoE
44
0
0
29 Apr 2025
CAMEx: Curvature-aware Merging of Experts
Dung V. Nguyen
Minh H. Nguyen
Luc Q. Nguyen
R. Teo
T. Nguyen
Linh Duy Tran
MoMe
63
2
0
26 Feb 2025
Towards Modular LLMs by Building and Reusing a Library of LoRAs
O. Ostapenko
Zhan Su
E. Ponti
Laurent Charlin
Nicolas Le Roux
Matheus Pereira
Lucas Page-Caccia
Alessandro Sordoni
MoMe
27
30
0
18 May 2024
From Sparse to Soft Mixtures of Experts
J. Puigcerver
C. Riquelme
Basil Mustafa
N. Houlsby
MoE
118
114
0
02 Aug 2023
π-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation
Chengyue Wu
Teng Wang
Yixiao Ge
Zeyu Lu
Rui-Zhi Zhou
Ying Shan
Ping Luo
MoMe
70
35
0
27 Apr 2023
Git Re-Basin: Merging Models modulo Permutation Symmetries
Samuel K. Ainsworth
J. Hayase
S. Srinivasa
MoMe
239
313
0
11 Sep 2022
Linear Connectivity Reveals Generalization Strategies
Jeevesh Juneja
Rachit Bansal
Kyunghyun Cho
João Sedoc
Naomi Saphra
232
45
0
24 May 2022
Sparse Mixers: Combining MoE and Mixing to build a more efficient BERT
James Lee-Thorp
Joshua Ainslie
MoE
27
11
0
24 May 2022
Mixture-of-Experts with Expert Choice Routing
Yanqi Zhou
Tao Lei
Hanxiao Liu
Nan Du
Yanping Huang
Vincent Zhao
Andrew M. Dai
Zhifeng Chen
Quoc V. Le
James Laudon
MoE
147
323
0
18 Feb 2022
PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts
Stephen H. Bach
Victor Sanh
Zheng-Xin Yong
Albert Webson
Colin Raffel
...
Khalid Almubarak
Xiangru Tang
Dragomir R. Radev
Mike Tian-Jian Jiang
Alexander M. Rush
VLM
212
335
0
02 Feb 2022
Tricks for Training Sparse Translation Models
Dheeru Dua
Shruti Bhosale
Vedanuj Goswami
James Cross
M. Lewis
Angela Fan
MoE
139
19
0
15 Oct 2021
Multitask Prompted Training Enables Zero-Shot Task Generalization
Victor Sanh
Albert Webson
Colin Raffel
Stephen H. Bach
Lintang Sutawika
...
T. Bers
Stella Biderman
Leo Gao
Thomas Wolf
Alexander M. Rush
LRM
203
1,651
0
15 Oct 2021
Beyond Distillation: Task-level Mixture-of-Experts for Efficient Inference
Sneha Kudugunta
Yanping Huang
Ankur Bapna
M. Krikun
Dmitry Lepikhin
Minh-Thang Luong
Orhan Firat
MoE
119
104
0
24 Sep 2021
Unbiased Gradient Estimation with Balanced Assignments for Mixtures of Experts
W. Kool
Chris J. Maddison
A. Mnih
14
10
0
24 Sep 2021
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,927
0
20 Apr 2018