Learning Factored Representations in a Deep Mixture of Experts
David Eigen, Marc'Aurelio Ranzato, Ilya Sutskever · MoE · 16 December 2013
arXiv: 1312.4314

Papers citing "Learning Factored Representations in a Deep Mixture of Experts"
Showing 50 of 82 citing papers.

The power of fine-grained experts: Granularity boosts expressivity in Mixture of Experts
Enric Boix Adserà, Philippe Rigollet · MoE · 11 May 2025

MoEQuant: Enhancing Quantization for Mixture-of-Experts Large Language Models via Expert-Balanced Sampling and Affinity Guidance
Xing Hu, Zhixuan Chen, Dawei Yang, Zukang Xu, Chen Xu, Zhihang Yuan, Sifan Zhou, Jiangyong Yu · MoE, MQ · 02 May 2025

Advancing MoE Efficiency: A Collaboration-Constrained Routing (C2R) Strategy for Better Expert Parallelism Design
Mohan Zhang, Pingzhi Li, Jie Peng, Mufan Qiu, Tianlong Chen · MoE · 02 Apr 2025

A Comprehensive Survey of Mixture-of-Experts: Algorithms, Theory, and Applications
Siyuan Mu, Sen Lin · MoE · 10 Mar 2025

Efficient Algorithms for Verifying Kruskal Rank in Sparse Linear Regression and Related Applications
Fengqin Zhou · 06 Mar 2025

Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization
Taishi Nakamura, Takuya Akiba, Kazuki Fujii, Yusuke Oda, Rio Yokota, Jun Suzuki · MoMe, MoE · 26 Feb 2025

Theory on Mixture-of-Experts in Continual Learning
Hongbo Li, Sen-Fon Lin, Lingjie Duan, Yingbin Liang, Ness B. Shroff · MoE, MoMe, CLL · 20 Feb 2025

SCFCRC: Simultaneously Counteract Feature Camouflage and Relation Camouflage for Fraud Detection
Xuzhi Zhang, Zhuangzhuang Ye, GuoPing Zhao, Jianing Wang, Xiaohong Su · 21 Jan 2025

Task Singular Vectors: Reducing Task Interference in Model Merging
Antonio Andrea Gargiulo, Donato Crisostomi, Maria Sofia Bucarelli, Simone Scardapane, Fabrizio Silvestri, Emanuele Rodolà · MoMe · 26 Nov 2024

ViMoE: An Empirical Study of Designing Vision Mixture-of-Experts
Xumeng Han, Longhui Wei, Zhiyang Dou, Zipeng Wang, Chenhui Qiang, Xin He, Yingfei Sun, Zhenjun Han, Qi Tian · MoE · 21 Oct 2024

TradExpert: Revolutionizing Trading with Mixture of Expert LLMs
Qianggang Ding, Haochen Shi, Jiadong Guo, Bang Liu · AIFin · 16 Oct 2024

Ada-K Routing: Boosting the Efficiency of MoE-based LLMs
Tongtian Yue, Longteng Guo, Jie Cheng, Xuange Gao, Jiaheng Liu · MoE · 14 Oct 2024

Revisiting Prefix-tuning: Statistical Benefits of Reparameterization among Prompts
Minh Le, Chau Nguyen, Huy Nguyen, Quyen Tran, Trung Le, Nhat Ho · 03 Oct 2024

Gradient-free variational learning with conditional mixture networks
Conor Heins, Hao Wu, Dimitrije Marković, Alexander Tschantz, Jeff Beck, Christopher L. Buckley · BDL · 29 Aug 2024

MetaLLM: A High-performant and Cost-efficient Dynamic Framework for Wrapping LLMs
Quang H. Nguyen, Duy C. Hoang, Juliette Decugis, Saurav Manchanda, Nitesh V. Chawla, Khoa D. Doan · 15 Jul 2024

Low-Rank Interconnected Adaptation Across Layers
Yibo Zhong, Yao Zhou · OffRL, MoE · 13 Jul 2024

Merge, Ensemble, and Cooperate! A Survey on Collaborative Strategies in the Era of Large Language Models
Jinliang Lu, Ziliang Pang, Min Xiao, Yaochen Zhu, Rui Xia, Jiajun Zhang · MoMe · 08 Jul 2024

Submodular Framework for Structured-Sparse Optimal Transport
Piyushi Manupriya, Pratik Jawanpuria, Karthik S. Gurumoorthy, SakethaNath Jagarlapudi, Bamdev Mishra · OT · 07 Jun 2024

Ensembling Diffusion Models via Adaptive Feature Aggregation
Cong Wang, Kuan Tian, Yonghang Guan, Jun Zhang, Zhiwei Jiang, Fei Shen, Xiao Han · 27 May 2024

Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models
Yongxin Guo, Zhenglin Cheng, Xiaoying Tang, Tao Lin · MoE · 23 May 2024

Uni-MoE: Scaling Unified Multimodal LLMs with Mixture of Experts
Yunxin Li, Shenyuan Jiang, Baotian Hu, Longyue Wang, Wanqi Zhong, Wenhan Luo, Lin Ma, Min-Ling Zhang · MoE · 18 May 2024

DiPaCo: Distributed Path Composition
Arthur Douillard, Qixuang Feng, Andrei A. Rusu, A. Kuncoro, Yani Donchev, Rachita Chhaparia, Ionel Gog, Marc'Aurelio Ranzato, Jiajun Shen, Arthur Szlam · MoE · 15 Mar 2024

Video Relationship Detection Using Mixture of Experts
A. Shaabana, Zahra Gharaee, Paul Fieguth · 06 Mar 2024

LLMBind: A Unified Modality-Task Integration Framework
Bin Zhu, Munan Ning, Peng Jin, Bin Lin, Jinfa Huang, ..., Junwu Zhang, Zhenyu Tang, Mingjun Pan, Xing Zhou, Li-ming Yuan · MLLM · 22 Feb 2024

Multimodal Clinical Trial Outcome Prediction with Large Language Models
Wenhao Zheng, Dongsheng Peng, Hongxia Xu, Yun-Qing Li, Hongtu Zhu, Tianfan Fu, Huaxiu Yao · 09 Feb 2024

M³TN: Multi-gate Mixture-of-Experts based Multi-valued Treatment Network for Uplift Modeling
Zexu Sun, Xu Chen · 24 Jan 2024

Semantic Scene Segmentation for Robotics
Juana Valeria Hurtado, Abhinav Valada · VLM, SSeg · 15 Jan 2024

Enhancing Molecular Property Prediction via Mixture of Collaborative Experts
Xu Yao, Shuang Liang, Songqiao Han, Hailiang Huang · 06 Dec 2023

Conditional Prompt Tuning for Multimodal Fusion
Ruixia Jiang, Lingbo Liu, Changwen Chen · 28 Nov 2023

Diversifying the Mixture-of-Experts Representation for Language Models with Orthogonal Optimizer
Boan Liu, Liang Ding, Li Shen, Keqin Peng, Yu Cao, Dazhao Cheng, Dacheng Tao · MoE · 15 Oct 2023

Enhancing NeRF akin to Enhancing LLMs: Generalizable NeRF Transformer with Mixture-of-View-Experts
Wenyan Cong, Hanxue Liang, Peihao Wang, Zhiwen Fan, Tianlong Chen, M. Varma, Yi Wang, Zhangyang Wang · MoE · 22 Aug 2023

Robust Mixture-of-Expert Training for Convolutional Neural Networks
Yihua Zhang, Ruisi Cai, Tianlong Chen, Guanhua Zhang, Huan Zhang, Pin-Yu Chen, Shiyu Chang, Zhangyang Wang, Sijia Liu · MoE, AAML, OOD · 19 Aug 2023

TaskExpert: Dynamically Assembling Multi-Task Representations with Memorial Mixture-of-Experts
Hanrong Ye, Dan Xu · MoE · 28 Jul 2023

MetaGait: Learning to Learn an Omni Sample Adaptive Representation for Gait Recognition
Huanzhang Dou, Pengyi Zhang, Wei Su, Yunlong Yu, Xi Li · CVBM · 06 Jun 2023

Mixture-of-Experts Meets Instruction Tuning: A Winning Combination for Large Language Models
Sheng Shen, Le Hou, Yan-Quan Zhou, Nan Du, Shayne Longpre, ..., Vincent Zhao, Hongkun Yu, Kurt Keutzer, Trevor Darrell, Denny Zhou · ALM, MoE · 24 May 2023

Towards A Unified View of Sparse Feed-Forward Network in Pretraining Large Language Model
Leo Liu, Tim Dettmers, Xi Lin, Ves Stoyanov, Xian Li · MoE · 23 May 2023

Revisiting Single-gated Mixtures of Experts
Amelie Royer, I. Karmanov, Andrii Skliar, B. Bejnordi, Tijmen Blankevoort · MoE, MoMe · 11 Apr 2023

Memorization Capacity of Neural Networks with Conditional Computation
Erdem Koyuncu · 20 Mar 2023

HiNet: Novel Multi-Scenario & Multi-Task Learning with Hierarchical Information Extraction
Jie Zhou, Xia Cao, Wenhao Li, Lin Bo, Kun Zhang, Chuan Luo, Qian Yu · 10 Mar 2023

Modular Deep Learning
Jonas Pfeiffer, Sebastian Ruder, Ivan Vulić, E. Ponti · MoMe, OOD · 22 Feb 2023

Improving Domain Generalization with Domain Relations
Huaxiu Yao, Xinyu Yang, Xinyi Pan, Shengchao Liu, Pang Wei Koh, Chelsea Finn · OOD, AI4CE · 06 Feb 2023

Synthesizing Physical Character-Scene Interactions
Mohamed Hassan, Yunrong Guo, Tingwu Wang, Michael J. Black, Sanja Fidler, Xue Bin Peng · 02 Feb 2023

Automatically Extracting Information in Medical Dialogue: Expert System And Attention for Labelling
Xinshi Wang, Daniel Tang · 28 Nov 2022

Spatial Mixture-of-Experts
Nikoli Dryden, Torsten Hoefler · MoE · 24 Nov 2022

M³ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design
Hanxue Liang, Zhiwen Fan, Rishov Sarkar, Ziyu Jiang, Tianlong Chen, Kai Zou, Yu Cheng, Cong Hao, Zhangyang Wang · MoE · 26 Oct 2022

Towards Out-of-Distribution Sequential Event Prediction: A Causal Treatment
Chenxiao Yang, Qitian Wu, Qingsong Wen, Zhiqiang Zhou, Liang Sun, Junchi Yan · OODD, OOD · 24 Oct 2022

On the Adversarial Robustness of Mixture of Experts
J. Puigcerver, Rodolphe Jenatton, C. Riquelme, Pranjal Awasthi, Srinadh Bhojanapalli · OOD, AAML, MoE · 19 Oct 2022

Redesigning Multi-Scale Neural Network for Crowd Counting
Zhipeng Du, Miaojing Shi, Jiankang Deng, S. Zafeiriou · 04 Aug 2022

Towards Understanding Mixture of Experts in Deep Learning
Zixiang Chen, Yihe Deng, Yue-bo Wu, Quanquan Gu, Yuan-Fang Li · MLT, MoE · 04 Aug 2022

Hub-Pathway: Transfer Learning from A Hub of Pre-trained Models
Yang Shu, Zhangjie Cao, Ziyang Zhang, Jianmin Wang, Mingsheng Long · 08 Jun 2022