Learning Factored Representations in a Deep Mixture of Experts
arXiv:1312.4314 · 16 December 2013
David Eigen, Marc'Aurelio Ranzato, Ilya Sutskever · MoE
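
The paper's core idea, for readers skimming this list: stack mixture-of-experts layers with independent gating networks, so that the per-layer expert choices compose into a factored set of paths (with 4 experts in each of two layers, the two gates jointly index 4 x 4 = 16 expert combinations). The sketch below is a minimal illustration of that construction, assuming PyTorch; the layer sizes, expert counts, and class names (MoELayer, DeepMoE) are illustrative assumptions, not the paper's exact architecture or training procedure.

# Minimal deep mixture-of-experts sketch in the spirit of arXiv:1312.4314.
# Sizes, expert counts, and class names are illustrative assumptions.
import torch
import torch.nn as nn

class MoELayer(nn.Module):
    """One densely gated MoE layer: a softmax gate mixes all expert outputs."""
    def __init__(self, d_in: int, d_out: int, n_experts: int):
        super().__init__()
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(d_in, d_out), nn.ReLU())
             for _ in range(n_experts)]
        )
        self.gate = nn.Linear(d_in, n_experts)  # per-example gating network

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = torch.softmax(self.gate(x), dim=-1)           # (batch, n_experts)
        y = torch.stack([e(x) for e in self.experts], 1)  # (batch, n_experts, d_out)
        return (w.unsqueeze(-1) * y).sum(dim=1)           # gate-weighted mixture

class DeepMoE(nn.Module):
    """Two independently gated MoE layers followed by a linear classifier."""
    def __init__(self):
        super().__init__()
        self.layer1 = MoELayer(784, 256, n_experts=4)
        self.layer2 = MoELayer(256, 256, n_experts=4)
        self.head = nn.Linear(256, 10)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.layer2(self.layer1(x)))

if __name__ == "__main__":
    model = DeepMoE()
    logits = model(torch.randn(32, 784))  # e.g. a batch of flattened 28x28 digits
    print(logits.shape)                   # torch.Size([32, 10])

Note that, as in the original paper, every expert is evaluated and mixed densely; much of the later MoE work cited below instead routes each input sparsely to a few experts to save computation.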

Papers citing "Learning Factored Representations in a Deep Mixture of Experts"

Showing 50 of 82 citing papers.

The power of fine-grained experts: Granularity boosts expressivity in Mixture of Experts
  Enric Boix Adserà, Philippe Rigollet · MoE · 11 May 2025

MoEQuant: Enhancing Quantization for Mixture-of-Experts Large Language Models via Expert-Balanced Sampling and Affinity Guidance
  Xing Hu, Zhixuan Chen, Dawei Yang, Zukang Xu, Chen Xu, Zhihang Yuan, Sifan Zhou, Jiangyong Yu · MoE, MQ · 02 May 2025

Advancing MoE Efficiency: A Collaboration-Constrained Routing (C2R) Strategy for Better Expert Parallelism Design
  Mohan Zhang, Pingzhi Li, Jie Peng, Mufan Qiu, Tianlong Chen · MoE · 02 Apr 2025

A Comprehensive Survey of Mixture-of-Experts: Algorithms, Theory, and Applications
  Siyuan Mu, Sen Lin · MoE · 10 Mar 2025

Efficient Algorithms for Verifying Kruskal Rank in Sparse Linear Regression and Related Applications
  Fengqin Zhou · 06 Mar 2025

Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization
  Taishi Nakamura, Takuya Akiba, Kazuki Fujii, Yusuke Oda, Rio Yokota, Jun Suzuki · MoMe, MoE · 26 Feb 2025

Theory on Mixture-of-Experts in Continual Learning
  Hongbo Li, Sen-Fon Lin, Lingjie Duan, Yingbin Liang, Ness B. Shroff · MoE, MoMe, CLL · 20 Feb 2025

SCFCRC: Simultaneously Counteract Feature Camouflage and Relation Camouflage for Fraud Detection
  Xuzhi Zhang, Zhuangzhuang Ye, GuoPing Zhao, Jianing Wang, Xiaohong Su · 21 Jan 2025

Task Singular Vectors: Reducing Task Interference in Model Merging
  Antonio Andrea Gargiulo, Donato Crisostomi, Maria Sofia Bucarelli, Simone Scardapane, Fabrizio Silvestri, Emanuele Rodolà · MoMe · 26 Nov 2024

ViMoE: An Empirical Study of Designing Vision Mixture-of-Experts
  Xumeng Han, Longhui Wei, Zhiyang Dou, Zipeng Wang, Chenhui Qiang, Xin He, Yingfei Sun, Zhenjun Han, Qi Tian · MoE · 21 Oct 2024

TradExpert: Revolutionizing Trading with Mixture of Expert LLMs
  Qianggang Ding, Haochen Shi, Jiadong Guo, Bang Liu · AIFin · 16 Oct 2024

Ada-K Routing: Boosting the Efficiency of MoE-based LLMs
  Tongtian Yue, Longteng Guo, Jie Cheng, Xuange Gao, Jiaheng Liu · MoE · 14 Oct 2024

Revisiting Prefix-tuning: Statistical Benefits of Reparameterization among Prompts
  Minh Le, Chau Nguyen, Huy Nguyen, Quyen Tran, Trung Le, Nhat Ho · 03 Oct 2024

Gradient-free variational learning with conditional mixture networks
  Conor Heins, Hao Wu, Dimitrije Marković, Alexander Tschantz, Jeff Beck, Christopher L. Buckley · BDL · 29 Aug 2024

MetaLLM: A High-performant and Cost-efficient Dynamic Framework for Wrapping LLMs
  Quang H. Nguyen, Duy C. Hoang, Juliette Decugis, Saurav Manchanda, Nitesh V. Chawla, Khoa D. Doan · 15 Jul 2024

Low-Rank Interconnected Adaptation Across Layers
  Yibo Zhong, Yao Zhou · OffRL, MoE · 13 Jul 2024

Merge, Ensemble, and Cooperate! A Survey on Collaborative Strategies in the Era of Large Language Models
  Jinliang Lu, Ziliang Pang, Min Xiao, Yaochen Zhu, Rui Xia, Jiajun Zhang · MoMe · 08 Jul 2024

Submodular Framework for Structured-Sparse Optimal Transport
  Piyushi Manupriya, Pratik Jawanpuria, Karthik S. Gurumoorthy, SakethaNath Jagarlapudi, Bamdev Mishra · OT · 07 Jun 2024

Ensembling Diffusion Models via Adaptive Feature Aggregation
  Cong Wang, Kuan Tian, Yonghang Guan, Jun Zhang, Zhiwei Jiang, Fei Shen, Xiao Han · 27 May 2024

Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models
  Yongxin Guo, Zhenglin Cheng, Xiaoying Tang, Tao Lin · MoE · 23 May 2024

Uni-MoE: Scaling Unified Multimodal LLMs with Mixture of Experts
  Yunxin Li, Shenyuan Jiang, Baotian Hu, Longyue Wang, Wanqi Zhong, Wenhan Luo, Lin Ma, Min-Ling Zhang · MoE · 18 May 2024

DiPaCo: Distributed Path Composition
  Arthur Douillard, Qixuang Feng, Andrei A. Rusu, A. Kuncoro, Yani Donchev, Rachita Chhaparia, Ionel Gog, Marc'Aurelio Ranzato, Jiajun Shen, Arthur Szlam · MoE · 15 Mar 2024

Video Relationship Detection Using Mixture of Experts
  A. Shaabana, Zahra Gharaee, Paul Fieguth · 06 Mar 2024

LLMBind: A Unified Modality-Task Integration Framework
  Bin Zhu, Munan Ning, Peng Jin, Bin Lin, Jinfa Huang, ..., Junwu Zhang, Zhenyu Tang, Mingjun Pan, Xing Zhou, Li-ming Yuan · MLLM · 22 Feb 2024

Multimodal Clinical Trial Outcome Prediction with Large Language Models
  Wenhao Zheng, Dongsheng Peng, Hongxia Xu, Yun-Qing Li, Hongtu Zhu, Tianfan Fu, Huaxiu Yao · 09 Feb 2024

M$^3$TN: Multi-gate Mixture-of-Experts based Multi-valued Treatment Network for Uplift Modeling
  Zexu Sun, Xu Chen · 24 Jan 2024

Semantic Scene Segmentation for Robotics
  Juana Valeria Hurtado, Abhinav Valada · VLM, SSeg · 15 Jan 2024

Enhancing Molecular Property Prediction via Mixture of Collaborative Experts
  Xu Yao, Shuang Liang, Songqiao Han, Hailiang Huang · 06 Dec 2023

Conditional Prompt Tuning for Multimodal Fusion
  Ruixia Jiang, Lingbo Liu, Changwen Chen · 28 Nov 2023

Diversifying the Mixture-of-Experts Representation for Language Models with Orthogonal Optimizer
  Boan Liu, Liang Ding, Li Shen, Keqin Peng, Yu Cao, Dazhao Cheng, Dacheng Tao · MoE · 15 Oct 2023

Enhancing NeRF akin to Enhancing LLMs: Generalizable NeRF Transformer with Mixture-of-View-Experts
  Wenyan Cong, Hanxue Liang, Peihao Wang, Zhiwen Fan, Tianlong Chen, M. Varma, Yi Wang, Zhangyang Wang · MoE · 22 Aug 2023

Robust Mixture-of-Expert Training for Convolutional Neural Networks
  Yihua Zhang, Ruisi Cai, Tianlong Chen, Guanhua Zhang, Huan Zhang, Pin-Yu Chen, Shiyu Chang, Zhangyang Wang, Sijia Liu · MoE, AAML, OOD · 19 Aug 2023

TaskExpert: Dynamically Assembling Multi-Task Representations with Memorial Mixture-of-Experts
  Hanrong Ye, Dan Xu · MoE · 28 Jul 2023

MetaGait: Learning to Learn an Omni Sample Adaptive Representation for Gait Recognition
  Huanzhang Dou, Pengyi Zhang, Wei Su, Yunlong Yu, Xi Li · CVBM · 06 Jun 2023

Mixture-of-Experts Meets Instruction Tuning: A Winning Combination for Large Language Models
  Sheng Shen, Le Hou, Yan-Quan Zhou, Nan Du, Shayne Longpre, ..., Vincent Zhao, Hongkun Yu, Kurt Keutzer, Trevor Darrell, Denny Zhou · ALM, MoE · 24 May 2023

Towards A Unified View of Sparse Feed-Forward Network in Pretraining Large Language Model
  Leo Liu, Tim Dettmers, Xi Lin, Ves Stoyanov, Xian Li · MoE · 23 May 2023

Revisiting Single-gated Mixtures of Experts
  Amelie Royer, I. Karmanov, Andrii Skliar, B. Bejnordi, Tijmen Blankevoort · MoE, MoMe · 11 Apr 2023

Memorization Capacity of Neural Networks with Conditional Computation
  Erdem Koyuncu · 20 Mar 2023

HiNet: Novel Multi-Scenario & Multi-Task Learning with Hierarchical Information Extraction
  Jie Zhou, Xia Cao, Wenhao Li, Lin Bo, Kun Zhang, Chuan Luo, Qian Yu · 10 Mar 2023

Modular Deep Learning
  Jonas Pfeiffer, Sebastian Ruder, Ivan Vulić, E. Ponti · MoMe, OOD · 22 Feb 2023

Improving Domain Generalization with Domain Relations
  Huaxiu Yao, Xinyu Yang, Xinyi Pan, Shengchao Liu, Pang Wei Koh, Chelsea Finn · OOD, AI4CE · 06 Feb 2023

Synthesizing Physical Character-Scene Interactions
  Mohamed Hassan, Yunrong Guo, Tingwu Wang, Michael J. Black, Sanja Fidler, Xue Bin Peng · 02 Feb 2023

Automatically Extracting Information in Medical Dialogue: Expert System And Attention for Labelling
  Xinshi Wang, Daniel Tang · 28 Nov 2022

Spatial Mixture-of-Experts
  Nikoli Dryden, Torsten Hoefler · MoE · 24 Nov 2022

M$^3$ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design
  Hanxue Liang, Zhiwen Fan, Rishov Sarkar, Ziyu Jiang, Tianlong Chen, Kai Zou, Yu Cheng, Cong Hao, Zhangyang Wang · MoE · 26 Oct 2022

Towards Out-of-Distribution Sequential Event Prediction: A Causal Treatment
  Chenxiao Yang, Qitian Wu, Qingsong Wen, Zhiqiang Zhou, Liang Sun, Junchi Yan · OODD, OOD · 24 Oct 2022

On the Adversarial Robustness of Mixture of Experts
  J. Puigcerver, Rodolphe Jenatton, C. Riquelme, Pranjal Awasthi, Srinadh Bhojanapalli · OOD, AAML, MoE · 19 Oct 2022

Redesigning Multi-Scale Neural Network for Crowd Counting
  Zhipeng Du, Miaojing Shi, Jiankang Deng, S. Zafeiriou · 04 Aug 2022

Towards Understanding Mixture of Experts in Deep Learning
  Zixiang Chen, Yihe Deng, Yue-bo Wu, Quanquan Gu, Yuan-Fang Li · MLT, MoE · 04 Aug 2022

Hub-Pathway: Transfer Learning from A Hub of Pre-trained Models
  Yang Shu, Zhangjie Cao, Ziyang Zhang, Jianmin Wang, Mingsheng Long · 08 Jun 2022