M³ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design (arXiv:2210.14793)
Neural Information Processing Systems (NeurIPS), 2022
26 October 2022
Hanxue Liang, Zhiwen Fan, Rishov Sarkar, Ziyu Jiang, Tianlong Chen, Kai Zou, Yu Cheng, Cong Hao, Zinan Lin
MoE
ArXiv (abs) · PDF · HTML · GitHub (118★)
Papers citing "M³ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design" (21 of 71 shown)
On Least Square Estimation in Softmax Gating Mixture of Experts
International Conference on Machine Learning (ICML), 2024
Huy Nguyen, Nhat Ho, Alessandro Rinaldo
05 Feb 2024
Is Temperature Sample Efficient for Softmax Gaussian Mixture of Experts?
International Conference on Machine Learning (ICML), 2024
Huy Nguyen, Pedram Akbarian, Nhat Ho
MoE
25 Jan 2024
Efficient Deweather Mixture-of-Experts with Uncertainty-aware Feature-wise Linear Modulation
Rongyu Zhang, Yulin Luo, Jiaming Liu, Huanrui Yang, Zhen Dong, ..., Tomoyuki Okuno, Yohei Nakata, Kurt Keutzer, Yuan Du, Shanghang Zhang
MoMe, MoE
27 Dec 2023
From Google Gemini to OpenAI Q* (Q-Star): A Survey of Reshaping the Generative Artificial Intelligence (AI) Research Landscape
Timothy R. McIntosh, Teo Susnjak, Tong Liu, Paul Watters, Malka N. Halgamuge
18 Dec 2023
Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts
Computer Vision and Pattern Recognition (CVPR), 2023
Jialin Wu, Xia Hu, Yaqing Wang, Bo Pang, Radu Soricut
MoE
01 Dec 2023
SiDA-MoE: Sparsity-Inspired Data-Aware Serving for Efficient and Scalable Large Mixture-of-Experts Models
Conference on Machine Learning and Systems (MLSys), 2023
Zhixu Du, Shiyu Li, Yuhao Wu, Xiangyu Jiang, Jingwei Sun, Qilin Zheng, Yongkai Wu, Ang Li, Hai Helen Li, Yiran Chen
MoE
29 Oct 2023
A General Theory for Softmax Gating Multinomial Logistic Mixture of Experts
International Conference on Machine Learning (ICML), 2023
Huy Nguyen, Pedram Akbarian, TrungTin Nguyen, Nhat Ho
22 Oct 2023
Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy
International Conference on Learning Representations (ICLR), 2023
Pingzhi Li, Zhenyu Zhang, Prateek Yadav, Yi-Lin Sung, Yu Cheng, Mohit Bansal, Tianlong Chen
MoMe
02 Oct 2023
Multi-task Learning with 3D-Aware Regularization
International Conference on Learning Representations (ICLR), 2023
Weihong Li, Jingyu Sun, A. Leonardis, Hakan Bilen
02 Oct 2023
Statistical Perspective of Top-K Sparse Softmax Gating Mixture of Experts
International Conference on Learning Representations (ICLR), 2023
Huy Nguyen, Pedram Akbarian, Fanqi Yan, Nhat Ho
MoE
25 Sep 2023
SwapMoE: Serving Off-the-shelf MoE-based Large Language Models with Tunable Memory Budget
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Rui Kong, Yuanchun Li, Qingtian Feng, Weijun Wang, Xiaozhou Ye, Ye Ouyang, Lingyu Kong, Yunxin Liu
MoE
29 Aug 2023
Enhancing NeRF akin to Enhancing LLMs: Generalizable NeRF Transformer with Mixture-of-View-Experts
IEEE International Conference on Computer Vision (ICCV), 2023
Wenyan Cong, Hanxue Liang, Peihao Wang, Zhiwen Fan, Tianlong Chen, M. Varma, Yi Wang, Zinan Lin
MoE
22 Aug 2023
TaskExpert: Dynamically Assembling Multi-Task Representations with Memorial Mixture-of-Experts
IEEE International Conference on Computer Vision (ICCV), 2023
Hanrong Ye, Dan Xu
MoE
28 Jul 2023
SINC: Self-Supervised In-Context Learning for Vision-Language Tasks
IEEE International Conference on Computer Vision (ICCV), 2023
Yi-Syuan Chen, Yun-Zhu Song, Cheng Yu Yeo, Bei Liu, Jianlong Fu, Hong-Han Shuai
VLM, LRM
15 Jul 2023
Edge-MoE: Memory-Efficient Multi-Task Vision Transformer Architecture with Task-level Sparsity via Mixture-of-Experts
Rishov Sarkar, Hanxue Liang, Zhiwen Fan, Zinan Lin, Cong Hao
MoE
30 May 2023
Demystifying Softmax Gating Function in Gaussian Mixture of Experts
Neural Information Processing Systems (NeurIPS), 2023
Huy Nguyen, TrungTin Nguyen, Nhat Ho
05 May 2023
AdaMTL: Adaptive Input-dependent Inference for Efficient Multi-Task Learning
Marina Neseem, Ahmed A. Agiza, Sherief Reda
17 Apr 2023
Graph Mixture of Experts: Learning on Large-Scale Graphs with Explicit Diversity Modeling
Neural Information Processing Systems (NeurIPS), 2023
Haotao Wang, Ziyu Jiang, Yuning You, Yan Han, Gaowen Liu, Jayanth Srinivasa, Ramana Rao Kompella, Zinan Lin
06 Apr 2023
Ten Lessons We Have Learned in the New "Sparseland": A Short Handbook for Sparse Neural Network Researchers
Shiwei Liu, Zinan Lin
06 Feb 2023
Mod-Squad: Designing Mixture of Experts As Modular Multi-Task Learners
Zitian Chen, Songlin Yang, Mingyu Ding, Zhenfang Chen, Hengshuang Zhao, E. Learned-Miller, Chuang Gan
MoE
15 Dec 2022
Accelerating Distributed MoE Training and Inference with Lina
USENIX Annual Technical Conference (USENIX ATC), 2022
Jiamin Li, Yimin Jiang, Yibo Zhu, Cong Wang, Hong-Yu Xu
MoE
31 Oct 2022