2402.12656
Cited By
HyperMoE: Towards Better Mixture of Experts via Transferring Among Experts
20 February 2024
Hao Zhao
Zihan Qiu
Huijia Wu
Zili Wang
Zhaofeng He
Jie Fu
MoE
Papers citing "HyperMoE: Towards Better Mixture of Experts via Transferring Among Experts"
12 / 12 papers shown
Title
eMoE: Task-aware Memory Efficient Mixture-of-Experts-Based (MoE) Model Inference
Suraiya Tairin
Shohaib Mahmud
Haiying Shen
Anand Iyer
MoE
88
0
0
10 Mar 2025
CartesianMoE: Boosting Knowledge Sharing among Experts via Cartesian Product Routing in Mixture-of-Experts
Zhenpeng Su
Xing Wu
Zijia Lin
Yizhe Xiong
Minxuan Lv
Guangyuan Ma
Hui Chen
Songlin Hu
Guiguang Ding
MoE
26
2
0
21 Oct 2024
Mixture of Diverse Size Experts
Manxi Sun
Wei Liu
Jian Luan
Pengzhi Gao
Bin Wang
MoE
21
1
0
18 Sep 2024
Layerwise Recurrent Router for Mixture-of-Experts
Zihan Qiu
Zeyu Huang
Shuang Cheng
Yizhi Zhou
Zili Wang
Ivan Titov
Jie Fu
MoE
68
2
0
13 Aug 2024
Read to Play (R2-Play): Decision Transformer with Multimodal Game Instruction
Yonggang Jin
Ge Zhang
Hao Zhao
Tianyu Zheng
Jiawei Guo
Liuyu Xiang
Shawn Yue
Stephen W. Huang
Zhaofeng He
Jie Fu
OffRL
27
4
0
06 Feb 2024
Prototype-based HyperAdapter for Sample-Efficient Multi-task Tuning
Hao Zhao
Jie Fu
Zhaofeng He
105
6
0
18 Oct 2023
Hyper-X: A Unified Hypernetwork for Multi-Task Multilingual Transfer
A. Ustun
Arianna Bisazza
G. Bouma
Gertjan van Noord
Sebastian Ruder
44
32
0
24 May 2022
Multilingual Machine Translation with Hyper-Adapters
Christos Baziotis
Mikel Artetxe
James Cross
Shruti Bhosale
63
21
0
22 May 2022
Mixture-of-Experts with Expert Choice Routing
Yanqi Zhou
Tao Lei
Hanxiao Liu
Nan Du
Yanping Huang
Vincent Zhao
Andrew M. Dai
Zhifeng Chen
Quoc V. Le
James Laudon
MoE
147
326
0
18 Feb 2022
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,943
0
20 Apr 2018
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
Andrew G. Howard
Menglong Zhu
Bo Chen
Dmitry Kalenichenko
Weijun Wang
Tobias Weyand
M. Andreetto
Hartwig Adam
3DH
948
20,471
0
17 Apr 2017
Teaching Machines to Read and Comprehend
Karl Moritz Hermann
Tomáš Kočiský
Edward Grefenstette
L. Espeholt
W. Kay
Mustafa Suleyman
Phil Blunsom
170
3,508
0
10 Jun 2015