SpeechMoE2: Mixture-of-Experts Model with Improved Routing
arXiv:2111.11831 · 23 November 2021
Zhao You, Shulin Feng, Jane Polak Scowcroft, Dong Yu
Tags: MoE
Links: ArXiv (abs) · PDF · HTML · HuggingFace (2 upvotes)

Papers citing "SpeechMoE2: Mixture-of-Experts Model with Improved Routing" (22 / 22 papers shown)

Dendrograms of Mixing Measures for Softmax-Gated Gaussian Mixture of Experts: Consistency without Model Sweeps
Do Tien Hai, T. T. N. Mai, T. Nguyen, Nhat Ho, Binh T. Nguyen, Christopher Drovandi
14 Oct 2025

Omni-Router: Sharing Routing Decisions in Sparse Mixture-of-Experts for Speech Recognition
Zijin Gu, Tatiana Likhomanenko, Navdeep Jaitly
Tags: MoE
08 Jul 2025

LUPET: Incorporating Hierarchical Information Path into Multilingual ASR
Interspeech, 2024
Wei Liu, Jingyong Hou, Dong Yang, Muyong Cao, Tan Lee
10 Jan 2025

MoE-RBench: Towards Building Reliable Language Models with Sparse Mixture-of-Experts
Guanjie Chen, Xinyu Zhao, Tianlong Chen, Yu Cheng
Tags: MoE
17 Jun 2024

On Parameter Estimation in Deviated Gaussian Mixture of Experts
Huy Nguyen, Khai Nguyen, Nhat Ho
07 Feb 2024

From Google Gemini to OpenAI Q* (Q-Star): A Survey of Reshaping the Generative Artificial Intelligence (AI) Research Landscape
Timothy R. McIntosh, Teo Susnjak, Tong Liu, Paul Watters, Malka N. Halgamuge
18 Dec 2023

A General Theory for Softmax Gating Multinomial Logistic Mixture of Experts
International Conference on Machine Learning (ICML), 2023
Huy Nguyen, Pedram Akbarian, TrungTin Nguyen, Nhat Ho
22 Oct 2023

Direct Neural Machine Translation with Task-level Mixture of Experts Models
Isidora Chara Tourni, Subhajit Naskar
Tags: MoE
18 Oct 2023

Statistical Perspective of Top-K Sparse Softmax Gating Mixture of Experts
International Conference on Learning Representations (ICLR), 2023
Huy Nguyen, Pedram Akbarian, Fanqi Yan, Nhat Ho
Tags: MoE
25 Sep 2023

LLMCad: Fast and Scalable On-device Large Language Model Inference
Daliang Xu, Wangsong Yin, Xin Jin, Yanzhe Zhang, Shiyun Wei, Mengwei Xu, Xuanzhe Liu
08 Sep 2023

Learning When to Trust Which Teacher for Weakly Supervised ASR
Interspeech, 2023
Aakriti Agrawal, Milind Rao, Anit Kumar Sahu, Gopinath Chennupati, A. Stolcke
21 Jun 2023

Mixture-of-Expert Conformer for Streaming Multilingual ASR
Interspeech, 2023
Ke Hu, Yue Liu, Tara N. Sainath, Yu Zhang, F. Beaufays
Tags: MoE
25 May 2023

Towards Convergence Rates for Parameter Estimation in Gaussian-gated Mixture of Experts
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
Huy Nguyen, TrungTin Nguyen, Khai Nguyen, Nhat Ho
Tags: MoE
12 May 2023

Demystifying Softmax Gating Function in Gaussian Mixture of Experts
Neural Information Processing Systems (NeurIPS), 2023
Huy Nguyen, TrungTin Nguyen, Nhat Ho
05 May 2023

Modular Deep Learning
Jonas Pfeiffer, Sebastian Ruder, Ivan Vulić, Edoardo Ponti
Tags: MoMe, OOD
22 Feb 2023

ASR Bundestag: A Large-Scale Political Debate Dataset in German
Intelligent Systems with Applications (ISA), 2023
Johannes Wirth, René Peinl
12 Feb 2023

Sparsity-Constrained Optimal Transport
International Conference on Learning Representations (ICLR), 2022
Tianlin Liu, J. Puigcerver, Mathieu Blondel
Tags: OT
30 Sep 2022

A Review of Sparse Expert Models in Deep Learning
W. Fedus, J. Dean, Barret Zoph
Tags: MoE
04 Sep 2022

3M: Multi-loss, Multi-path and Multi-level Neural Networks for Speech Recognition
International Symposium on Chinese Spoken Language Processing (ISCSLP), 2022
Zhao You, Shulin Feng, Jane Polak Scowcroft, Dong Yu
07 Apr 2022

ST-MoE: Designing Stable and Transferable Sparse Expert Models
Barret Zoph, Irwan Bello, Sameer Kumar, Nan Du, Yanping Huang, J. Dean, Noam M. Shazeer, W. Fedus
Tags: MoE
17 Feb 2022

Building a Great Multi-lingual Teacher with Sparsely-gated Mixture of Experts for Speech Recognition
K. Kumatani, R. Gmyr, Andres Felipe Cruz Salinas, Linquan Liu, Wei Zuo, Devang Patel, Eric Sun, Yu Shi
Tags: MoE
10 Dec 2021

Non-asymptotic Oracle Inequalities for the Lasso in High-dimensional Mixture of Experts
TrungTin Nguyen, Hien Nguyen, Faicel Chamroukhi, Geoffrey J. McLachlan
22 Sep 2020