SpeechMoE2: Mixture-of-Experts Model with Improved Routing
arXiv:2111.11831 · 23 November 2021
Zhao You, Shulin Feng, Jane Polak Scowcroft, Dong Yu
Tags: MoE
Links: ArXiv (abs) · PDF · HTML · HuggingFace (2 upvotes)

Papers citing "SpeechMoE2: Mixture-of-Experts Model with Improved Routing" (22 / 22 papers shown)

Dendrograms of Mixing Measures for Softmax-Gated Gaussian Mixture of Experts: Consistency without Model Sweeps
Do Tien Hai, T. T. N. Mai, T. Nguyen, Nhat Ho, Binh T. Nguyen, Christopher Drovandi
14 Oct 2025

Omni-Router: Sharing Routing Decisions in Sparse Mixture-of-Experts for Speech Recognition
Zijin Gu, Tatiana Likhomanenko, Navdeep Jaitly
Tags: MoE
08 Jul 2025

LUPET: Incorporating Hierarchical Information Path into Multilingual ASR
Interspeech, 2024
Wei Liu, Jingyong Hou, Dong Yang, Muyong Cao, Tan Lee
10 Jan 2025

MoE-RBench: Towards Building Reliable Language Models with Sparse Mixture-of-Experts
Guanjie Chen, Xinyu Zhao, Tianlong Chen, Yu Cheng
Tags: MoE
17 Jun 2024

On Parameter Estimation in Deviated Gaussian Mixture of Experts
Huy Nguyen, Khai Nguyen, Nhat Ho
07 Feb 2024

From Google Gemini to OpenAI Q* (Q-Star): A Survey of Reshaping the Generative Artificial Intelligence (AI) Research Landscape
Timothy R. McIntosh, Teo Susnjak, Tong Liu, Paul Watters, Malka N. Halgamuge
18 Dec 2023

A General Theory for Softmax Gating Multinomial Logistic Mixture of Experts
International Conference on Machine Learning (ICML), 2023
Huy Nguyen, Pedram Akbarian, TrungTin Nguyen, Nhat Ho
22 Oct 2023

Direct Neural Machine Translation with Task-level Mixture of Experts Models
Isidora Chara Tourni, Subhajit Naskar
Tags: MoE
18 Oct 2023

Statistical Perspective of Top-K Sparse Softmax Gating Mixture of Experts
International Conference on Learning Representations (ICLR), 2023
Huy Nguyen, Pedram Akbarian, Fanqi Yan, Nhat Ho
Tags: MoE
25 Sep 2023

LLMCad: Fast and Scalable On-device Large Language Model Inference
Daliang Xu, Wangsong Yin, Xin Jin, Yanzhe Zhang, Shiyun Wei, Mengwei Xu, Xuanzhe Liu
08 Sep 2023

Learning When to Trust Which Teacher for Weakly Supervised ASR
Interspeech, 2023
Aakriti Agrawal, Milind Rao, Anit Kumar Sahu, Gopinath Chennupati, A. Stolcke
21 Jun 2023

Mixture-of-Expert Conformer for Streaming Multilingual ASR
Interspeech, 2023
Ke Hu, Yue Liu, Tara N. Sainath, Yu Zhang, F. Beaufays
Tags: MoE
25 May 2023

Towards Convergence Rates for Parameter Estimation in Gaussian-gated Mixture of Experts
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
Huy Nguyen, TrungTin Nguyen, Khai Nguyen, Nhat Ho
Tags: MoE
12 May 2023

Demystifying Softmax Gating Function in Gaussian Mixture of Experts
Neural Information Processing Systems (NeurIPS), 2023
Huy Nguyen, TrungTin Nguyen, Nhat Ho
05 May 2023

Modular Deep Learning
Jonas Pfeiffer, Sebastian Ruder, Ivan Vulić, Edoardo Ponti
Tags: MoMe, OOD
22 Feb 2023

ASR Bundestag: A Large-Scale Political Debate Dataset in German
Intelligent Systems with Applications (ISA), 2023
Johannes Wirth, René Peinl
12 Feb 2023

Sparsity-Constrained Optimal Transport
International Conference on Learning Representations (ICLR), 2022
Tianlin Liu, J. Puigcerver, Mathieu Blondel
Tags: OT
30 Sep 2022

A Review of Sparse Expert Models in Deep Learning
W. Fedus, J. Dean, Barret Zoph
Tags: MoE
04 Sep 2022

3M: Multi-loss, Multi-path and Multi-level Neural Networks for Speech Recognition
International Symposium on Chinese Spoken Language Processing (ISCSLP), 2022
Zhao You, Shulin Feng, Jane Polak Scowcroft, Dong Yu
07 Apr 2022

ST-MoE: Designing Stable and Transferable Sparse Expert Models
Barret Zoph, Irwan Bello, Sameer Kumar, Nan Du, Yanping Huang, J. Dean, Noam M. Shazeer, W. Fedus
Tags: MoE
17 Feb 2022

Building a Great Multi-lingual Teacher with Sparsely-gated Mixture of Experts for Speech Recognition
K. Kumatani, R. Gmyr, Andres Felipe Cruz Salinas, Linquan Liu, Wei Zuo, Devang Patel, Eric Sun, Yu Shi
Tags: MoE
10 Dec 2021

Non-asymptotic Oracle Inequalities for the Lasso in High-dimensional Mixture of Experts
TrungTin Nguyen, Hien Nguyen, Faicel Chamroukhi, Geoffrey J. McLachlan
22 Sep 2020