SpeechMoE2: Mixture-of-Experts Model with Improved Routing

    MoE

Papers citing "SpeechMoE2: Mixture-of-Experts Model with Improved Routing"

A General Theory for Softmax Gating Multinomial Logistic Mixture of Experts
International Conference on Machine Learning (ICML), 2023
225
17
0
22 Oct 2023
Statistical Perspective of Top-K Sparse Softmax Gating Mixture of Experts
International Conference on Learning Representations (ICLR), 2023
307
23
0
25 Sep 2023
Towards Convergence Rates for Parameter Estimation in Gaussian-gated Mixture of Experts
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
319
19
0
12 May 2023
Demystifying Softmax Gating Function in Gaussian Mixture of Experts
Neural Information Processing Systems (NeurIPS), 2023
212
32
0
05 May 2023
ASR Bundestag: A Large-Scale political debate dataset in German
Intelligent Systems with Applications (ISA), 2023
178
2
0
12 Feb 2023
Sparsity-Constrained Optimal Transport
International Conference on Learning Representations (ICLR), 2022
286
28
0
30 Sep 2022
3M: Multi-loss, Multi-path and Multi-level Neural Networks for speech recognition
International Symposium on Chinese Spoken Language Processing (ISCSLP), 2022
145
10
0
07 Apr 2022