Direct Neural Machine Translation with Task-level Mixture of Experts models

18 October 2023

Papers citing "Direct Neural Machine Translation with Task-level Mixture of Experts models"

Title
Tutel: Adaptive Mixture-of-Experts at Scale
  Changho Hwang, Wei Cui, Yifan Xiong, Ziyue Yang, Ze Liu, ..., Joe Chau, Peng Cheng, Fan Yang, Mao Yang, Y. Xiong
  MoE | 92 108 0 | 07 Jun 2022

Sparse Mixers: Combining MoE and Mixing to build a more efficient BERT
  James Lee-Thorp, Joshua Ainslie
  MoE | 30 11 0 | 24 May 2022

Beyond Distillation: Task-level Mixture-of-Experts for Efficient Inference
  Sneha Kudugunta, Yanping Huang, Ankur Bapna, M. Krikun, Dmitry Lepikhin, Minh-Thang Luong, Orhan Firat
  MoE | 119 104 0 | 24 Sep 2021

Improving Multilingual Translation by Representation and Gradient Regularization
  Yilin Yang, Akiko Eriguchi, Alexandre Muzio, Prasad Tadepalli, Stefan Lee, Hany Hassan
  47 40 0 | 10 Sep 2021

Pivot-based Transfer Learning for Neural Machine Translation between Non-English Languages
  Yunsu Kim, P. Petrov, Pavel Petrushkov, Shahram Khadivi, Hermann Ney
  LRM | 42 79 0 | 20 Sep 2019

Multi-Way, Multilingual Neural Machine Translation with a Shared Attention Mechanism
  Orhan Firat, Kyunghyun Cho, Yoshua Bengio
  LRM, AIMat | 206 622 0 | 06 Jan 2016