Tricks for Training Sparse Translation Models
arXiv: 2110.08246
15 October 2021
Authors: Dheeru Dua, Shruti Bhosale, Vedanuj Goswami, James Cross, M. Lewis, Angela Fan
Tags: MoE
Papers citing "Tricks for Training Sparse Translation Models" (4 papers):
- Enhancing NeRF akin to Enhancing LLMs: Generalizable NeRF Transformer with Mixture-of-View-Experts (22 Aug 2023)
  Wenyan Cong, Hanxue Liang, Peihao Wang, Zhiwen Fan, Tianlong Chen, M. Varma, Yi Wang, Zhangyang Wang
  Tags: MoE
- MoEC: Mixture of Expert Clusters (19 Jul 2022)
  Yuan Xie, Shaohan Huang, Tianyu Chen, Furu Wei
  Tags: MoE
- Facebook AI WMT21 News Translation Task Submission (06 Aug 2021)
  C. Tran, Shruti Bhosale, James Cross, Philipp Koehn, Sergey Edunov, Angela Fan
  Tags: VLM
- Multi-Way, Multilingual Neural Machine Translation with a Shared Attention Mechanism (06 Jan 2016)
  Orhan Firat, Kyunghyun Cho, Yoshua Bengio
  Tags: LRM, AIMat