
Tricks for Training Sparse Translation Models

arXiv:2110.08246 · 15 October 2021
Dheeru Dua, Shruti Bhosale, Vedanuj Goswami, James Cross, M. Lewis, Angela Fan
MoE

Papers citing "Tricks for Training Sparse Translation Models"

4 / 4 papers shown
Enhancing NeRF akin to Enhancing LLMs: Generalizable NeRF Transformer with Mixture-of-View-Experts
Wenyan Cong, Hanxue Liang, Peihao Wang, Zhiwen Fan, Tianlong Chen, M. Varma, Yi Wang, Zhangyang Wang
MoE · 22 Aug 2023
MoEC: Mixture of Expert Clusters
Yuan Xie, Shaohan Huang, Tianyu Chen, Furu Wei
MoE · 19 Jul 2022
Facebook AI WMT21 News Translation Task Submission
C. Tran, Shruti Bhosale, James Cross, Philipp Koehn, Sergey Edunov, Angela Fan
VLM · 06 Aug 2021
Multi-Way, Multilingual Neural Machine Translation with a Shared Attention Mechanism
Orhan Firat, Kyunghyun Cho, Yoshua Bengio
LRM, AIMat · 06 Jan 2016