ResearchTrend.AI
Scaling Neural Machine Translation


1 June 2018
Myle Ott
Sergey Edunov
David Grangier
Michael Auli
    AIMat
arXiv: 1806.00187

Papers citing "Scaling Neural Machine Translation"

50 / 379 papers shown
Dynamic Curriculum Learning for Low-Resource Neural Machine Translation
Chen Xu
Bojie Hu
Yufan Jiang
Kai Feng
Zeyang Wang
Shen Huang
Qi Ju
Tong Xiao
Jingbo Zhu
15
22
0
30 Nov 2020
Facebook AI's WMT20 News Translation Task Submission
Peng-Jen Chen
Ann Lee
Changhan Wang
Naman Goyal
Angela Fan
Mary Williamson
Jiatao Gu
VLM
15
37
0
16 Nov 2020
A Hybrid Approach for Improved Low Resource Neural Machine Translation using Monolingual Data
Idris Abdulmumin
B. Galadanci
Abubakar Isa
Habeebah Adamu Kakudi
Ismaila Idris Sinan
6
6
0
14 Nov 2020
Language Models not just for Pre-training: Fast Online Neural Noisy Channel Modeling
Shruti Bhosale
Kyra Yee
Sergey Edunov
Michael Auli
50
7
0
13 Nov 2020
Analyzing Sustainability Reports Using Natural Language Processing
A. Luccioni
Emi Baylor
N. Duchêne
27
48
0
03 Nov 2020
The Volctrans Machine Translation System for WMT20
Liwei Wu
Xiao Pan
Zehui Lin
Yaoming Zhu
Mingxuan Wang
Lei Li
VLM
12
17
0
28 Oct 2020
Volctrans Parallel Corpus Filtering System for WMT 2020
Runxin Xu
Zhuo Zhi
Jun Cao
Mingxuan Wang
Lei Li
6
4
0
27 Oct 2020
Exploiting Neural Query Translation into Cross Lingual Information Retrieval
Liang Yao
Baosong Yang
Haibo Zhang
Weihua Luo
Boxing Chen
17
12
0
26 Oct 2020
Constraint Translation Candidates: A Bridge between Neural Query Translation and Cross-lingual Information Retrieval
Tianchi Bi
Liang Yao
Baosong Yang
Haibo Zhang
Weihua Luo
Boxing Chen
101
14
0
26 Oct 2020
Multi-Unit Transformers for Neural Machine Translation
Jianhao Yan
Fandong Meng
Jie Zhou
12
17
0
21 Oct 2020
Transition-based Parsing with Stack-Transformers
Ramón Fernández Astudillo
Miguel Ballesteros
Tahira Naseem
Austin Blodgett
Radu Florian
48
71
0
20 Oct 2020
Summary-Oriented Question Generation for Informational Queries
Xusen Yin
Li Zhou
Kevin Small
Jonathan May
13
3
0
19 Oct 2020
Revisiting Modularized Multilingual NMT to Meet Industrial Demands
Sungwon Lyu
Bokyung Son
Kichang Yang
Jaekyoung Bae
MoE
13
20
0
19 Oct 2020
Ensemble Distillation for Structured Prediction: Calibrated, Accurate, Fast-Choose Three
Steven Reich
David Mueller
Nicholas Andrews
BDL
OOD
UQCV
17
13
0
13 Oct 2020
With Little Power Comes Great Responsibility
Dallas Card
Peter Henderson
Urvashi Khandelwal
Robin Jia
Kyle Mahowald
Dan Jurafsky
230
115
0
13 Oct 2020
Self-Paced Learning for Neural Machine Translation
Yu Wan
Baosong Yang
Derek F. Wong
Yikai Zhou
Lidia S. Chao
Haibo Zhang
Boxing Chen
59
49
0
09 Oct 2020
MLQE-PE: A Multilingual Quality Estimation and Post-Editing Dataset
M. Fomicheva
Shuo Sun
E. Fonseca
Chrysoula Zerva
Frédéric Blain
Vishrav Chaudhary
Francisco Guzmán
Nina Lopatina
Lucia Specia
André F. T. Martins
19
67
0
09 Oct 2020
Shallow-to-Deep Training for Neural Machine Translation
Bei Li
Ziyang Wang
Hui Liu
Yufan Jiang
Quan Du
Tong Xiao
Huizhen Wang
Jingbo Zhu
6
49
0
08 Oct 2020
TeaForN: Teacher-Forcing with N-grams
Sebastian Goodman
Nan Ding
Radu Soricut
16
19
0
07 Oct 2020
Pre-training Multilingual Neural Machine Translation by Leveraging Alignment Information
Zehui Lin
Xiao Pan
Mingxuan Wang
Xipeng Qiu
Jiangtao Feng
Hao Zhou
Lei Li
10
125
0
07 Oct 2020
A Closer Look at Codistillation for Distributed Training
Shagun Sodhani
Olivier Delalleau
Mahmoud Assran
Koustuv Sinha
Nicolas Ballas
Michael G. Rabbat
19
8
0
06 Oct 2020
If beam search is the answer, what was the question?
Clara Meister
Tim Vieira
Ryan Cotterell
6
138
0
06 Oct 2020
Data Rejuvenation: Exploiting Inactive Training Examples for Neural Machine Translation
Wenxiang Jiao
Xing Wang
Shilin He
Irwin King
Michael R. Lyu
Zhaopeng Tu
16
26
0
06 Oct 2020
How Effective is Task-Agnostic Data Augmentation for Pretrained Transformers?
Shayne Longpre
Yu Wang
Christopher DuBois
ViT
17
83
0
05 Oct 2020
Deep Transformers with Latent Depth
Xian Li
Asa Cooper Stickland
Yuqing Tang
X. Kong
19
22
0
28 Sep 2020
HetSeq: Distributed GPU Training on Heterogeneous Infrastructure
Yifan Ding
Nicholas Botzer
Tim Weninger
VLM
MoE
18
5
0
25 Sep 2020
An Empirical Study on Neural Keyphrase Generation
Rui Meng
Xingdi Yuan
Tong Wang
Sanqiang Zhao
Adam Trischler
Daqing He
16
41
0
22 Sep 2020
Unsupervised Parallel Corpus Mining on Web Data
Guokun Lai
Zihang Dai
Yiming Yang
9
8
0
18 Sep 2020
Autoregressive Knowledge Distillation through Imitation Learning
Alexander Lin
Jeremy Wohlwend
Howard Chen
Tao Lei
42
37
0
15 Sep 2020
Very Deep Transformers for Neural Machine Translation
Xiaodong Liu
Kevin Duh
Liyuan Liu
Jianfeng Gao
8
102
0
18 Aug 2020
The Sockeye 2 Neural Machine Translation Toolkit at AMTA 2020
Tobias Domhan
Michael J. Denkowski
David Vilar
Xing Niu
F. Hieber
Kenneth Heafield
6
52
0
11 Aug 2020
Revisiting Low Resource Status of Indian Languages in Machine Translation
Jerin Philip
Shashank Siripragada
Vinay P. Namboodiri
C. V. Jawahar
13
26
0
11 Aug 2020
Stochastic Normalized Gradient Descent with Momentum for Large-Batch Training
Shen-Yi Zhao
Chang-Wei Shi
Yin-Peng Xie
Wu-Jun Li
ODL
18
8
0
28 Jul 2020
Modeling Voting for System Combination in Machine Translation
Xuancheng Huang
Jiacheng Zhang
Zhixing Tan
Derek F. Wong
Huanbo Luan
Jingfang Xu
Maosong Sun
Yang Liu
16
8
0
14 Jul 2020
scb-mt-en-th-2020: A Large English-Thai Parallel Corpus
Lalita Lowphansirikul
Charin Polpanumas
Attapol T. Rutherford
Sarana Nutanong
LRM
16
22
0
07 Jul 2020
TICO-19: the Translation Initiative for Covid-19
Antonios Anastasopoulos
A. Cattelan
Zi-Yi Dou
Marcello Federico
C. Federman
...
Mengmeng Niu
A. Oktem
Eric Paquin
G. Tang
Sylwia Tur
19
88
0
03 Jul 2020
Differentiable Window for Dynamic Local Attention
Thanh-Tung Nguyen
Xuan-Phi Nguyen
Shafiq R. Joty
Xiaoli Li
12
12
0
24 Jun 2020
MaxVA: Fast Adaptation of Step Sizes by Maximizing Observed Variance of Gradients
Chenfei Zhu
Yu Cheng
Zhe Gan
Furong Huang
Jingjing Liu
Tom Goldstein
ODL
27
2
0
21 Jun 2020
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations
Alexei Baevski
Henry Zhou
Abdel-rahman Mohamed
Michael Auli
SSL
8
5,547
0
20 Jun 2020
Deep Encoder, Shallow Decoder: Reevaluating Non-autoregressive Machine Translation
Jungo Kasai
Nikolaos Pappas
Hao Peng
James Cross
Noah A. Smith
30
134
0
18 Jun 2020
Multi-branch Attentive Transformer
Yang Fan
Shufang Xie
Yingce Xia
Lijun Wu
Tao Qin
Xiang-Yang Li
Tie-Yan Liu
6
17
0
18 Jun 2020
On the Computational Power of Transformers and its Implications in Sequence Modeling
S. Bhattamishra
Arkil Patel
Navin Goyal
25
63
0
16 Jun 2020
Wat zei je? Detecting Out-of-Distribution Translations with Variational Transformers
Tim Z. Xiao
Aidan N. Gomez
Y. Gal
UQLM
11
33
0
08 Jun 2020
Linformer: Self-Attention with Linear Complexity
Sinong Wang
Belinda Z. Li
Madian Khabsa
Han Fang
Hao Ma
58
1,646
0
08 Jun 2020
Growing Together: Modeling Human Language Learning With n-Best Multi-Checkpoint Machine Translation
El Moatez Billah Nagoudi
Muhammad Abdul-Mageed
H. Cavusoglu
8
2
0
07 Jun 2020
Sponge Examples: Energy-Latency Attacks on Neural Networks
Ilia Shumailov
Yiren Zhao
Daniel Bates
Nicolas Papernot
Robert D. Mullins
Ross J. Anderson
SILM
14
127
0
05 Jun 2020
MLE-guided parameter search for task loss minimization in neural sequence modeling
Sean Welleck
Kyunghyun Cho
11
10
0
04 Jun 2020
Enhanced back-translation for low resource neural machine translation using self-training
Idris Abdulmumin
B. Galadanci
Abubakar Isa
SyDa
8
2
0
04 Jun 2020
Self-Training for End-to-End Speech Translation
J. Pino
Qiantong Xu
Xutai Ma
M. Dousti
Yun Tang
33
59
0
03 Jun 2020
Cross-model Back-translated Distillation for Unsupervised Machine Translation
Xuan-Phi Nguyen
Shafiq R. Joty
Thanh-Tung Nguyen
Wu Kui
A. Aw
19
14
0
03 Jun 2020