arXiv:1806.00187
Scaling Neural Machine Translation
1 June 2018
Myle Ott
Sergey Edunov
David Grangier
Michael Auli
AIMat
Papers citing "Scaling Neural Machine Translation"
29 / 379 papers shown
Joint Source-Target Self Attention with Locality Constraints
José A. R. Fonollosa
Noe Casas
Marta R. Costa-jussá
12
23
0
16 May 2019
Densifying Assumed-sparse Tensors: Improving Memory Efficiency and MPI Collective Performance during Tensor Accumulation for Parallelized Training of Neural Machine Translation Models
D. Çavdar
V. Codreanu
C. Karakuş
John A. Lockman
Damian Podareanu
...
Quy Ta
S. Varadharajan
Lucas A. Wilson
Rengan Xu
Pei Yang
20
3
0
10 May 2019
Low-Memory Neural Network Training: A Technical Report
N. Sohoni
Christopher R. Aberger
Megan Leszczynski
Jian Zhang
Christopher Ré
17
99
0
24 Apr 2019
BERTScore: Evaluating Text Generation with BERT
Tianyi Zhang
Varsha Kishore
Felix Wu
Kilian Q. Weinberger
Yoav Artzi
64
5,490
0
21 Apr 2019
Distributed Deep Learning Strategies For Automatic Speech Recognition
Wei Zhang
Xiaodong Cui
Ulrich Finkler
Brian Kingsbury
G. Saon
David S. Kung
M. Picheny
8
29
0
10 Apr 2019
fairseq: A Fast, Extensible Toolkit for Sequence Modeling
Myle Ott
Sergey Edunov
Alexei Baevski
Angela Fan
Sam Gross
Nathan Ng
David Grangier
Michael Auli
VLM
FaML
23
3,125
0
01 Apr 2019
Pre-trained Language Model Representations for Language Generation
Sergey Edunov
Alexei Baevski
Michael Auli
14
129
0
22 Mar 2019
CVIT-MT Systems for WAT-2018
Jerin Philip
Vinay P. Namboodiri
C. V. Jawahar
11
10
0
19 Mar 2019
Cloze-driven Pretraining of Self-attention Networks
Alexei Baevski
Sergey Edunov
Yinhan Liu
Luke Zettlemoyer
Michael Auli
8
198
0
19 Mar 2019
Massively Multilingual Neural Machine Translation
Roee Aharoni
Melvin Johnson
Orhan Firat
LRM
AI4CE
17
480
0
28 Feb 2019
The State of Sparsity in Deep Neural Networks
Trevor Gale
Erich Elsen
Sara Hooker
14
743
0
25 Feb 2019
Mixture Models for Diverse Machine Translation: Tricks of the Trade
T. Shen
Myle Ott
Michael Auli
Marc'Aurelio Ranzato
MoE
14
148
0
20 Feb 2019
The FLoRes Evaluation Datasets for Low-Resource Machine Translation: Nepali-English and Sinhala-English
Francisco Guzmán
Peng-Jen Chen
Myle Ott
J. Pino
Guillaume Lample
Philipp Koehn
Vishrav Chaudhary
Marc'Aurelio Ranzato
20
143
0
04 Feb 2019
An Effective Approach to Unsupervised Machine Translation
Mikel Artetxe
Gorka Labaka
Eneko Agirre
8
152
0
04 Feb 2019
Compressing Gradient Optimizers via Count-Sketches
Ryan Spring
Anastasios Kyrillidis
Vijai Mohan
Anshumali Shrivastava
9
35
0
01 Feb 2019
The Evolved Transformer
David R. So
Chen Liang
Quoc V. Le
ViT
22
460
0
30 Jan 2019
Pay Less Attention with Lightweight and Dynamic Convolutions
Felix Wu
Angela Fan
Alexei Baevski
Yann N. Dauphin
Michael Auli
11
604
0
29 Jan 2019
Fixup Initialization: Residual Learning Without Normalization
Hongyi Zhang
Yann N. Dauphin
Tengyu Ma
ODL
AI4CE
20
347
0
27 Jan 2019
Context in Neural Machine Translation: A Review of Models and Evaluations
Andrei Popescu-Belis
MedIm
10
28
0
25 Jan 2019
An Empirical Model of Large-Batch Training
Sam McCandlish
Jared Kaplan
Dario Amodei
OpenAI Dota Team
11
267
0
14 Dec 2018
Can I trust you more? Model-Agnostic Hierarchical Explanations
Michael Tsang
Youbang Sun
Dongxu Ren
Yan Liu
FAtt
16
25
0
12 Dec 2018
Stochastic Gradient Push for Distributed Deep Learning
Mahmoud Assran
Nicolas Loizou
Nicolas Ballas
Michael G. Rabbat
16
343
0
27 Nov 2018
Neural Phrase-to-Phrase Machine Translation
Jiangtao Feng
Lingpeng Kong
Po-Sen Huang
Chong-Jun Wang
Da Huang
Jiayuan Mao
Kan Qiao
Dengyong Zhou
AIMat
16
14
0
06 Nov 2018
Margin-based Parallel Corpus Mining with Multilingual Sentence Embeddings
Mikel Artetxe
Holger Schwenk
15
196
0
03 Nov 2018
Quasi-hyperbolic momentum and Adam for deep learning
Jerry Ma
Denis Yarats
ODL
81
129
0
16 Oct 2018
Adaptive Input Representations for Neural Language Modeling
Alexei Baevski
Michael Auli
21
386
0
28 Sep 2018
Fast and Simple Mixture of Softmaxes with BPE and Hybrid-LightRNN for Language Generation
X. Kong
Qizhe Xie
Zihang Dai
Eduard H. Hovy
11
2
0
25 Sep 2018
Large Scale Language Modeling: Converging on 40GB of Text in Four Hours
Raul Puri
Robert M. Kirby
Nikolai Yakovenko
Bryan Catanzaro
14
29
0
03 Aug 2018
Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis
Tal Ben-Nun
Torsten Hoefler
GNN
30
701
0
26 Feb 2018