1806.00187
Scaling Neural Machine Translation
1 June 2018
Myle Ott
Sergey Edunov
David Grangier
Michael Auli
AIMat
Papers citing
"Scaling Neural Machine Translation"
50 / 379 papers shown
Efficient Large Scale Language Modeling with Mixtures of Experts
Mikel Artetxe
Shruti Bhosale
Naman Goyal
Todor Mihaylov
Myle Ott
...
Jeff Wang
Luke Zettlemoyer
Mona T. Diab
Zornitsa Kozareva
Ves Stoyanov
MoE
56
188
0
20 Dec 2021
Characterizing and addressing the issue of oversmoothing in neural autoregressive sequence modeling
Ilia Kulikov
M. Eremeev
Kyunghyun Cho
16
8
0
16 Dec 2021
TransZero++: Cross Attribute-Guided Transformer for Zero-Shot Learning
Shiming Chen
Zi-Quan Hong
Wenjin Hou
Guosen Xie
Yibing Song
Jian-jun Zhao
Xinge You
Shuicheng Yan
Ling Shao
ViT
17
44
0
16 Dec 2021
TransZero: Attribute-guided Transformer for Zero-Shot Learning
Shiming Chen
Ziming Hong
Yang Liu
Guosen Xie
Baigui Sun
Hao Li
Qinmu Peng
Kelvin Lu
Xinge You
ViT
42
131
0
03 Dec 2021
Transformer-based Korean Pretrained Language Models: A Survey on Three Years of Progress
Kichang Yang
KELM
VLM
29
11
0
25 Nov 2021
DBIA: Data-free Backdoor Injection Attack against Transformer Networks
Peizhuo Lv
Hualong Ma
Jiachen Zhou
Ruigang Liang
Kai Chen
Shengzhi Zhang
Yunfei Yang
24
15
0
22 Nov 2021
Combined Scaling for Zero-shot Transfer Learning
Hieu H. Pham
Zihang Dai
Golnaz Ghiasi
Kenji Kawaguchi
Hanxiao Liu
...
Yi-Ting Chen
Minh-Thang Luong
Yonghui Wu
Mingxing Tan
Quoc V. Le
VLM
4
191
0
19 Nov 2021
High Quality Rather than High Model Probability: Minimum Bayes Risk Decoding with Neural Metrics
Markus Freitag
David Grangier
Qijun Tan
Bowen Liang
22
92
0
17 Nov 2021
Tip-Adapter: Training-free CLIP-Adapter for Better Vision-Language Modeling
Renrui Zhang
Rongyao Fang
Wei Zhang
Peng Gao
Kunchang Li
Jifeng Dai
Yu Qiao
Hongsheng Li
VLM
189
385
0
06 Nov 2021
Sustainable AI: Environmental Implications, Challenges and Opportunities
Carole-Jean Wu
Ramya Raghavendra
Udit Gupta
Bilge Acun
Newsha Ardalani
...
Maximilian Balandat
Joe Spisak
R. Jain
Michael G. Rabbat
K. Hazelwood
40
380
0
30 Oct 2021
PhoMT: A High-Quality and Large-Scale Benchmark Dataset for Vietnamese-English Machine Translation
Long Doan
L. T. Nguyen
Nguyen Luong Tran
T. Hoang
Dat Quoc Nguyen
33
22
0
23 Oct 2021
Simple Dialogue System with AUDITED
Eugenio Clerico
Piotr Koniusz
11
2
0
22 Oct 2021
Transformer Acceleration with Dynamic Sparse Attention
Liu Liu
Zheng Qu
Zhaodong Chen
Yufei Ding
Yuan Xie
19
20
0
21 Oct 2021
Multilingual Unsupervised Neural Machine Translation with Denoising Adapters
A. Ustun
Alexandre Berard
Laurent Besacier
Matthias Gallé
25
44
0
20 Oct 2021
Discontinuous Grammar as a Foreign Language
Daniel Fernández-González
Carlos Gómez-Rodríguez
50
9
0
20 Oct 2021
Multilingual Domain Adaptation for NMT: Decoupling Language and Domain Information with Adapters
Asa Cooper Stickland
Alexandre Berard
Vassilina Nikoulina
AI4CE
11
28
0
18 Oct 2021
Differentially Private Fine-tuning of Language Models
Da Yu
Saurabh Naik
A. Backurs
Sivakanth Gopi
Huseyin A. Inan
...
Y. Lee
Andre Manoel
Lukas Wutschitz
Sergey Yekhanin
Huishuai Zhang
134
346
0
13 Oct 2021
An Exploration of Self-Supervised Pretrained Representations for End-to-End Speech Recognition
Xuankai Chang
Takashi Maekaku
Pengcheng Guo
Jing Shi
Yen-Ju Lu
...
Tianzi Wang
Shu-Wen Yang
Yu Tsao
Hung-yi Lee
Shinji Watanabe
SSL
AI4TS
16
81
0
09 Oct 2021
Taming Sparsely Activated Transformer with Stochastic Experts
Simiao Zuo
Xiaodong Liu
Jian Jiao
Young Jin Kim
Hany Hassan
Ruofei Zhang
T. Zhao
Jianfeng Gao
MoE
37
108
0
08 Oct 2021
ATISS: Autoregressive Transformers for Indoor Scene Synthesis
Despoina Paschalidou
Amlan Kar
Maria Shugrina
Karsten Kreis
Andreas Geiger
Sanja Fidler
3DV
ViT
29
148
0
07 Oct 2021
8-bit Optimizers via Block-wise Quantization
Tim Dettmers
M. Lewis
Sam Shleifer
Luke Zettlemoyer
MQ
17
269
0
06 Oct 2021
Conditional Poisson Stochastic Beam Search
Clara Meister
Afra Amini
Tim Vieira
Ryan Cotterell
24
10
0
22 Sep 2021
The NiuTrans Machine Translation Systems for WMT21
Yuhao Zhang
Tao Zhou
Bin Wei
Runzhe Cao
Yongyu Mu
...
Weiqiao Shan
Yinqiao Li
Bei Li
Tong Xiao
Jingbo Zhu
30
17
0
22 Sep 2021
ARCH: Efficient Adversarial Regularized Training with Caching
Simiao Zuo
Chen Liang
Haoming Jiang
Pengcheng He
Xiaodong Liu
Jianfeng Gao
Weizhu Chen
T. Zhao
AAML
28
3
0
15 Sep 2021
Non-autoregressive Transformer with Unified Bidirectional Decoder for Automatic Speech Recognition
Chuan-Fei Zhang
Y. Liu
Tianren Zhang
Songlu Chen
Feng Chen
Xu-Cheng Yin
17
8
0
14 Sep 2021
SHAPE: Shifted Absolute Position Embedding for Transformers
Shun Kiyono
Sosuke Kobayashi
Jun Suzuki
Kentaro Inui
233
45
0
13 Sep 2021
AfroMT: Pretraining Strategies and Reproducible Benchmarks for Translation of 8 African Languages
Machel Reid
Junjie Hu
Graham Neubig
Y. Matsuo
68
31
0
10 Sep 2021
BERT, mBERT, or BiBERT? A Study on Contextualized Embeddings for Neural Machine Translation
Haoran Xu
Benjamin Van Durme
Kenton W. Murray
47
57
0
09 Sep 2021
Competence-based Curriculum Learning for Multilingual Machine Translation
Mingliang Zhang
Fandong Meng
Y. Tong
Jie Zhou
26
16
0
09 Sep 2021
What's Hidden in a One-layer Randomly Weighted Transformer?
Sheng Shen
Z. Yao
Douwe Kiela
Kurt Keutzer
Michael W. Mahoney
24
4
0
08 Sep 2021
RefineCap: Concept-Aware Refinement for Image Captioning
Yekun Chai
Shuo Jin
Junliang Xing
VLM
10
0
0
08 Sep 2021
Mixup Decoding for Diverse Machine Translation
Jicheng Li
Pengzhi Gao
Xuanfu Wu
Yang Feng
Zhongjun He
Hua-Hong Wu
Haifeng Wang
27
14
0
08 Sep 2021
An Unsupervised Method for Building Sentence Simplification Corpora in Multiple Languages
Xinyu Lu
Jipeng Qiang
Yun Li
Yunhao Yuan
Yi Zhu
22
19
0
01 Sep 2021
Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
Ofir Press
Noah A. Smith
M. Lewis
250
695
0
27 Aug 2021
Recurrent multiple shared layers in Depth for Neural Machine Translation
Guoliang Li
Yiyang Li
MoE
6
1
0
23 Aug 2021
The paradox of the compositionality of natural language: a neural machine translation case study
Verna Dankers
Elia Bruni
Dieuwke Hupkes
CoGe
162
75
0
12 Aug 2021
Video Transformer for Deepfake Detection with Incremental Learning
Sohail Ahmed Khan
Hang Dai
ViT
16
62
0
11 Aug 2021
Towards Continual Entity Learning in Language Models for Conversational Agents
R. Gadde
I. Bulyko
KELM
14
1
0
30 Jul 2021
What Do You Get When You Cross Beam Search with Nucleus Sampling?
Uri Shaham
Omer Levy
17
10
0
20 Jul 2021
On the Copying Behaviors of Pre-Training for Neural Machine Translation
Xuebo Liu
Longyue Wang
Derek F. Wong
Liang Ding
Lidia S. Chao
Shuming Shi
Zhaopeng Tu
19
25
0
17 Jul 2021
Poly-NL: Linear Complexity Non-local Layers with Polynomials
F. Babiloni
Ioannis Marras
Filippos Kokkinos
Jiankang Deng
Grigorios G. Chrysos
S. Zafeiriou
31
6
0
06 Jul 2021
R-Drop: Regularized Dropout for Neural Networks
Xiaobo Liang
Lijun Wu
Juntao Li
Yue Wang
Qi Meng
Tao Qin
Wei Chen
M. Zhang
Tie-Yan Liu
41
424
0
28 Jun 2021
Language Models are Good Translators
Shuo Wang
Zhaopeng Tu
Zhixing Tan
Wenxuan Wang
Maosong Sun
Yang Liu
19
21
0
25 Jun 2021
On the Evaluation of Machine Translation for Terminology Consistency
Md Mahfuz Ibn Alam
Antonios Anastasopoulos
Laurent Besacier
James Cross
Matthias Gallé
Philipp Koehn
Vassilina Nikoulina
40
33
0
22 Jun 2021
Distributed Deep Learning in Open Collaborations
Michael Diskin
Alexey Bukhtiyarov
Max Ryabinin
Lucile Saulnier
Quentin Lhoest
...
Denis Mazur
Ilia Kobelev
Yacine Jernite
Thomas Wolf
Gennady Pekhimenko
FedML
33
54
0
18 Jun 2021
Recurrent Stacking of Layers in Neural Networks: An Application to Neural Machine Translation
Raj Dabre
Atsushi Fujita
14
1
0
18 Jun 2021
Bad Characters: Imperceptible NLP Attacks
Nicholas Boucher
Ilia Shumailov
Ross J. Anderson
Nicolas Papernot
AAML
SILM
16
103
0
18 Jun 2021
Federated Learning with Buffered Asynchronous Aggregation
John Nguyen
Kshitiz Malik
Hongyuan Zhan
Ashkan Yousefpour
Michael G. Rabbat
Mani Malek
Dzmitry Huba
FedML
30
288
0
11 Jun 2021
Input Augmentation Improves Constrained Beam Search for Neural Machine Translation: NTT at WAT 2021
Katsuki Chousa
Makoto Morishita
31
6
0
10 Jun 2021
FastSeq: Make Sequence Generation Faster
Yu Yan
Fei Hu
Jiusheng Chen
Nikhil Bhendawade
Ting Ye
Yeyun Gong
Nan Duan
Desheng Cui
Bingyu Chi
Ruifei Zhang
VLM
24
15
0
08 Jun 2021