Scaling Neural Machine Translation

arXiv:1806.00187 · 1 June 2018
Myle Ott, Sergey Edunov, David Grangier, Michael Auli
AIMat

Papers citing "Scaling Neural Machine Translation"

50 / 379 papers shown
JParaCrawl: A Large Scale Web-Based English-Japanese Parallel Corpus
Makoto Morishita, Jun Suzuki, Masaaki Nagata
LRM
30 · 64 · 0 · 25 Nov 2019

Learning to Retrieve Reasoning Paths over Wikipedia Graph for Question Answering
Akari Asai, Kazuma Hashimoto, Hannaneh Hajishirzi, R. Socher, Caiming Xiong
RALM · KELM · LRM
15 · 282 · 0 · 24 Nov 2019

MUSE: Parallel Multi-Scale Attention for Sequence to Sequence Learning
Guangxiang Zhao, Xu Sun, Jingjing Xu, Zhiyuan Zhang, Liangchen Luo
LRM
14 · 49 · 0 · 17 Nov 2019

What do you mean, BERT? Assessing BERT as a Distributional Semantics Model
Timothee Mickus, Denis Paperno, Mathieu Constant, Kees van Deemter
21 · 45 · 0 · 13 Nov 2019

Mark my Word: A Sequence-to-Sequence Approach to Definition Modeling
Timothee Mickus, Denis Paperno, Mathieu Constant
16 · 29 · 0 · 13 Nov 2019

CCMatrix: Mining Billions of High-Quality Parallel Sentences on the WEB
Holger Schwenk, Guillaume Wenzek, Sergey Edunov, Edouard Grave, Armand Joulin
25 · 254 · 0 · 10 Nov 2019

Effectiveness of self-supervised pre-training for speech recognition
Alexei Baevski, Michael Auli, Abdel-rahman Mohamed
SSL
19 · 147 · 0 · 10 Nov 2019

Two-Headed Monster And Crossed Co-Attention Networks
Yaoyiran Li, Jing Jiang
19 · 0 · 0 · 10 Nov 2019

Improving Transformer Models by Reordering their Sublayers
Ofir Press, Noah A. Smith, Omer Levy
11 · 87 · 0 · 10 Nov 2019

Distilling Knowledge Learned in BERT for Text Generation
Yen-Chun Chen, Zhe Gan, Yu Cheng, Jingzhou Liu, Jingjing Liu
15 · 28 · 0 · 10 Nov 2019

Ask to Learn: A Study on Curiosity-driven Question Generation
Thomas Scialom, Jacopo Staiano
25 · 24 · 0 · 08 Nov 2019

Data Diversification: A Simple Strategy For Neural Machine Translation
Xuan-Phi Nguyen, Shafiq R. Joty, Wu Kui, A. Aw
14 · 15 · 0 · 05 Nov 2019

Machine Translation of Restaurant Reviews: New Corpus for Domain Adaptation and Robustness
Alexandre Berard, Ioan Calapodescu, Marc Dymetman, Claude Roux, Jean-Luc Meunier, Vassilina Nikoulina
9 · 27 · 0 · 31 Oct 2019

Naver Labs Europe's Systems for the Document-Level Generation and Translation Task at WNGT 2019
Fahimeh Saleh, Alexandre Berard, Ioan Calapodescu, Laurent Besacier
VLM
15 · 14 · 0 · 31 Oct 2019

Adapting Multilingual Neural Machine Translation to Unseen Languages
Surafel Melaku Lakew, Alina Karakanta, Marcello Federico, Matteo Negri, Marco Turchi
31 · 20 · 0 · 30 Oct 2019

Controlling the Output Length of Neural Machine Translation
Surafel Melaku Lakew, Mattia Antonino Di Gangi, Marcello Federico
15 · 67 · 0 · 23 Oct 2019

Robust Neural Machine Translation for Clean and Noisy Speech Transcripts
Mattia Antonino Di Gangi, Robert Enyedi, A. Brusadin, Marcello Federico
21 · 25 · 0 · 22 Oct 2019

Fully Quantized Transformer for Machine Translation
Gabriele Prato, Ella Charlaix, Mehdi Rezagholizadeh
MQ
13 · 68 · 0 · 17 Oct 2019

Transformers without Tears: Improving the Normalization of Self-Attention
Toan Q. Nguyen, Julian Salazar
36 · 224 · 0 · 14 Oct 2019

vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations
Alexei Baevski, Steffen Schneider, Michael Auli
SSL
11 · 660 · 0 · 12 Oct 2019

SlowMo: Improving Communication-Efficient Distributed SGD with Slow Momentum
Jianyu Wang, Vinayak Tantia, Nicolas Ballas, Michael G. Rabbat
4 · 200 · 0 · 01 Oct 2019

UNITER: UNiversal Image-TExt Representation Learning
Yen-Chun Chen, Linjie Li, Licheng Yu, Ahmed El Kholy, Faisal Ahmed, Zhe Gan, Yu Cheng, Jingjing Liu
VLM · OT
29 · 444 · 0 · 25 Sep 2019

Reducing Transformer Depth on Demand with Structured Dropout
Angela Fan, Edouard Grave, Armand Joulin
22 · 584 · 0 · 25 Sep 2019

Improved Variational Neural Machine Translation by Promoting Mutual Information
Arya D. McCarthy, Xian Li, Jiatao Gu, Ning Dong
DRL
22 · 7 · 0 · 19 Sep 2019

A Comparative Study on Transformer vs RNN in Speech Applications
Shigeki Karita, Nanxin Chen, Tomoki Hayashi, Takaaki Hori, H. Inaguma, ..., Ryuichi Yamamoto, Xiao-fei Wang, Shinji Watanabe, Takenori Yoshimura, Wangyou Zhang
23 · 716 · 0 · 13 Sep 2019

Hybrid Data-Model Parallel Training for Sequence-to-Sequence Recurrent Neural Network Machine Translation
Junya Ono, Masao Utiyama, Eiichiro Sumita
AIMat · AI4CE
11 · 7 · 0 · 02 Sep 2019

Improving Multi-Head Attention with Capsule Networks
Shuhao Gu, Yang Feng
12 · 12 · 0 · 31 Aug 2019

Scale Calibrated Training: Improving Generalization of Deep Networks via Scale-Specific Normalization
Zhuoran Yu, Aojun Zhou, Yukun Ma, Yudian Li, Xiaohan Zhang, Ping Luo
16 · 3 · 0 · 31 Aug 2019

Adaptively Sparse Transformers
Gonçalo M. Correia, Vlad Niculae, André F. T. Martins
8 · 252 · 0 · 30 Aug 2019

Improving Deep Transformer with Depth-Scaled Initialization and Merged Attention
Biao Zhang, Ivan Titov, Rico Sennrich
6 · 101 · 0 · 29 Aug 2019

Simple and Effective Noisy Channel Modeling for Neural Machine Translation
Kyra Yee, Nathan Ng, Yann N. Dauphin, Michael Auli
12 · 79 · 0 · 15 Aug 2019

Towards Knowledge-Based Recommender Dialog System
Qibin Chen, Junyang Lin, Yichang Zhang, Ming Ding, Yukuo Cen, Hongxia Yang, Jie Tang
19 · 237 · 0 · 15 Aug 2019

On The Evaluation of Machine Translation Systems Trained With Back-Translation
Sergey Edunov, Myle Ott, Marc'Aurelio Ranzato, Michael Auli
9 · 96 · 0 · 14 Aug 2019

Optimizing Multi-GPU Parallelization Strategies for Deep Learning Training
Saptadeep Pal, Eiman Ebrahimi, A. Zulfiqar, Yaosheng Fu, Victor Zhang, Szymon Migacz, D. Nellans, Puneet Gupta
31 · 55 · 0 · 30 Jul 2019

RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, M. Lewis, Luke Zettlemoyer, Veselin Stoyanov
AIMat
121 · 23,865 · 0 · 26 Jul 2019

DropAttention: A Regularization Method for Fully-Connected Self-Attention Networks
Zehui Lin, Pengfei Liu, Luyao Huang, Junkun Chen, Xipeng Qiu, Xuanjing Huang
3DPC
16 · 44 · 0 · 25 Jul 2019

ELI5: Long Form Question Answering
Angela Fan, Yacine Jernite, Ethan Perez, David Grangier, Jason Weston, Michael Auli
AI4MH · ELM
17 · 592 · 0 · 22 Jul 2019

Facebook FAIR's WMT19 News Translation Task Submission
Nathan Ng, Kyra Yee, Alexei Baevski, Myle Ott, Michael Auli, Sergey Edunov
VLM
6 · 393 · 0 · 15 Jul 2019

Naver Labs Europe's Systems for the WMT19 Machine Translation Robustness Task
Alexandre Berard, Ioan Calapodescu, Claude Roux
VLM
4 · 59 · 0 · 15 Jul 2019

Massively Multilingual Neural Machine Translation in the Wild: Findings and Challenges
N. Arivazhagan, Ankur Bapna, Orhan Firat, Dmitry Lepikhin, Melvin Johnson, ..., George F. Foster, Colin Cherry, Wolfgang Macherey, Z. Chen, Yonghui Wu
23 · 422 · 0 · 11 Jul 2019

A Highly Efficient Distributed Deep Learning System For Automatic Speech Recognition
Wei Zhang, Xiaodong Cui, Ulrich Finkler, G. Saon, Abdullah Kayi, A. Buyuktosunoglu, Brian Kingsbury, David S. Kung, M. Picheny
18 · 19 · 0 · 10 Jul 2019

NTT's Machine Translation Systems for WMT19 Robustness Task
Soichiro Murakami, Makoto Morishita, Tsutomu Hirao, Masaaki Nagata
VLM
10 · 9 · 0 · 09 Jul 2019

Improving Robustness in Real-World Neural Machine Translation Engines
Rohit Gupta, Patrik Lambert, Raj Nath Patel, J. Tinsley
29 · 4 · 0 · 02 Jul 2019

Making Asynchronous Stochastic Gradient Descent Work for Transformers
Alham Fikri Aji, Kenneth Heafield
19 · 13 · 0 · 08 Jun 2019

Playing the lottery with rewards and multiple languages: lottery tickets in RL and NLP
Haonan Yu, Sergey Edunov, Yuandong Tian, Ari S. Morcos
16 · 148 · 0 · 06 Jun 2019

Understanding and Improving Transformer From a Multi-Particle Dynamic System Point of View
Yiping Lu, Zhuohan Li, Di He, Zhiqing Sun, Bin Dong, Tao Qin, Liwei Wang, Tie-Yan Liu
AI4CE
13 · 168 · 0 · 06 Jun 2019

Learning Deep Transformer Models for Machine Translation
Qiang Wang, Bei Li, Tong Xiao, Jingbo Zhu, Changliang Li, Derek F. Wong, Lidia S. Chao
14 · 656 · 0 · 05 Jun 2019

Evaluating Gender Bias in Machine Translation
Gabriel Stanovsky, Noah A. Smith, Luke Zettlemoyer
11 · 393 · 0 · 03 Jun 2019

Stochastic Gradient Methods with Layer-wise Adaptive Moments for Training of Deep Networks
Boris Ginsburg, P. Castonguay, Oleksii Hrinchuk, Oleksii Kuchaiev, Vitaly Lavrukhin, Ryan Leary, Jason Chun Lok Li, Huyen Nguyen, Yang Zhang, Jonathan M. Cohen
ODL
12 · 13 · 0 · 27 May 2019

Are Sixteen Heads Really Better than One?
Paul Michel, Omer Levy, Graham Neubig
MoE
13 · 1,035 · 0 · 25 May 2019