Erasure Coded Neural Network Inference via Fisher Averaging. International Symposium on Information Theory (ISIT), 2024.
CUED at ProbSum 2023: Hierarchical Ensemble of Summarization Models. Workshop on Biomedical Natural Language Processing (BioNLP), 2023.
Accurate Knowledge Distillation with n-best Reranking. North American Chapter of the Association for Computational Linguistics (NAACL), 2023.
Pseudo-Label Training and Model Inertia in Neural Machine Translation. International Conference on Learning Representations (ICLR), 2023.
Leveraging Synthetic Targets for Machine Translation. Annual Meeting of the Association for Computational Linguistics (ACL), 2023.
Heterogeneous-Branch Collaborative Learning for Dialogue Generation. AAAI Conference on Artificial Intelligence (AAAI), 2023.
Continual Knowledge Distillation for Neural Machine Translation. Annual Meeting of the Association for Computational Linguistics (ACL), 2022.
One Reference Is Not Enough: Diverse Distillation with Reference Selection for Non-Autoregressive Translation. North American Chapter of the Association for Computational Linguistics (NAACL), 2022.
Twist Decoding: Diverse Generators Guide Each Other. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022.
GigaST: A 10,000-hour Pseudo Speech Translation Corpus. Interspeech, 2022.
Selective Knowledge Distillation for Neural Machine Translation. Annual Meeting of the Association for Computational Linguistics (ACL), 2021.
The Volctrans Neural Speech Translation System for IWSLT 2021. International Workshop on Spoken Language Translation (IWSLT), 2021.
Knowledge Distillation as Semiparametric Inference. International Conference on Learning Representations (ICLR), 2021.
Domain Adaptation and Multi-Domain Adaptation for Neural Machine Translation: A Survey. Journal of Artificial Intelligence Research (JAIR), 2021.
Sampling and Filtering of Neural Machine Translation Distillation Data. North American Chapter of the Association for Computational Linguistics (NAACL), 2021.
Text Simplification by Tagging. Workshop on Innovative Use of NLP for Building Educational Applications (BEA), 2021.
ALP-KD: Attention-Based Layer Projection for Knowledge Distillation. AAAI Conference on Artificial Intelligence (AAAI), 2020.
Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning. International Conference on Learning Representations (ICLR), 2020.
DiDi's Machine Translation System for WMT2020. Conference on Machine Translation (WMT), 2020.
Weight Distillation: Transferring the Knowledge in Neural Network Parameters. Annual Meeting of the Association for Computational Linguistics (ACL), 2020.
Compression of Deep Learning Models for Text: A Survey. ACM Transactions on Knowledge Discovery from Data (TKDD), 2020.
Cross-model Back-translated Distillation for Unsupervised Machine Translation. International Conference on Machine Learning (ICML), 2020.
Building a Multi-domain Neural Machine Translation Model using Knowledge Distillation. European Conference on Artificial Intelligence (ECAI), 2020.
Balancing Cost and Benefit with Tied-Multi Transformers. Workshop on Neural Generation and Translation (WNGT), 2020.
Neural Machine Translation: A Review and Survey. Journal of Artificial Intelligence Research (JAIR), 2019.
Multi-agent Learning for Neural Machine Translation. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019.
Multilingual Neural Machine Translation with Knowledge Distillation. International Conference on Learning Representations (ICLR), 2019.
A Stable and Effective Learning Strategy for Trainable Greedy Decoding. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2018.