ResearchTrend.AI
On the Impact of Various Types of Noise on Neural Machine Translation

31 May 2018
Huda Khayrallah
Philipp Koehn
    AAML
arXiv: 1805.12282

Papers citing "On the Impact of Various Types of Noise on Neural Machine Translation"

35 / 35 papers shown
Title
MultiOCR-QA: Dataset for Evaluating Robustness of LLMs in Question Answering on Multilingual OCR Texts
Bhawna Piryani
Jamshid Mozafari
Abdelrahman Abdallah
Antoine Doucet
Adam Jatowt
47
1
0
24 Feb 2025
Adapters for Altering LLM Vocabularies: What Languages Benefit the Most?
HyoJung Han
Akiko Eriguchi
Haoran Xu
Hieu T. Hoang
Marine Carpuat
Huda Khayrallah
VLM
34
2
0
12 Oct 2024
Cogs in a Machine, Doing What They're Meant to Do -- The AMI Submission to the WMT24 General Translation Task
Atli Jasonarson
Hinrik Hafsteinsson
Bjarki Ármannsson
Steinþór Steingrímsson
SyDa
32
2
0
04 Oct 2024
X-ALMA: Plug & Play Modules and Adaptive Rejection for Quality Translation at Scale
Haoran Xu
Kenton W. Murray
Philipp Koehn
Hieu T. Hoang
Akiko Eriguchi
Huda Khayrallah
29
7
0
04 Oct 2024
How to Learn in a Noisy World? Self-Correcting the Real-World Data Noise in Machine Translation
Yan Meng
Di Wu
Christof Monz
28
1
0
02 Jul 2024
Critical Learning Periods: Leveraging Early Training Dynamics for Efficient Data Pruning
E. Chimoto
Jay Gala
Orevaoghene Ahia
Julia Kreutzer
Bruce A. Bassett
Sara Hooker
VLM
39
4
0
29 May 2024
Multi-News+: Cost-efficient Dataset Cleansing via LLM-based Data Annotation
Juhwan Choi
Jungmin Yun
Kyohoon Jin
Youngbin Kim
32
4
0
15 Apr 2024
Adaptative Bilingual Aligning Using Multilingual Sentence Embedding
Olivier Kraif
20
0
0
18 Mar 2024
SentAlign: Accurate and Scalable Sentence Alignment
Steinþór Steingrímsson
H. Loftsson
Andy Way
20
7
0
15 Nov 2023
Separating the Wheat from the Chaff with BREAD: An open-source benchmark and metrics to detect redundancy in text
Isaac Caswell
Lisa Wang
Isabel Papadimitriou
26
0
0
11 Nov 2023
There's no Data Like Better Data: Using QE Metrics for MT Data Filtering
Jan-Thorsten Peter
David Vilar
Daniel Deutsch
Mara Finkelstein
Juraj Juraska
Markus Freitag
9
16
0
09 Nov 2023
An Investigation of Noise in Morphological Inflection
Adam Wiemerslage
Changbing Yang
Garrett Nicolai
Miikka Silfverberg
Katharina Kann
25
2
0
26 May 2023
Out-of-Distribution Generalization in Text Classification: Past, Present, and Future
Linyi Yang
Y. Song
Xuan Ren
Chenyang Lyu
Yidong Wang
Lingqiao Liu
Jindong Wang
Jennifer Foster
Yue Zhang
OOD
34
2
0
23 May 2023
Dissociating language and thought in large language models
Kyle Mahowald
Anna A. Ivanova
I. Blank
Nancy Kanwisher
J. Tenenbaum
Evelina Fedorenko
ELM
ReLM
29
209
0
16 Jan 2023
Competency-Aware Neural Machine Translation: Can Machine Translation Know its Own Translation Quality?
Pei Zhang
Baosong Yang
Hao-Ran Wei
Dayiheng Liu
Kai Fan
Luo Si
Jun Xie
21
3
0
25 Nov 2022
Separating Grains from the Chaff: Using Data Filtering to Improve Multilingual Translation for Low-Resourced African Languages
Idris Abdulmumin
Michael Beukman
Jesujoba Oluwadara Alabi
Chris C. Emezue
Everlyn Asiko
...
Shamsuddeen Hassan Muhammad
Mofetoluwa Adeyemi
Oreen Yousuf
Sahib Singh
T. Gwadabe
31
6
0
19 Oct 2022
CipherDAug: Ciphertext based Data Augmentation for Neural Machine Translation
Nishant Kambhatla
Logan Born
Anoop Sarkar
13
16
0
01 Apr 2022
Scientometric Review of Artificial Intelligence for Operations & Maintenance of Wind Turbines: The Past, Present and Future
Joyjit Chatterjee
Nina Dethlefs
26
83
0
30 Mar 2022
Pre-Trained Multilingual Sequence-to-Sequence Models: A Hope for Low-Resource Language Translation?
E. Lee
Sarubi Thillainathan
Shravan Nayak
Surangika Ranathunga
David Ifeoluwa Adelani
Ruisi Su
Arya D. McCarthy
VLM
21
43
0
16 Mar 2022
Noisy UGC Translation at the Character Level: Revisiting Open-Vocabulary Capabilities and Robustness of Char-Based Models
José Carlos Rosales Núñez
Guillaume Wisniewski
Djamé Seddah
17
8
0
24 Oct 2021
Improving Arabic Diacritization by Learning to Diacritize and Translate
Brian Thompson
A. Alshehri
32
10
0
29 Sep 2021
Rethinking Data Augmentation for Low-Resource Neural Machine Translation: A Multi-Task Learning Approach
Víctor M. Sánchez-Cartagena
M. Esplà-Gomis
Juan Antonio Pérez-Ortiz
Felipe Sánchez-Martínez
42
27
0
08 Sep 2021
Sampling-Based Approximations to Minimum Bayes Risk Decoding for Neural Machine Translation
Bryan Eikema
Wilker Aziz
26
45
0
10 Aug 2021
Robust Embeddings Via Distributions
Kira A. Selby
Yinong Wang
Ruizhe Wang
Peyman Passban
Ahmad Rashid
Mehdi Rezagholizadeh
Pascal Poupart
OOD
27
3
0
17 Apr 2021
Domain Adaptation and Multi-Domain Adaptation for Neural Machine Translation: A Survey
Danielle Saunders
AI4CE
17
85
0
14 Apr 2021
Not Low-Resource Anymore: Aligner Ensembling, Batch Filtering, and New Datasets for Bengali-English Machine Translation
Tahmid Hasan
Abhik Bhattacharjee
Kazi Samin Mubasshir
Masum Hasan
Madhusudan Basak
M. Rahman
Rifat Shahriyar
VLM
15
72
0
20 Sep 2020
Data Weighted Training Strategies for Grammatical Error Correction
Jared Lichtarge
Chris Alberti
Shankar Kumar
15
46
0
07 Aug 2020
The Unreasonable Volatility of Neural Machine Translation Models
Marzieh Fadaee
Christof Monz
20
16
0
25 May 2020
Translationese as a Language in "Multilingual" NMT
Parker Riley
Isaac Caswell
Markus Freitag
David Grangier
24
42
0
10 Nov 2019
Low-Resource Corpus Filtering using Multilingual Sentence Embeddings
Vishrav Chaudhary
Y. Tang
Francisco Guzmán
Holger Schwenk
Philipp Koehn
18
77
0
20 Jun 2019
Dynamically Composing Domain-Data Selection with Clean-Data Selection by "Co-Curricular Learning" for Neural Machine Translation
Wei Wang
Isaac Caswell
Ciprian Chelba
26
57
0
03 Jun 2019
Membership Inference Attacks on Sequence-to-Sequence Models: Is My Data In Your Machine Translation System?
Sorami Hisamoto
Matt Post
Kevin Duh
MIACV
SLR
28
106
0
11 Apr 2019
Improving Robustness of Machine Translation with Synthetic Noise
Vaibhav
Sumeet Singh
Craig Alan Stewart
Graham Neubig
16
82
0
25 Feb 2019
Back-Translation Sampling by Targeting Difficult Words in Neural Machine Translation
Marzieh Fadaee
Christof Monz
14
98
0
27 Aug 2018
Six Challenges for Neural Machine Translation
Philipp Koehn
Rebecca Knowles
AAML
AIMat
218
1,208
0
12 Jun 2017