Parallel Corpus Filtering via Pre-trained Language Models

13 May 2020

Papers citing "Parallel Corpus Filtering via Pre-trained Language Models"

A kinetic-based regularization method for data science applications

464

06 Mar 2025

Improving the quality of Web-mined Parallel Corpora of Low-Resource Languages using Debiasing Heuristics

416

26 Feb 2025

Positive Text Reframing under Multi-strategy Optimization

Bo Liu

336

25 Jul 2024

Critical Learning Periods: Leveraging Early Training Dynamics for Efficient Data Pruning

447

29 May 2024

Adversarial Fine-Tuning of Language Models: An Iterative Optimisation Approach for the Generation and Detection of Problematic Content

Jack Miller

244

26 Aug 2023

Discovering Language Model Behaviors with Model-Written Evaluations

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

...

Deep Ganguli

440

692

19 Dec 2022

High-Resource Methodological Bias in Low-Resource Investigations

Maartje ter Hoeve

David Grangier

Natalie Schluter

296

14 Nov 2022

Faithfulness in Natural Language Generation: A Systematic Survey of Analysis, Evaluation and Optimization Methods

398

10 Mar 2022

Empirical Analysis of Korean Public AI Hub Parallel Corpora and in-depth Analysis using LIWC

148

28 Oct 2021

Neural Machine Translation for Low-Resource Languages: A Survey

ACM Computing Surveys (CSUR), 2021

Surangika Ranathunga

E. Lee

Marjana Prifti Skenduli

Ravi Shekhar

Mehreen Alam

Rishemjit Kaur

383

346

29 Jun 2021

Don't Rule Out Monolingual Speakers: A Method For Crowdsourcing Machine Translation Data

Annual Meeting of the Association for Computational Linguistics (ACL), 2021

Rajat Bhatnagar

Ananya Ganesh

Katharina Kann

187

12 Jun 2021

Prevent the Language Model from being Overconfident in Neural Machine Translation

Annual Meeting of the Association for Computational Linguistics (ACL), 2021

Mengqi Miao

Fandong Meng

Yijin Liu

Xiao-Hua Zhou

Jie Zhou

406

24 May 2021

The Curious Case of Hallucinations in Neural Machine Translation

North American Chapter of the Association for Computational Linguistics (NAACL), 2021

Vikas Raunak

Arul Menezes

Marcin Junczys-Dowmunt

640

232

14 Apr 2021

Assessing Reference-Free Peer Evaluation for Machine Translation

North American Chapter of the Association for Computational Linguistics (NAACL), 2021

Colin Cherry

211

12 Apr 2021

Score Combination for Improved Parallel Corpus Filtering for Low Resource Conditions

Conference on Machine Translation (WMT), 2020

Muhammad N. ElNokrashy

177

16 Nov 2020

Detecting Hallucinated Content in Conditional Neural Sequence Generation

Graham Neubig

Luke Zettlemoyer

518

200

05 Nov 2020

DiDi's Machine Translation System for WMT2020

Conference on Machine Translation (WMT), 2020

Xiangang Li

178

16 Oct 2020