Quality Does Matter: A Detailed Look at the Quality and Utility of Web-Mined Parallel Corpora

Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2024

12 February 2024

Papers citing "Quality Does Matter: A Detailed Look at the Quality and Utility of Web-Mined Parallel Corpora"

7 / 7 papers shown

How Do LLMs Persuade? Linear Probes Can Uncover Persuasion Dynamics in Multi-Turn Conversations

Brandon Jaipersaud

David M. Krueger

Ekdeep Singh Lubana

163

07 Aug 2025

Improving the quality of Web-mined Parallel Corpora of Low-Resource Languages using Debiasing Heuristics

418

26 Feb 2025

Sinhala Transliteration: A Comparative Analysis Between Rule-based and Seq2Seq Approaches

451

03 Jan 2025

How to Learn in a Noisy World? Self-Correcting the Real-World Data Noise in Machine Translation

Yan Meng

Di Wu

Christof Monz

477

02 Jul 2024

A Recipe of Parallel Corpora Exploitation for Multilingual Large Language Models

Peiqin Lin

Marcely Zanon Boito

Hinrich Schütze

618

29 Jun 2024

Machine Translation Models are Zero-Shot Detectors of Translation Direction

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

671

12 Jan 2024

MC$^2$: Towards Transparent and Culturally-Aware NLP for Minority Languages in China

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

331

14 Nov 2023