1907.05791
Cited By
WikiMatrix: Mining 135M Parallel Sentences in 1620 Language Pairs from Wikipedia
10 July 2019
Holger Schwenk
Vishrav Chaudhary
Shuo Sun
Hongyu Gong
Francisco Guzmán
CVBM
Papers citing "WikiMatrix: Mining 135M Parallel Sentences in 1620 Language Pairs from Wikipedia"
6 / 56 papers shown
Not Low-Resource Anymore: Aligner Ensembling, Batch Filtering, and New Datasets for Bengali-English Machine Translation
Tahmid Hasan
Abhik Bhattacharjee
Kazi Samin Mubasshir
Masum Hasan
Madhusudan Basak
M. Rahman
Rifat Shahriyar
VLM
15
72
0
20 Sep 2020
A Multilingual Parallel Corpora Collection Effort for Indian Languages
Shashank Siripragada
Jerin Philip
Vinay P. Namboodiri
C. V. Jawahar
VLM
13
47
0
15 Jul 2020
SimAlign: High Quality Word Alignments without Parallel Training Data using Static and Contextualized Embeddings
Masoud Jalili Sabet
Philipp Dufter
François Yvon
Hinrich Schütze
4
224
0
18 Apr 2020
Translation Artifacts in Cross-lingual Transfer Learning
Mikel Artetxe
Gorka Labaka
Eneko Agirre
6
114
0
09 Apr 2020
JParaCrawl: A Large Scale Web-Based English-Japanese Parallel Corpus
Makoto Morishita
Jun Suzuki
Masaaki Nagata
LRM
30
64
0
25 Nov 2019
MaSS: A Large and Clean Multilingual Corpus of Sentence-aligned Spoken Utterances Extracted from the Bible
Marcely Zanon Boito
William N. Havard
Mahault Garnerin
Éric Le Ferrand
Laurent Besacier
20
46
0
30 Jul 2019