Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2108.01887
Cited By
PARADISE: Exploiting Parallel Data for Multilingual Sequence-to-Sequence Pretraining
4 August 2021
Machel Reid
Mikel Artetxe
VLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"PARADISE: Exploiting Parallel Data for Multilingual Sequence-to-Sequence Pretraining"
22 / 22 papers shown
Title
Beyond Vanilla Fine-Tuning: Leveraging Multistage, Multilingual, and Domain-Specific Methods for Low-Resource Machine Translation
Sarubi Thillainathan
Songchen Yuan
E. Lee
Sanath Jayasena
Surangika Ranathunga
35
0
0
28 Mar 2025
EuroBERT: Scaling Multilingual Encoders for European Languages
Nicolas Boizard
Hippolyte Gisserot-Boukhlef
Duarte M. Alves
André F. T. Martins
Ayoub Hammal
...
Maxime Peyrard
Nuno M. Guerreiro
Patrick Fernandes
Ricardo Rei
Pierre Colombo
82
1
0
07 Mar 2025
How Transliterations Improve Crosslingual Alignment
Yihong Liu
Mingyang Wang
Amir Hossein Kargaran
Ayyoob Imani
Orgest Xhelili
Haotian Ye
Chunlan Ma
François Yvon
Hinrich Schütze
31
2
0
25 Sep 2024
A Recipe of Parallel Corpora Exploitation for Multilingual Large Language Models
Peiqin Lin
André F. T. Martins
Hinrich Schütze
51
2
0
29 Jun 2024
Smart Bilingual Focused Crawling of Parallel Documents
Cristian García-Romero
Miquel Espla-Gomis
Felipe Sánchez-Martínez
19
0
0
23 May 2024
The Role of Language Imbalance in Cross-lingual Generalisation: Insights from Cloned Language Experiments
Anton Schäfer
Shauli Ravfogel
Thomas Hofmann
Tiago Pimentel
Imanol Schlag
55
3
0
11 Apr 2024
LEIA: Facilitating Cross-lingual Knowledge Transfer in Language Models with Entity-based Data Augmentation
Ikuya Yamada
Ryokan Ri
KELM
15
0
0
18 Feb 2024
Breaking the Curse of Multilinguality with Cross-lingual Expert Language Models
Terra Blevins
Tomasz Limisiewicz
Suchin Gururangan
Margaret Li
Hila Gonen
Noah A. Smith
Luke Zettlemoyer
42
22
0
19 Jan 2024
Code-Switching with Word Senses for Pretraining in Neural Machine Translation
Vivek Iyer
Edoardo Barba
Alexandra Birch
Jeff Z. Pan
Roberto Navigli
18
3
0
21 Oct 2023
Leveraging Auxiliary Domain Parallel Data in Intermediate Task Fine-tuning for Low-resource Translation
Shravan Nayak
Surangika Ranathunga
Sarubi Thillainathan
Rikki Hung
Anthony Rinaldi
Yining Wang
Jonah Mackey
Andrew Ho
E. Lee
15
3
0
02 Jun 2023
Cross-Lingual Supervision improves Large Language Models Pre-training
Andrea Schioppa
Xavier Garcia
Orhan Firat
LRM
10
12
0
19 May 2023
Towards More Robust NLP System Evaluation: Handling Missing Scores in Benchmarks
Anas Himmi
Ekhine Irurozki
Nathan Noiry
Stéphan Clémençon
Pierre Colombo
19
5
0
17 May 2023
The Interpreter Understands Your Meaning: End-to-end Spoken Language Understanding Aided by Speech Translation
Mutian He
Philip N. Garner
30
4
0
16 May 2023
Bilex Rx: Lexical Data Augmentation for Massively Multilingual Machine Translation
Alex Jones
Isaac Caswell
Ishan Saxena
Orhan Firat
21
8
0
27 Mar 2023
On the Role of Parallel Data in Cross-lingual Transfer Learning
Machel Reid
Mikel Artetxe
21
10
0
20 Dec 2022
Advancing Multilingual Pre-training: TRIP Triangular Document-level Pre-training for Multilingual Language Models
Hongyuan Lu
Haoyang Huang
Shuming Ma
Dongdong Zhang
W. Lam
Furu Wei
19
4
0
15 Dec 2022
Revamping Multilingual Agreement Bidirectionally via Switched Back-translation for Multilingual Neural Machine Translation
Hongyuan Lu
Haoyang Huang
Shuming Ma
Dongdong Zhang
Furu Wei
Wai Lam
19
0
0
28 Sep 2022
Expanding Pretrained Models to Thousands More Languages via Lexicon-based Adaptation
Xinyi Wang
Sebastian Ruder
Graham Neubig
13
60
0
17 Mar 2022
AfroMT: Pretraining Strategies and Reproducible Benchmarks for Translation of 8 African Languages
Machel Reid
Junjie Hu
Graham Neubig
Y. Matsuo
51
31
0
10 Sep 2021
MT6: Multilingual Pretrained Text-to-Text Transformer with Translation Pairs
Zewen Chi
Li Dong
Shuming Ma
Shaohan Huang Xian-Ling Mao
Heyan Huang
Furu Wei
LRM
45
71
0
18 Apr 2021
DICT-MLM: Improved Multilingual Pre-Training using Bilingual Dictionaries
Aditi Chaudhary
K. Raman
Krishna Srinivasan
Jiecao Chen
27
24
0
23 Oct 2020
Word Translation Without Parallel Data
Alexis Conneau
Guillaume Lample
MarcÁurelio Ranzato
Ludovic Denoyer
Hervé Jégou
165
1,634
0
11 Oct 2017
1