arXiv:2203.09435
Expanding Pretrained Models to Thousands More Languages via Lexicon-based Adaptation
17 March 2022
Xinyi Wang
Sebastian Ruder
Graham Neubig
Papers citing
"Expanding Pretrained Models to Thousands More Languages via Lexicon-based Adaptation"
14 / 14 papers shown
A Benchmark for Learning to Translate a New Language from One Grammar Book
Garrett Tanzer
Mirac Suzgun
Chenguang Xi
Dan Jurafsky
Luke Melas-Kyriazi
24
51
0
28 Sep 2023
Mitigating Data Imbalance and Representation Degeneration in Multilingual Machine Translation
Wen Lai
Alexandra Chronopoulou
Alexander M. Fraser
30
4
0
22 May 2023
Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages
Ayyoob Imani
Peiqin Lin
Amir Hossein Kargaran
Silvia Severini
Masoud Jalili Sabet
...
Chunlan Ma
Helmut Schmid
André F. T. Martins
François Yvon
Hinrich Schütze
ALM
LRM
29
95
0
20 May 2023
Bilex Rx: Lexical Data Augmentation for Massively Multilingual Machine Translation
Alex Jones
Isaac Caswell
Ishan Saxena
Orhan Firat
21
8
0
27 Mar 2023
Language Embeddings Sometimes Contain Typological Generalizations
Robert Östling
Murathan Kurfali
NAI
19
9
0
19 Jan 2023
Improving Bilingual Lexicon Induction with Cross-Encoder Reranking
Yaoyiran Li
Fangyu Liu
Ivan Vulić
Anna Korhonen
29
10
0
30 Oct 2022
Too Brittle To Touch: Comparing the Stability of Quantization and Distillation Towards Developing Lightweight Low-Resource MT Models
Harshita Diddee
Sandipan Dandapat
Monojit Choudhury
T. Ganu
Kalika Bali
27
5
0
27 Oct 2022
Data-Efficient Strategies for Expanding Hate Speech Detection into Under-Resourced Languages
Paul Röttger
Debora Nozza
Federico Bianchi
Dirk Hovy
18
10
0
20 Oct 2022
The first neural machine translation system for the Erzya language
David Dale
63
7
0
19 Sep 2022
Probing Cross-Lingual Lexical Knowledge from Multilingual Sentence Encoders
Ivan Vulić
Goran Glavaš
Fangyu Liu
Nigel Collier
E. Ponti
Anna Korhonen
17
8
0
30 Apr 2022
Systematic Inequalities in Language Technology Performance across the World's Languages
Damián E. Blasi
Antonios Anastasopoulos
Graham Neubig
111
131
0
13 Oct 2021
Named Entity Recognition and Classification on Historical Documents: A Survey
Maud Ehrmann
Ahmed Hamdi
Elvys Linhares Pontes
Matteo Romanello
A. Doucet
47
108
0
23 Sep 2021
When Being Unseen from mBERT is just the Beginning: Handling New Languages With Multilingual Language Models
Benjamin Muller
Antonios Anastasopoulos
Benoît Sagot
Djamé Seddah
LRM
119
165
0
24 Oct 2020
Stanza: A Python Natural Language Processing Toolkit for Many Human Languages
Peng Qi
Yuhao Zhang
Yuhui Zhang
Jason Bolton
Christopher D. Manning
AI4TS
199
1,653
0
16 Mar 2020