2012.15562
UNKs Everywhere: Adapting Multilingual Language Models to New Scripts
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
31 December 2020
Jonas Pfeiffer
Ivan Vulić
Iryna Gurevych
Sebastian Ruder
Papers citing
"UNKs Everywhere: Adapting Multilingual Language Models to New Scripts"
50 / 102 papers shown
Happiness is Sharing a Vocabulary: A Study of Transliteration Methods
Haeji Jung
Jinju Kim
Kyungjin Kim
Youjeong Roh
David R. Mortensen
176
1
0
12 Oct 2025
Beyond WER: Probing Whisper's Sub-token Decoder Across Diverse Language Resource Levels
Siyu Liang
Nicolas Ballier
Gina-Anne Levow
Richard Wright
168
1
0
29 Sep 2025
One Tokenizer To Rule Them All: Emergent Language Plasticity via Multilingual Tokenizers
Diana Abagyan
Alejandro Salamanca
Andres Felipe Cruz-Salinas
Kris Cao
Hangyu Lin
Acyr Locatelli
Marzieh Fadaee
Ahmet Üstün
Sara Hooker
CLL
414
8
0
12 Jun 2025
Limited-Resource Adapters Are Regularizers, Not Linguists
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Marcell Richard Fekete
Nathaniel R. Robinson
Ernests Lavrinovics
E. Djeride Jean-Baptiste
Mary Dabre
Johannes Bjerva
Heather Lent
231
3
0
30 May 2025
Charting the Landscape of African NLP: Mapping Progress and Shaping the Road Ahead
Jesujoba Oluwadara Alabi
Michael A. Hedderich
David Ifeoluwa Adelani
Dietrich Klakow
557
11
0
27 May 2025
DeFTX: Denoised Sparse Fine-Tuning for Zero-Shot Cross-Lingual Transfer
Sona Elza Simon
Preethi Jyothi
VLM
363
1
0
21 May 2025
HYPEROFA: Expanding LLM Vocabulary to New Languages via Hypernetwork-Based Embedding Initialization
Enes Özeren
Yihong Liu
Hinrich Schütze
315
1
0
21 Apr 2025
Overcoming Vocabulary Constraints with Pixel-level Fallback
Jonas F. Lotz
Hendra Setiawan
Stephan Peitz
Yova Kementchedjhieva
370
4
0
02 Apr 2025
Cross-Lingual Transfer of Debiasing and Detoxification in Multilingual LLMs: An Extensive Investigation
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Vera Neplenbroek
Arianna Bisazza
Raquel Fernández
672
5
0
18 Dec 2024
Prompting with Phonemes: Enhancing LLMs' Multilinguality for Non-Latin Script Languages
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Hoang Nguyen
Khyati Mahajan
Vikas Yadav
Philip S. Yu
Masoud Hashemi
Rishabh Maheshwary
524
4
0
04 Nov 2024
The Zeno's Paradox of 'Low-Resource' Languages
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
H. Nigatu
A. Tonja
Benjamin Rosman
Thamar Solorio
Monojit Choudhury
915
22
0
28 Oct 2024
Goldfish: Monolingual Language Models for 350 Languages
Tyler A. Chang
Catherine Arnett
Zhuowen Tu
Benjamin Bergen
LRM
321
23
0
19 Aug 2024
Modular Sentence Encoders: Separating Language Specialization from Cross-Lingual Alignment
Yongxin Huang
Kexin Wang
Goran Glavaš
Iryna Gurevych
391
3
0
20 Jul 2024
MAGNET: Improving the Multilingual Fairness of Language Models with Adaptive Gradient-Based Tokenization
Orevaoghene Ahia
Sachin Kumar
Hila Gonen
Valentin Hofmann
Tomasz Limisiewicz
Yulia Tsvetkov
Noah A. Smith
390
24
0
11 Jul 2024
Script-Agnostic Language Identification
Milind Agarwal
Joshua Otten
Antonios Anastasopoulos
315
0
0
25 Jun 2024
UniBridge: A Unified Approach to Cross-Lingual Transfer Learning for Low-Resource Languages
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Trinh Pham
Khoi M. Le
Luu Anh Tuan
416
9
0
14 Jun 2024
MiLoRA: Harnessing Minor Singular Components for Parameter-Efficient LLM Finetuning
Hanqing Wang
Zeguan Xiao
Shuo Wang
Guanhua Chen
463
64
0
13 Jun 2024
Targeted Multilingual Adaptation for Low-resource Language Families
C.M. Downey
Terra Blevins
Dhwani Serai
Dwija Parikh
Shane Steinert-Threlkeld
342
11
0
20 May 2024
TransMI: A Framework to Create Strong Baselines from Multilingual Pretrained Language Models for Transliterated Data
International Conference on Computational Linguistics (COLING), 2024
Yihong Liu
Chunlan Ma
Haotian Ye
Hinrich Schütze
303
7
0
16 May 2024
Unknown Script: Impact of Script on Cross-Lingual Transfer
Wondimagegnhue Tufa
Ilia Markov
Piek Vossen
454
3
0
29 Apr 2024
TartuNLP @ SIGTYP 2024 Shared Task: Adapting XLM-RoBERTa for Ancient and Historical Languages
Aleksei Dorkin
Kairit Sirts
162
3
0
19 Apr 2024
ANGOFA: Leveraging OFA Embedding Initialization and Synthetic Data for Angolan Language Model
Osvaldo Luamba Quinjica
David Ifeoluwa Adelani
229
2
0
03 Apr 2024
Bailong: Bilingual Transfer Learning based on QLoRA and Zip-tie Embedding
Lung-Chuan Chen
Zong-Ru Li
ALM
319
1
0
01 Apr 2024
MYTE: Morphology-Driven Byte Encoding for Better and Fairer Multilingual Language Modeling
Tomasz Limisiewicz
Terra Blevins
Hila Gonen
Orevaoghene Ahia
Luke Zettlemoyer
394
32
0
15 Mar 2024
Teaching Large Language Models an Unseen Language on the Fly
Chen Zhang
Xiao Liu
Jiuheng Lin
Yansong Feng
339
34
0
29 Feb 2024
The Hidden Space of Transformer Language Adapters
Jesujoba Oluwadara Alabi
Marius Mosbach
Matan Eyal
Dietrich Klakow
Mor Geva
466
17
1
20 Feb 2024
Embracing Language Inclusivity and Diversity in CLIP through Continual Language Learning
Bang-ju Yang
Yong Dai
Xuxin Cheng
Yaowei Li
Asif Raza
Yuexian Zou
VLM
299
9
0
30 Jan 2024
Modular Adaptation of Multilingual Encoders to Written Swiss German Dialect
Jannis Vamvas
Noëmi Aepli
Rico Sennrich
330
2
0
25 Jan 2024
MaLA-500: Massive Language Adaptation of Large Language Models
Peiqin Lin
Shaoxiong Ji
Jörg Tiedemann
Marcely Zanon Boito
Hinrich Schütze
ELM
456
29
0
24 Jan 2024
LinguAlchemy: Fusing Typological and Geographical Elements for Unseen Language Generalization
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Muhammad Farid Adilazuarda
Samuel Cahyawijaya
Alham Fikri Aji
Genta Indra Winata
Ayu Purwarianti
631
8
0
11 Jan 2024
Parameter-Efficient Fine-Tuning Methods for Pretrained Language Models: A Critical Review and Assessment
Lingling Xu
Haoran Xie
S. J. Qin
Xiaohui Tao
F. Wang
388
314
0
19 Dec 2023
Adapters: A Unified Library for Parameter-Efficient and Modular Transfer Learning
Clifton A. Poth
Hannah Sterz
Indraneil Paul
Sukannya Purkayastha
Leon Arne Engländer
Timo Imhof
Ivan Vulić
Sebastian Ruder
Iryna Gurevych
Jonas Pfeiffer
264
85
0
18 Nov 2023
OFA: A Framework of Initializing Unseen Subword Embeddings for Efficient Large-scale Multilingual Continued Pretraining
Yihong Liu
Peiqin Lin
Mingyang Wang
Hinrich Schütze
304
38
0
15 Nov 2023
Extending Multilingual Machine Translation through Imitation Learning
Wen Lai
Viktor Hangya
Kangyang Luo
Alexander Fraser
LRM
CLL
595
5
0
14 Nov 2023
ZGUL: Zero-shot Generalization to Unseen Languages using Multi-source Ensembling of Language Adapters
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Vipul Rathore
Rajdeep Dhingra
Parag Singla
Mausam
241
11
0
25 Oct 2023
The Less the Merrier? Investigating Language Representation in Multilingual Models
H. Nigatu
A. Tonja
Jugal Kalita
283
7
0
20 Oct 2023
Automatic Anonymization of Swiss Federal Supreme Court Rulings
Joel Niklaus
Robin Mamié
Matthias Stürmer
Daniel Brunner
Marcel Gygli
330
4
0
07 Oct 2023
SIB-200: A Simple, Inclusive, and Big Evaluation Dataset for Topic Classification in 200+ Languages and Dialects
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023
David Ifeoluwa Adelani
Hannah Liu
Xiaoyu Shen
Nikita Vassilyev
Jesujoba Oluwadara Alabi
Yanke Mao
Haonan Gao
Annie En-Shiun Lee
ELM
482
145
0
14 Sep 2023
OYXOY: A Modern NLP Test Suite for Modern Greek
Findings (Findings), 2023
Konstantinos Kogkalidis
S. Chatzikyriakidis
Eirini Chrysovalantou Giannikouri
Vassiliki Katsouli
Christina Klironomou
...
Dimitris Papadakis
Thelka Pasparaki
Erofili Psaltaki
E. Sakellariou
Hara Soupiona
220
0
0
13 Sep 2023
Embedding structure matters: Comparing methods to adapt multilingual vocabularies to new languages
C.M. Downey
Terra Blevins
Nora Goldfine
Shane Steinert-Threlkeld
303
14
0
09 Sep 2023
The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Lucas Bandarkar
Davis Liang
Benjamin Muller
Mikel Artetxe
Satya Narayan Shukla
Don Husa
Naman Goyal
Abhinandan Krishnan
Luke Zettlemoyer
Madian Khabsa
459
275
0
31 Aug 2023
Efficient Domain Adaptation of Sentence Embeddings Using Adapters
Recent Advances in Natural Language Processing (RANLP), 2023
Tim Schopf
Dennis Schneider
Florian Matthes
642
9
0
06 Jul 2023
Improving Language Plasticity via Pretraining with Active Forgetting
Neural Information Processing Systems (NeurIPS), 2023
Yihong Chen
Kelly Marchisio
Roberta Raileanu
David Ifeoluwa Adelani
Pontus Stenetorp
Sebastian Riedel
Mikel Artetxe
KELM
AI4CE
CLL
489
41
0
03 Jul 2023
Cross-Lingual Transfer with Target Language-Ready Task Adapters
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Marinela Parović
Alan Ansell
Ivan Vulić
Anna Korhonen
223
14
0
05 Jun 2023
MultiLegalPile: A 689GB Multilingual Legal Corpus
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Joel Niklaus
Veton Matoshi
Matthias Stürmer
Ilias Chalkidis
Daniel E. Ho
AILaw
ELM
473
69
0
03 Jun 2023
Distilling Efficient Language-Specific Models for Cross-Lingual Transfer
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Alan Ansell
Edoardo Ponti
Anna Korhonen
Ivan Vulić
268
6
0
02 Jun 2023
Having Beer after Prayer? Measuring Cultural Bias in Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Tarek Naous
Michael Joseph Ryan
Alan Ritter
Wei Xu
667
158
0
23 May 2023
MasakhaPOS: Part-of-Speech Tagging for Typologically Diverse African Languages
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Cheikh M. Bamba Dione
David Ifeoluwa Adelani
Peter Nabende
Jesujoba Oluwadara Alabi
Thapelo Sindane
...
Seydou T. Traoré
C. Uchechukwu
Aliyu Yusuf
M. Abdullahi
Dietrich Klakow
312
22
0
23 May 2023
Translation and Fusion Improves Zero-shot Cross-lingual Information Extraction
Yang Chen
Vedaant Shah
Alan Ritter
422
14
0
23 May 2023
Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Ayyoob Imani
Peiqin Lin
Amir Hossein Kargaran
Silvia Severini
Masoud Jalili Sabet
...
Chunlan Ma
Helmut Schmid
Marcely Zanon Boito
François Yvon
Hinrich Schütze
ALM
LRM
397
146
0
20 May 2023