On Multilingual Encoder Language Model Compression for Low-Resource Languages
Daniil Gurgurov, Michal Gregor, Josef van Genabith, Simon Ostermann
arXiv:2505.16956 · 22 May 2025
Papers citing "On Multilingual Encoder Language Model Compression for Low-Resource Languages" (25 of 25 papers shown)

Extracting General-use Transformers for Low-resource Languages via Knowledge Distillation
Jan Christian Blaise Cruz, Alham Fikri Aji
22 Jan 2025 · 2 citations

GlotCC: An Open Broad-Coverage CommonCrawl Corpus and Pipeline for Minority Languages
Amir Hossein Kargaran, François Yvon, Hinrich Schütze
31 Oct 2024 · 7 citations

The Privileged Students: On the Value of Initialization in Multilingual Knowledge Distillation
Haryo Akbarianto Wibowo, Thamar Solorio, Alham Fikri Aji
24 Jun 2024 · 3 citations

SIB-200: A Simple, Inclusive, and Big Evaluation Dataset for Topic Classification in 200+ Languages and Dialects
David Ifeoluwa Adelani, Hannah Liu, Xiaoyu Shen, Nikita Vassilyev, Jesujoba Oluwadara Alabi, Yanke Mao, Haonan Gao, Annie En-Shiun Lee
14 Sep 2023 · 72 citations

Distilling Efficient Language-Specific Models for Cross-Lingual Transfer
Alan Ansell, Edoardo Ponti, Anna Korhonen, Ivan Vulić
02 Jun 2023 · 6 citations

An Efficient Multilingual Language Model Compression through Vocabulary Trimming
Asahi Ushio, Yi Zhou, Jose Camacho-Collados
24 May 2023 · 8 citations

BUFFET: Benchmarking Large Language Models for Few-shot Cross-lingual Transfer
Akari Asai, Sneha Kudugunta, Xinyan Velocity Yu, Terra Blevins, Hila Gonen, Machel Reid, Yulia Tsvetkov, Sebastian Ruder, Hannaneh Hajishirzi
24 May 2023 · 59 citations

Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages
Ayyoob Imani, Peiqin Lin, Amir Hossein Kargaran, Silvia Severini, Masoud Jalili Sabet, ..., Chunlan Ma, Helmut Schmid, André F. T. Martins, François Yvon, Hinrich Schütze
20 May 2023 · 104 citations

SemEval-2023 Task 12: Sentiment Analysis for African Languages (AfriSenti-SemEval)
Shamsuddeen Hassan Muhammad, Idris Abdulmumin, Seid Muhie Yimam, David Ifeoluwa Adelani, Ibrahim Said Ahmad, N. Ousidhoum, Abinew Ali Ayele, Saif M. Mohammad, Meriem Beloucif, Sebastian Ruder
13 Apr 2023 · 69 citations

AfriSenti: A Twitter Sentiment Analysis Benchmark for African Languages
Shamsuddeen Hassan Muhammad, Idris Abdulmumin, Abinew Ali Ayele, N. Ousidhoum, David Ifeoluwa Adelani, ..., Hailu Beshada Balcha, S. Chala, Hagos Tesfahun Gebremichael, Bernard Opoku, Steven Arthur
17 Feb 2023 · 87 citations

No Language Left Behind: Scaling Human-Centered Machine Translation
NLLB Team, Marta R. Costa-jussà, James Cross, Onur Çelebi, Maha Elbayad, ..., Alexandre Mourachko, C. Ropers, Safiyyah Saleem, Holger Schwenk, Jeff Wang
11 Jul 2022 · 1,220 citations

Which Student is Best? A Comprehensive Knowledge Distillation Exam for Task-Specific BERT Models
Made Nindyatama Nityasya, Haryo Akbarianto Wibowo, Rendi Chevi, Radityo Eko Prasojo, Alham Fikri Aji
03 Jan 2022 · 6 citations

Comparing Kullback-Leibler Divergence and Mean Squared Error Loss in Knowledge Distillation
Taehyeon Kim, Jaehoon Oh, Nakyil Kim, Sangwook Cho, Se-Young Yun
19 May 2021 · 232 citations

FastFormers: Highly Efficient Transformer Models for Natural Language Understanding
Young Jin Kim, Hany Awadalla
26 Oct 2020 · 44 citations

Load What You Need: Smaller Versions of Multilingual BERT
Amine Abdaoui, Camille Pradel, Grégoire Sigel
12 Oct 2020 · 74 citations

MAD-X: An Adapter-Based Framework for Multi-Task Cross-Lingual Transfer
Jonas Pfeiffer, Ivan Vulić, Iryna Gurevych, Sebastian Ruder
30 Apr 2020 · 618 citations

The State and Fate of Linguistic Diversity and Inclusion in the NLP World
Pratik M. Joshi, Sebastin Santy, A. Budhiraja, Kalika Bali, Monojit Choudhury
20 Apr 2020 · 822 citations

DynaBERT: Dynamic BERT with Adaptive Width and Depth
Lu Hou, Zhiqi Huang, Lifeng Shang, Xin Jiang, Xiao Chen, Qun Liu
08 Apr 2020 · 322 citations

XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization
Junjie Hu, Sebastian Ruder, Aditya Siddhant, Graham Neubig, Orhan Firat, Melvin Johnson
24 Mar 2020 · 966 citations

Unsupervised Cross-lingual Representation Learning at Scale
Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer, Veselin Stoyanov
05 Nov 2019 · 6,496 citations

CCNet: Extracting High Quality Monolingual Datasets from Web Crawl Data
Guillaume Wenzek, Marie-Anne Lachaux, Alexis Conneau, Vishrav Chaudhary, Francisco Guzmán, Armand Joulin, Edouard Grave
01 Nov 2019 · 649 citations

DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf
02 Oct 2019 · 7,437 citations

Patient Knowledge Distillation for BERT Model Compression
S. Sun, Yu Cheng, Zhe Gan, Jingjing Liu
25 Aug 2019 · 833 citations

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova
11 Oct 2018 · 93,936 citations

Distilling the Knowledge in a Neural Network
Geoffrey E. Hinton, Oriol Vinyals, J. Dean
09 Mar 2015 · 19,523 citations