Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2305.15020
Cited By
An Efficient Multilingual Language Model Compression through Vocabulary Trimming
24 May 2023
Asahi Ushio
Yi Zhou
Jose Camacho-Collados
Re-assign community
ArXiv
PDF
HTML
Papers citing
"An Efficient Multilingual Language Model Compression through Vocabulary Trimming"
14 / 14 papers shown
Title
A comparison of data filtering techniques for English-Polish LLM-based machine translation in the biomedical domain
Jorge del Pozo Lérida
Kamil Kojs
János Máté
Mikołaj Antoni Barański
Christian Hardmeier
40
0
0
27 Jan 2025
DEPT: Decoupled Embeddings for Pre-training Language Models
Alex Iacob
Lorenzo Sani
Meghdad Kurmanji
William F. Shen
Xinchi Qiu
Dongqi Cai
Yan Gao
Nicholas D. Lane
VLM
43
0
0
07 Oct 2024
BPE Gets Picky: Efficient Vocabulary Refinement During Tokenizer Training
Pavel Chizhov
Catherine Arnett
Elizaveta Korotkova
Ivan P. Yamshchikov
32
2
0
06 Sep 2024
An Empirical Study on Cross-lingual Vocabulary Adaptation for Efficient Language Model Inference
Atsuki Yamaguchi
Aline Villavicencio
Nikolaos Aletras
6
6
0
16 Feb 2024
Mitigating Language-Dependent Ethnic Bias in BERT
Jaimeen Ahn
Alice H. Oh
118
90
0
13 Sep 2021
Larger-Scale Transformers for Multilingual Masked Language Modeling
Naman Goyal
Jingfei Du
Myle Ott
Giridhar Anantharaman
Alexis Conneau
88
98
0
02 May 2021
XLM-T: Multilingual Language Models in Twitter for Sentiment Analysis and Beyond
Francesco Barbieri
Luis Espinosa Anke
Jose Camacho-Collados
76
211
0
25 Apr 2021
Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP
Timo Schick
Sahana Udupa
Hinrich Schütze
254
374
0
28 Feb 2021
Debiasing Pre-trained Contextualised Embeddings
Masahiro Kaneko
Danushka Bollegala
205
121
0
23 Jan 2021
When Being Unseen from mBERT is just the Beginning: Handling New Languages With Multilingual Language Models
Benjamin Muller
Antonis Anastasopoulos
Benoît Sagot
Djamé Seddah
LRM
106
150
0
24 Oct 2020
Improving Multilingual Models with Language-Clustered Vocabularies
Hyung Won Chung
Dan Garrette
Kiat Chuan Tan
Jason Riesa
VLM
58
56
0
24 Oct 2020
SberQuAD -- Russian Reading Comprehension Dataset: Description and Analysis
Pavel Efimov
Andrey Chertok
Leonid Boytsov
Pavel Braslavski
52
56
0
20 Dec 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,927
0
20 Apr 2018
Word Translation Without Parallel Data
Alexis Conneau
Guillaume Lample
MarcÁurelio Ranzato
Ludovic Denoyer
Hervé Jégou
158
1,630
0
11 Oct 2017
1