2403.10691
Cited By
MYTE: Morphology-Driven Byte Encoding for Better and Fairer Multilingual Language Modeling
15 March 2024
Tomasz Limisiewicz
Terra Blevins
Hila Gonen
Orevaoghene Ahia
Luke Zettlemoyer
Papers citing
"MYTE: Morphology-Driven Byte Encoding for Better and Fairer Multilingual Language Modeling"
7 / 7 papers shown
Title
MoCE: Adaptive Mixture of Contextualization Experts for Byte-based Neural Machine Translation
Langlin Huang
Mengyu Bu
Yang Feng
21
0
0
03 Nov 2024
MrT5: Dynamic Token Merging for Efficient Byte-level Language Models
Julie Kallini
Shikhar Murty
Christopher D. Manning
Christopher Potts
Róbert Csordás
30
2
0
28 Oct 2024
Exact Byte-Level Probabilities from Tokenized Language Models for FIM-Tasks and Model Ensembles
Buu Phan
Brandon Amos
Itai Gat
Marton Havasi
Matthew Muckley
Karen Ullrich
45
1
0
11 Oct 2024
Natural Language Processing RELIES on Linguistics
Juri Opitz
Shira Wein
Nathan Schneider
AI4CE
44
7
0
09 May 2024
Morphology Matters: A Multilingual Language Modeling Analysis
Hyunji Hayley Park
Katherine J. Zhang
Coleman Haley
K. Steimel
Han Liu
Lane Schwartz
39
47
0
11 Dec 2020
Improving Multilingual Models with Language-Clustered Vocabularies
Hyung Won Chung
Dan Garrette
Kiat Chuan Tan
Jason Riesa
VLM
58
65
0
24 Oct 2020
Word Translation Without Parallel Data
Alexis Conneau
Guillaume Lample
Marc'Aurelio Ranzato
Ludovic Denoyer
Hervé Jégou
165
1,630
0
11 Oct 2017