2305.05480
Effects of sub-word segmentation on performance of transformer language models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
9 May 2023
Jue Hou
Anisia Katinskaia
Anh Vu
R. Yangarber
Papers citing
"Effects of sub-word segmentation on performance of transformer language models"
5 / 5 papers shown
MoVoC: Morphology-Aware Subword Construction for Geez Script Languages
Hailay Teklehaymanot
Dren Fazlija
Wolfgang Nejdl
77
0
0
10 Sep 2025
Rethinking Tokenization for Rich Morphology: The Dominance of Unigram over BPE and Morphological Alignment
Saketh Reddy Vemula
Sandipan Dandapat
D. Sharma
Parameswari Krishnamurthy
203
0
0
11 Aug 2025
Canonical Autoregressive Generation
Ivi Chatzi
N. C. Benz
Stratis Tsirtsis
Manuel Gomez Rodriguez
119
1
0
6 Jun 2025
Unsupervised Morphological Tree Tokenizer
Qingyang Zhu
Xiang Hu
Pengyu Ji
Wei Wu
Kewei Tu
254
0
0
21 Jun 2024
MYTE: Morphology-Driven Byte Encoding for Better and Fairer Multilingual Language Modeling
Tomasz Limisiewicz
Terra Blevins
Hila Gonen
Orevaoghene Ahia
Luke Zettlemoyer
269
28
0
15 Mar 2024