2106.07540
Cited By
Evaluating Various Tokenizers for Arabic Text Classification
14 June 2021
Zaid Alyafeai
Maged S. Al-Shaibani
Mustafa Ghaleb
Irfan Ahmad
Papers citing "Evaluating Various Tokenizers for Arabic Text Classification"
9 / 9 papers shown
Title
Poem Meter Classification of Recited Arabic Poetry: Integrating High-Resource Systems for a Low-Resource Task
Maged S. Al-Shaibani
Zaid Alyafeai
Irfan Ahmad
36
0
0
16 Apr 2025
Gradient Routing: Masking Gradients to Localize Computation in Neural Networks
Alex Cloud
Jacob Goldman-Wetzler
Evžen Wybitul
Joseph Miller
Alexander Matt Turner
28
4
0
06 Oct 2024
Can Perplexity Predict Fine-Tuning Performance? An Investigation of Tokenization Effects on Sequential Language Models for Nepali
Nishant Luitel
Nirajan Bekoju
Anand Kumar Sah
Subarna Shakya
50
0
0
28 Apr 2024
Exploring Tokenization Strategies and Vocabulary Sizes for Enhanced Arabic Language Models
M. Alrefaie
Nour Eldin Morsy
Nada Samir
23
6
0
17 Mar 2024
MorphPiece : A Linguistic Tokenizer for Large Language Models
Jeffrey Hsu
15
3
0
14 Jul 2023
Between words and characters: A Brief History of Open-Vocabulary Modeling and Tokenization in NLP
Sabrina J. Mielke
Zaid Alyafeai
Elizabeth Salesky
Colin Raffel
Manan Dey
...
Arun Raja
Chenglei Si
Wilson Y. Lee
Benoît Sagot
Samson Tan
23
140
0
20 Dec 2021
ARBERT & MARBERT: Deep Bidirectional Transformers for Arabic
Muhammad Abdul-Mageed
AbdelRahim Elmadany
El Moatez Billah Nagoudi
VLM
60
447
0
27 Dec 2020
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Z. Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
716
6,743
0
26 Sep 2016
Efficient Estimation of Word Representations in Vector Space
Tomáš Mikolov
Kai Chen
G. Corrado
J. Dean
3DV
228
31,253
0
16 Jan 2013