2310.11628
Cited By
Learn Your Tokens: Word-Pooled Tokenization for Language Modeling
17 October 2023
Avijit Thawani
Saurabh Ghanekar
Xiaoyuan Zhu
Jay Pujara
Papers citing "Learn Your Tokens: Word-Pooled Tokenization for Language Modeling"
3 / 3 papers shown
Title
CharacterBERT: Reconciling ELMo and BERT for Word-Level Open-Vocabulary Representations From Characters
Hicham El Boukkouri
Olivier Ferret
Thomas Lavergne
Hiroshi Noji
Pierre Zweigenbaum
Junichi Tsujii
66
155
0
20 Oct 2020
SberQuAD -- Russian Reading Comprehension Dataset: Description and Analysis
Pavel Efimov
Andrey Chertok
Leonid Boytsov
Pavel Braslavski
58
59
0
20 Dec 2019
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Z. Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
716
6,724
0
26 Sep 2016