arXiv:2510.05183
Cited By
Aneurysm Growth Time Series Reconstruction Using Physics-informed Autoencoder
5 October 2025
Jiacheng Wu
AI4CE
Papers citing "Aneurysm Growth Time Series Reconstruction Using Physics-informed Autoencoder"
9 / 9 papers shown
IndicSuperTokenizer: An Optimized Tokenizer for Indic Multilingual LLMs
Souvik Rana
Arul Menezes
Ashish Kulkarni
Chandra Khatri
Shubham Agarwal
117
1
0
05 Nov 2025
Explaining and Mitigating Crosslingual Tokenizer Inequities
Catherine Arnett
T. Chang
Stella Biderman
Benjamin Bergen
163
1
0
24 Oct 2025
TokDrift: When LLM Speaks in Subwords but Code Speaks in Grammar
Yinxi Li
Yuntian Deng
Pengyu Nie
116
0
0
16 Oct 2025
Exploiting Vocabulary Frequency Imbalance in Language Model Pre-training
Woojin Chung
Jeonghoon Kim
201
1
0
21 Aug 2025
SupraTok: Cross-Boundary Tokenization for Enhanced Language Model Performance
Andrei-Valentin Tanase
Elena Pelican
136
1
0
16 Aug 2025
Self-Organizing Language
P. Myles Eugenio
Anthony Beavers
144
0
0
29 Jun 2025
Entropy-Driven Pre-Tokenization for Byte-Pair Encoding
Yifan Hu
Frank Liang
Dachuan Zhao
Jonathan Geuter
Varshini Reddy
Craig W. Schmidt
Chris Tanner
259
1
0
18 Jun 2025
BPE Stays on SCRIPT: Structured Encoding for Robust Multilingual Pretokenization
Sander Land
Catherine Arnett
184
5
0
30 May 2025
Tokenization is Sensitive to Language Variation
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Anna Wegmann
Dong Nguyen
David Jurgens
434
6
0
21 Feb 2025