2109.02550
You should evaluate your language model on marginal likelihood over tokenisations
6 September 2021
Kris Cao
Laura Rimell
Papers citing "You should evaluate your language model on marginal likelihood over tokenisations"
5 / 5 papers shown
Title
Syntactic and Semantic Control of Large Language Models via Sequential Monte Carlo
João Loula
Benjamin LeBrun
Li Du
Ben Lipkin
Clemente Pasti
...
Ryan Cotterell
Vikash K. Mansinghka
Alexander K. Lew
Tim Vieira
Timothy J. O'Donnell
32
1
0
17 Apr 2025
What is the best recipe for character-level encoder-only modelling?
Kris Cao
32
2
0
09 May 2023
Language Modelling with Pixels
Phillip Rust
Jonas F. Lotz
Emanuele Bugliarello
Elizabeth Salesky
Miryam de Lhoneux
Desmond Elliott
VLM
32
46
0
14 Jul 2022
Between words and characters: A Brief History of Open-Vocabulary Modeling and Tokenization in NLP
Sabrina J. Mielke
Zaid Alyafeai
Elizabeth Salesky
Colin Raffel
Manan Dey
...
Arun Raja
Chenglei Si
Wilson Y. Lee
Benoît Sagot
Samson Tan
30
140
0
20 Dec 2021
Morphology Matters: A Multilingual Language Modeling Analysis
Hyunji Hayley Park
Katherine J. Zhang
Coleman Haley
K. Steimel
Han Liu
Lane Schwartz
42
47
0
11 Dec 2020