ResearchTrend.AI
CANINE: Pre-training an Efficient Tokenization-Free Encoder for Language Representation

11 March 2021
J. Clark
Dan Garrette
Iulia Turc
John Wieting

Papers citing "CANINE: Pre-training an Efficient Tokenization-Free Encoder for Language Representation"

50 / 143 papers shown
Token-free Models for Sarcasm Detection
Sumit Mamtani
Maitreya Sonawane
Kanika Agarwal
Nishanth Sanjeev
36
0
0
02 May 2025
LogicLearner: A Tool for the Guided Practice of Propositional Logic Proofs
Amogh Inamdar
U. Macar
Michel Vazirani
Michael Tarnow
Zarina Mustapha
Natalia Dittren
Sam Sadeh
Nakul Verma
Ansaf Salleb-Aouissi
LRM
35
0
0
25 Mar 2025
KL3M Tokenizers: A Family of Domain-Specific and Character-Level Tokenizers for Legal, Financial, and Preprocessing Applications
M. Bommarito
Daniel Martin Katz
Jillian Bommarito
34
1
0
21 Mar 2025
SuperBPE: Space Travel for Language Models
Alisa Liu
J. Hayase
Valentin Hofmann
Sewoong Oh
Noah A. Smith
Yejin Choi
43
1
0
17 Mar 2025
Cross-Lingual IPA Contrastive Learning for Zero-Shot NER
Jimin Sohn
David R. Mortensen
47
0
0
10 Mar 2025
Optimal word order for non-causal text generation with Large Language Models: the Spanish case
Andrea Busto-Castiñeira
Silvia García-Méndez
Francisco de Arriba-Pérez
Francisco J. González Castaño
36
0
0
21 Feb 2025
MorphBPE: A Morpho-Aware Tokenizer Bridging Linguistic Complexity for Efficient LLM Training Across Morphologies
Ehsaneddin Asgari
Yassine El Kheir
Mohammad Ali Sadraei Javaheri
53
0
0
02 Feb 2025
BinarySelect to Improve Accessibility of Black-Box Attack Research
Shatarupa Ghosh
Jonathan Rusert
AAML
72
0
0
13 Dec 2024
ASL STEM Wiki: Dataset and Benchmark for Interpreting STEM Articles
Kayo Yin
Chinmay Singh
Fyodor O. Minakov
Vanessa Milan
Hal Daumé III
Cyril Zhang
Alex X. Lu
Danielle Bragg
20
2
0
08 Nov 2024
MoCE: Adaptive Mixture of Contextualization Experts for Byte-based Neural Machine Translation
Langlin Huang
Mengyu Bu
Yang Feng
21
0
0
03 Nov 2024
Morphological Typology in BPE Subword Productivity and Language Modeling
Iñigo Parra
29
0
0
31 Oct 2024
From Babble to Words: Pre-Training Language Models on Continuous Streams of Phonemes
Zébulon Goriely
Richard Diehl Martinez
Andrew Caines
Lisa Beinborn
P. Buttery
CLL
42
5
0
30 Oct 2024
MrT5: Dynamic Token Merging for Efficient Byte-level Language Models
Julie Kallini
Shikhar Murty
Christopher D. Manning
Christopher Potts
Róbert Csordás
30
2
0
28 Oct 2024
LLM The Genius Paradox: A Linguistic and Math Expert's Struggle with Simple Word-based Counting Problems
Nan Xu
Xuezhe Ma
LRM
36
3
0
18 Oct 2024
Pixology: Probing the Linguistic and Visual Capabilities of Pixel-based Language Models
Kushal Tatariya
Vladimir Araujo
Thomas Bauwens
Miryam de Lhoneux
VLM
18
0
0
15 Oct 2024
Tokenization and Morphology in Multilingual Language Models: A Comparative Analysis of mT5 and ByT5
Thao Anh Dang
Limor Raviv
Lukas Galke
23
1
0
15 Oct 2024
Gradient Routing: Masking Gradients to Localize Computation in Neural Networks
Alex Cloud
Jacob Goldman-Wetzler
Evžen Wybitul
Joseph Miller
Alexander Matt Turner
21
2
0
06 Oct 2024
Examining Language Modeling Assumptions Using an Annotated Literary Dialect Corpus
Craig Messner
Tom Lippincott
16
1
0
03 Oct 2024
Egalitarian Language Representation in Language Models: It All Begins with Tokenizers
Menan Velayuthan
Kengatharaiyer Sarveswaran
24
5
0
17 Sep 2024
DiffusionPen: Towards Controlling the Style of Handwritten Text Generation
Konstantina Nikolaidou
George Retsinas
Giorgos Sfikas
Marcus Liwicki
DiffM
33
3
0
09 Sep 2024
Predictability and Causality in Spanish and English Natural Language Generation
Andrea Busto-Castiñeira
Francisco J. González Castaño
Silvia García-Méndez
Francisco de Arriba-Pérez
CML
46
1
0
26 Aug 2024
LogogramNLP: Comparing Visual and Textual Representations of Ancient Logographic Writing Systems for NLP
Danlu Chen
Freda Shi
Aditi Agarwal
Jacobo Myerston
Taylor Berg-Kirkpatrick
29
2
0
08 Aug 2024
Semantics or spelling? Probing contextual word embeddings with orthographic noise
Jacob A. Matthews
John R. Starr
Marten van Schijndel
27
2
0
08 Aug 2024
Beyond Next Token Prediction: Patch-Level Training for Large Language Models
Chenze Shao
Fandong Meng
Jie Zhou
41
1
0
17 Jul 2024
MAGNET: Improving the Multilingual Fairness of Language Models with Adaptive Gradient-Based Tokenization
Orevaoghene Ahia
Sachin Kumar
Hila Gonen
Valentin Hofmann
Tomasz Limisiewicz
Yulia Tsvetkov
Noah A. Smith
38
4
0
11 Jul 2024
CharED: Character-wise Ensemble Decoding for Large Language Models
Kevin Gu
Eva Tuecke
Dmitriy Katz
R. Horesh
David Alvarez-Melis
Mikhail Yurochkin
23
2
0
25 Jun 2024
Segment Any Text: A Universal Approach for Robust, Efficient and Adaptable Sentence Segmentation
Markus Frohmann
Igor Sterner
Ivan Vulić
Benjamin Minixhofer
Markus Schedl
VLM
41
13
0
24 Jun 2024
Zero-Shot Cross-Lingual NER Using Phonemic Representations for Low-Resource Languages
Jimin Sohn
Haeji Jung
Alex Cheng
Jooeon Kang
Yilin Du
David R. Mortensen
16
0
0
23 Jun 2024
Tokenization Falling Short: The Curse of Tokenization
Yekun Chai
Yewei Fang
Qiwei Peng
Xuhong Li
26
0
0
17 Jun 2024
Heidelberg-Boston @ SIGTYP 2024 Shared Task: Enhancing Low-Resource Language Analysis With Character-Aware Hierarchical Transformers
Frederick Riemenschneider
Kevin Krahn
22
2
0
30 May 2024
SoK: Leveraging Transformers for Malware Analysis
Pradip Kunwar
Kshitiz Aryal
Maanak Gupta
Mahmoud Abdelsalam
Elisa Bertino
90
0
0
27 May 2024
Zero-Shot Tokenizer Transfer
Benjamin Minixhofer
E. Ponti
Ivan Vulić
VLM
44
9
0
13 May 2024
SpaceByte: Towards Deleting Tokenization from Large Language Modeling
Kevin Slagle
27
3
0
22 Apr 2024
EuSQuAD: Automatically Translated and Aligned SQuAD2.0 for Basque
Aitor García-Pablos
Naiara Pérez
Montse Cuadros
Jaione Bengoetxea
16
0
0
18 Apr 2024
Nostra Domina at EvaLatin 2024: Improving Latin Polarity Detection through Data Augmentation
Stephen Lawrence Bothwell
Abigail Swenor
David Chiang
22
1
0
11 Apr 2024
We're Calling an Intervention: Exploring Fundamental Hurdles in Adapting Language Models to Nonstandard Text
Aarohi Srivastava
David Chiang
54
0
0
10 Apr 2024
On the Effect of (Near) Duplicate Subwords in Language Modelling
Anton Schäfer
Thomas Hofmann
Imanol Schlag
Tiago Pimentel
34
1
0
09 Apr 2024
Training LLMs over Neurally Compressed Text
Brian Lester
Jaehoon Lee
A. Alemi
Jeffrey Pennington
Adam Roberts
Jascha Narain Sohl-Dickstein
Noah Constant
32
6
0
04 Apr 2024
An Analysis of BPE Vocabulary Trimming in Neural Machine Translation
Marco Cognetta
Tatsuya Hiraoka
Naoaki Okazaki
Rico Sennrich
Yuval Pinter
29
2
0
30 Mar 2024
MYTE: Morphology-Driven Byte Encoding for Better and Fairer Multilingual Language Modeling
Tomasz Limisiewicz
Terra Blevins
Hila Gonen
Orevaoghene Ahia
Luke Zettlemoyer
30
12
0
15 Mar 2024
Unpacking Tokenization: Evaluating Text Compression and its Correlation with Model Performance
Omer Goldman
Avi Caciularu
Matan Eyal
Kris Cao
Idan Szpektor
Reut Tsarfaty
43
22
0
10 Mar 2024
Evaluating the Elementary Multilingual Capabilities of Large Language Models with MultiQ
Carolin Holtermann
Paul Röttger
Timm Dill
Anne Lauscher
ELM
LRM
29
22
0
06 Mar 2024
Efficiently Leveraging Linguistic Priors for Scene Text Spotting
Nguyen Nguyen
Yapeng Tian
Chenliang Xu
45
1
0
27 Feb 2024
Mitigating the Linguistic Gap with Phonemic Representations for Robust Multilingual Language Understanding
Haeji Jung
Changdae Oh
Jooeon Kang
Jimin Sohn
Kyungwoo Song
Jinkyu Kim
David R. Mortensen
22
0
0
22 Feb 2024
Knowledge of Pretrained Language Models on Surface Information of Tokens
Tatsuya Hiraoka
Naoaki Okazaki
19
1
0
15 Feb 2024
Pixel Sentence Representation Learning
Chenghao Xiao
Zhuoxu Huang
Danlu Chen
G. Hudson
Yizhi Li
Haoran Duan
Chenghua Lin
Jie Fu
Jungong Han
Noura Al Moubayed
SSL
4
2
0
13 Feb 2024
Modular Adaptation of Multilingual Encoders to Written Swiss German Dialect
Jannis Vamvas
Noëmi Aepli
Rico Sennrich
27
0
0
25 Jan 2024
MambaByte: Token-free Selective State Space Model
Junxiong Wang
Tushaar Gangavarapu
Jing Nathan Yan
Alexander M. Rush
Mamba
25
34
0
24 Jan 2024
Anisotropy Is Inherent to Self-Attention in Transformers
Nathan Godey
Eric Villemonte de la Clergerie
Benoît Sagot
13
16
0
22 Jan 2024
Phishing Website Detection through Multi-Model Analysis of HTML Content
Furkan Çolhak
Mert İlhan Ecevit
Bilal Emir Uçar
Reiner Creutzburg
Hasan Dag
16
7
0
09 Jan 2024