ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2105.13626
  4. Cited By
ByT5: Towards a token-free future with pre-trained byte-to-byte models

ByT5: Towards a token-free future with pre-trained byte-to-byte models

28 May 2021
Linting Xue
Aditya Barua
Noah Constant
Rami Al-Rfou
Sharan Narang
Mihir Kale
Adam Roberts
Colin Raffel
ArXivPDFHTML

Papers citing "ByT5: Towards a token-free future with pre-trained byte-to-byte models"

50 / 87 papers shown
Title
Differentiating Emigration from Return Migration of Scholars Using Name-Based Nationality Detection Models
Differentiating Emigration from Return Migration of Scholars Using Name-Based Nationality Detection Models
Faeze Ghorbanpour
Thiago Zordan Malaguth
Aliakbar Akbaritabar
24
0
0
09 May 2025
Crosslingual Reasoning through Test-Time Scaling
Crosslingual Reasoning through Test-Time Scaling
Zheng-Xin Yong
Muhammad Farid Adilazuarda
Jonibek Mansurov
Ruochen Zhang
Niklas Muennighoff
Carsten Eickhoff
Genta Indra Winata
Julia Kreutzer
Stephen H. Bach
Alham Fikri Aji
LRM
ELM
116
0
0
08 May 2025
Token-free Models for Sarcasm Detection
Token-free Models for Sarcasm Detection
Sumit Mamtani
Maitreya Sonawane
Kanika Agarwal
Nishanth Sanjeev
36
0
0
02 May 2025
RepText: Rendering Visual Text via Replicating
RepText: Rendering Visual Text via Replicating
H. Wang
Y. Xu
Y. Li
J. Li
Chaowei Zhang
J. Wang
Kejia Yang
Z. Chen
VLM
66
0
0
28 Apr 2025
Cross-Tokenizer Distillation via Approximate Likelihood Matching
Cross-Tokenizer Distillation via Approximate Likelihood Matching
Benjamin Minixhofer
Ivan Vulić
E. Ponti
119
0
0
25 Mar 2025
SuperBPE: Space Travel for Language Models
SuperBPE: Space Travel for Language Models
Alisa Liu
J. Hayase
Valentin Hofmann
Sewoong Oh
Noah A. Smith
Yejin Choi
43
2
0
17 Mar 2025
Graph-Aware Isomorphic Attention for Adaptive Dynamics in Transformers
Graph-Aware Isomorphic Attention for Adaptive Dynamics in Transformers
Markus J. Buehler
AI4CE
35
1
0
04 Jan 2025
MoCE: Adaptive Mixture of Contextualization Experts for Byte-based Neural Machine Translation
MoCE: Adaptive Mixture of Contextualization Experts for Byte-based Neural Machine Translation
Langlin Huang
Mengyu Bu
Yang Feng
21
0
0
03 Nov 2024
MrT5: Dynamic Token Merging for Efficient Byte-level Language Models
MrT5: Dynamic Token Merging for Efficient Byte-level Language Models
Julie Kallini
Shikhar Murty
Christopher D. Manning
Christopher Potts
Róbert Csordás
30
2
0
28 Oct 2024
LLM The Genius Paradox: A Linguistic and Math Expert's Struggle with Simple Word-based Counting Problems
LLM The Genius Paradox: A Linguistic and Math Expert's Struggle with Simple Word-based Counting Problems
Nan Xu
Xuezhe Ma
LRM
36
3
0
18 Oct 2024
Pixology: Probing the Linguistic and Visual Capabilities of Pixel-based
  Language Models
Pixology: Probing the Linguistic and Visual Capabilities of Pixel-based Language Models
Kushal Tatariya
Vladimir Araujo
Thomas Bauwens
Miryam de Lhoneux
VLM
29
0
0
15 Oct 2024
LeanAgent: Lifelong Learning for Formal Theorem Proving
LeanAgent: Lifelong Learning for Formal Theorem Proving
Adarsh Kumarappan
Mo Tiwari
Peiyang Song
Robert Joseph George
Chaowei Xiao
Anima Anandkumar
CLL
LLMAG
LRM
70
8
0
08 Oct 2024
Cogs in a Machine, Doing What They're Meant to Do -- The AMI Submission
  to the WMT24 General Translation Task
Cogs in a Machine, Doing What They're Meant to Do -- The AMI Submission to the WMT24 General Translation Task
Atli Jasonarson
Hinrik Hafsteinsson
Bjarki Ármannsson
Steinþór Steingrímsson
SyDa
32
2
0
04 Oct 2024
Can LLMs Really Learn to Translate a Low-Resource Language from One Grammar Book?
Can LLMs Really Learn to Translate a Low-Resource Language from One Grammar Book?
Seth Aycock
David Stap
Di Wu
Christof Monz
Khalil Simaán
29
2
0
27 Sep 2024
Imagine yourself: Tuning-Free Personalized Image Generation
Imagine yourself: Tuning-Free Personalized Image Generation
Zecheng He
Bo Sun
Felix Juefei-Xu
Haoyu Ma
Ankit Ramchandani
...
Ning Zhang
Peizhao Zhang
Roshan Sumbaly
Peter Vajda
Animesh Sinha
DiffM
24
16
0
20 Sep 2024
Sample-Efficient Diffusion for Text-To-Speech Synthesis
Sample-Efficient Diffusion for Text-To-Speech Synthesis
Justin Lovelace
Soham Ray
Kwangyoun Kim
Kilian Q. Weinberger
Felix Wu
32
2
0
01 Sep 2024
Advancing Post-OCR Correction: A Comparative Study of Synthetic Data
Advancing Post-OCR Correction: A Comparative Study of Synthetic Data
Shuhao Guan
Derek Greene
26
6
0
05 Aug 2024
GlyphDraw2: Automatic Generation of Complex Glyph Posters with Diffusion Models and Large Language Models
GlyphDraw2: Automatic Generation of Complex Glyph Posters with Diffusion Models and Large Language Models
Jian Ma
Yonglin Deng
Chen Chen
H. Lu
Zhenyu Yang
Zhenyu Yang
VLM
DiffM
82
6
0
02 Jul 2024
Wavelet-Based Image Tokenizer for Vision Transformers
Wavelet-Based Image Tokenizer for Vision Transformers
Zhenhai Zhu
Radu Soricut
ViT
37
3
0
28 May 2024
Identifying and Aligning Medical Claims Made on Social Media with
  Medical Evidence
Identifying and Aligning Medical Claims Made on Social Media with Medical Evidence
Anthony James Hughes
Xingyi Song
18
1
0
18 May 2024
Neural Semantic Parsing with Extremely Rich Symbolic Meaning
  Representations
Neural Semantic Parsing with Extremely Rich Symbolic Meaning Representations
Xiao Zhang
Gosse Bouma
Johan Bos
NAI
33
0
0
19 Apr 2024
Gaining More Insight into Neural Semantic Parsing with Challenging
  Benchmarks
Gaining More Insight into Neural Semantic Parsing with Challenging Benchmarks
Xiao Zhang
Chunliu Wang
Rik van Noord
Johan Bos
21
3
0
12 Apr 2024
An Analysis of BPE Vocabulary Trimming in Neural Machine Translation
An Analysis of BPE Vocabulary Trimming in Neural Machine Translation
Marco Cognetta
Tatsuya Hiraoka
Naoaki Okazaki
Rico Sennrich
Yuval Pinter
29
2
0
30 Mar 2024
Advancing Generative AI for Portuguese with Open Decoder Gervásio PT*
Advancing Generative AI for Portuguese with Open Decoder Gervásio PT*
Rodrigo Santos
Joao Silva
Luís Gomes
João Rodrigues
António Branco
44
10
0
29 Feb 2024
On the Challenges and Opportunities in Generative AI
On the Challenges and Opportunities in Generative AI
Laura Manduchi
Kushagra Pandey
Robert Bamler
Ryan Cotterell
Sina Daubener
...
F. Wenzel
Frank Wood
Stephan Mandt
Vincent Fortuin
Vincent Fortuin
56
17
0
28 Feb 2024
MELA: Multilingual Evaluation of Linguistic Acceptability
MELA: Multilingual Evaluation of Linguistic Acceptability
Ziyin Zhang
Yikang Liu
Wei Huang
Junyu Mao
Rui Wang
Hai Hu
22
3
0
15 Nov 2023
Leveraging LLMs for Synthesizing Training Data Across Many Languages in
  Multilingual Dense Retrieval
Leveraging LLMs for Synthesizing Training Data Across Many Languages in Multilingual Dense Retrieval
Nandan Thakur
Jianmo Ni
Gustavo Hernández Ábrego
John Wieting
Jimmy J. Lin
Daniel Matthew Cer
RALM
29
12
0
10 Nov 2023
Generating Pragmatic Examples to Train Neural Program Synthesizers
Generating Pragmatic Examples to Train Neural Program Synthesizers
Saujas Vaduguru
Daniel Fried
Yewen Pu
NAI
13
5
0
09 Nov 2023
Analyzing Cognitive Plausibility of Subword Tokenization
Analyzing Cognitive Plausibility of Subword Tokenization
Lisa Beinborn
Yuval Pinter
27
17
0
20 Oct 2023
Text-to-OverpassQL: A Natural Language Interface for Complex Geodata
  Querying of OpenStreetMap
Text-to-OverpassQL: A Natural Language Interface for Complex Geodata Querying of OpenStreetMap
Michael Staniek
Raphael Schumann
Maike Zufle
Stefan Riezler
30
6
0
30 Aug 2023
SDXL: Improving Latent Diffusion Models for High-Resolution Image
  Synthesis
SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis
Dustin Podell
Zion English
Kyle Lacey
A. Blattmann
Tim Dockhorn
Jonas Muller
Joe Penna
Robin Rombach
60
2,121
0
04 Jul 2023
Byte-Level Grammatical Error Correction Using Synthetic and Curated
  Corpora
Byte-Level Grammatical Error Correction Using Synthetic and Curated Corpora
Svanhvít Lilja Ingólfsdóttir
Pétur Orri Ragnarsson
H. Jónsson
Haukur Barri Símonarson
Vilhjálmur Þorsteinsson
Vésteinn Snæbjarnarson
SyDa
30
9
0
29 May 2023
TranSFormer: Slow-Fast Transformer for Machine Translation
TranSFormer: Slow-Fast Transformer for Machine Translation
Bei Li
Yi Jing
Xu Tan
Zhen Xing
Tong Xiao
Jingbo Zhu
41
7
0
26 May 2023
Sāmayik: A Benchmark and Dataset for English-Sanskrit Translation
Sāmayik: A Benchmark and Dataset for English-Sanskrit Translation
Ayush Maheshwari
Ashim Gupta
Amrith Krishna
Atul Kumar Singh
Ganesh Ramakrishnan
G. Anil Kumar
Jitin Singla
22
0
0
23 May 2023
mPLM-Sim: Better Cross-Lingual Similarity and Transfer in Multilingual
  Pretrained Language Models
mPLM-Sim: Better Cross-Lingual Similarity and Transfer in Multilingual Pretrained Language Models
Peiqin Lin
Chengzhi Hu
Zheyu Zhang
André F. T. Martins
Hinrich Schütze
27
1
0
23 May 2023
Language Model Tokenizers Introduce Unfairness Between Languages
Language Model Tokenizers Introduce Unfairness Between Languages
Aleksandar Petrov
Emanuele La Malfa
Philip H. S. Torr
Adel Bibi
16
96
0
17 May 2023
What is the best recipe for character-level encoder-only modelling?
What is the best recipe for character-level encoder-only modelling?
Kris Cao
32
2
0
09 May 2023
Investigating Lexical Sharing in Multilingual Machine Translation for
  Indian Languages
Investigating Lexical Sharing in Multilingual Machine Translation for Indian Languages
Sonal Sannigrahi
Rachel Bawden
29
0
0
04 May 2023
Does Manipulating Tokenization Aid Cross-Lingual Transfer? A Study on
  POS Tagging for Non-Standardized Languages
Does Manipulating Tokenization Aid Cross-Lingual Transfer? A Study on POS Tagging for Non-Standardized Languages
Verena Blaschke
Hinrich Schütze
Barbara Plank
32
14
0
20 Apr 2023
An Information Extraction Study: Take In Mind the Tokenization!
An Information Extraction Study: Take In Mind the Tokenization!
Christos Theodoropoulos
Marie-Francine Moens
19
6
0
27 Mar 2023
Fine-Tashkeel: Finetuning Byte-Level Models for Accurate Arabic Text
  Diacritization
Fine-Tashkeel: Finetuning Byte-Level Models for Accurate Arabic Text Diacritization
Bashar Al-Rfooh
Gheith A. Abandah
Rami Al-Rfou
24
4
0
25 Mar 2023
An Overview on Language Models: Recent Developments and Outlook
An Overview on Language Models: Recent Developments and Outlook
Chengwei Wei
Yun Cheng Wang
Bin Wang
C.-C. Jay Kuo
17
41
0
10 Mar 2023
RETVec: Resilient and Efficient Text Vectorizer
RETVec: Resilient and Efficient Text Vectorizer
Elie Bursztein
Marina Zhang
Owen Vallis
Xinyu Jia
Alexey Kurakin
VLM
24
4
0
18 Feb 2023
Distillation of encoder-decoder transformers for sequence labelling
Distillation of encoder-decoder transformers for sequence labelling
M. Farina
D. Pappadopulo
Anant Gupta
Leslie Huang
Ozan Irsoy
Thamar Solorio
VLM
82
3
0
10 Feb 2023
Truveta Mapper: A Zero-shot Ontology Alignment Framework
Truveta Mapper: A Zero-shot Ontology Alignment Framework
Mariyam Amir
Murchana Baruah
Mahsa Eslamialishah
Sina Ehsani
Alireza Bahramali
Sadra Naddaf-sh
Saman Zarandioon
25
7
0
24 Jan 2023
ByGPT5: End-to-End Style-conditioned Poetry Generation with Token-free
  Language Models
ByGPT5: End-to-End Style-conditioned Poetry Generation with Token-free Language Models
Jonas Belouadi
Steffen Eger
44
24
0
20 Dec 2022
Inducing Character-level Structure in Subword-based Language Models with
  Type-level Interchange Intervention Training
Inducing Character-level Structure in Subword-based Language Models with Type-level Interchange Intervention Training
Jing-ling Huang
Zhengxuan Wu
Kyle Mahowald
Christopher Potts
19
13
0
19 Dec 2022
DAMP: Doubly Aligned Multilingual Parser for Task-Oriented Dialogue
DAMP: Doubly Aligned Multilingual Parser for Task-Oriented Dialogue
William B. Held
Christopher Hidey
Fei Liu
Eric Zhu
Rahul Goel
Diyi Yang
Rushin Shah
13
0
0
15 Dec 2022
Advancing Multilingual Pre-training: TRIP Triangular Document-level
  Pre-training for Multilingual Language Models
Advancing Multilingual Pre-training: TRIP Triangular Document-level Pre-training for Multilingual Language Models
Hongyuan Lu
Haoyang Huang
Shuming Ma
Dongdong Zhang
W. Lam
Furu Wei
22
4
0
15 Dec 2022
Efficient Transformers with Dynamic Token Pooling
Efficient Transformers with Dynamic Token Pooling
Piotr Nawrot
J. Chorowski
Adrian Lañcucki
E. Ponti
6
42
0
17 Nov 2022
12
Next