arXiv: 2105.13626
ByT5: Towards a token-free future with pre-trained byte-to-byte models
28 May 2021
Linting Xue
Aditya Barua
Noah Constant
Rami Al-Rfou
Sharan Narang
Mihir Kale
Adam Roberts
Colin Raffel
Papers citing
"ByT5: Towards a token-free future with pre-trained byte-to-byte models"
45 / 95 papers shown
Title
Inducing Character-level Structure in Subword-based Language Models with Type-level Interchange Intervention Training
Jing-ling Huang
Zhengxuan Wu
Kyle Mahowald
Christopher Potts
19
13
0
19 Dec 2022
DAMP: Doubly Aligned Multilingual Parser for Task-Oriented Dialogue
William B. Held
Christopher Hidey
Fei Liu
Eric Zhu
Rahul Goel
Diyi Yang
Rushin Shah
21
0
0
15 Dec 2022
Advancing Multilingual Pre-training: TRIP Triangular Document-level Pre-training for Multilingual Language Models
Hongyuan Lu
Haoyang Huang
Shuming Ma
Dongdong Zhang
W. Lam
Furu Wei
22
4
0
15 Dec 2022
Efficient Transformers with Dynamic Token Pooling
Piotr Nawrot
J. Chorowski
Adrian Łańcucki
E. Ponti
8
42
0
17 Nov 2022
A Benchmark and Dataset for Post-OCR text correction in Sanskrit
Ayush Maheshwari
Nikhil Singh
Amrith Krishna
Ganesh Ramakrishnan
26
12
0
15 Nov 2022
Local Structure Matters Most in Most Languages
Louis Clouâtre
Prasanna Parthasarathi
Amal Zouaq
Sarath Chandar
26
1
0
09 Nov 2022
Bridging Speech and Textual Pre-trained Models with Unsupervised ASR
Jiatong Shi
Chan-Jan Hsu
Ho-Lam Chung
Dongji Gao
Leibny Paola García-Perera
Shinji Watanabe
Ann Lee
Hung-yi Lee
29
12
0
06 Nov 2022
T5lephone: Bridging Speech and Text Self-supervised Models for Spoken Language Understanding via Phoneme level T5
Chan-Jan Hsu
Ho-Lam Chung
Hung-yi Lee
Yu Tsao
19
6
0
01 Nov 2022
Graphemic Normalization of the Perso-Arabic Script
R. Doctor
Alexander Gutkin
Cibu Johny
Brian Roark
R. Sproat
36
4
0
21 Oct 2022
Scaling Instruction-Finetuned Language Models
Hyung Won Chung
Le Hou
Shayne Longpre
Barret Zoph
Yi Tay
...
Jacob Devlin
Adam Roberts
Denny Zhou
Quoc V. Le
Jason W. Wei
ReLM
LRM
56
2,986
0
20 Oct 2022
Incorporating Context into Subword Vocabularies
Shaked Yehezkel
Yuval Pinter
39
8
0
13 Oct 2022
One does not fit all! On the Complementarity of Vision Encoders for Vision and Language Tasks
Gregor Geigle
Chen Cecilia Liu
Jonas Pfeiffer
Iryna Gurevych
VLM
26
1
0
12 Oct 2022
Non-Axiomatic Term Logic: A Computational Theory of Cognitive Symbolic Reasoning
Kotaro Funakoshi
NAI
19
1
0
12 Oct 2022
MonoByte: A Pool of Monolingual Byte-level Language Models
Hugo Queiroz Abonizio
Leandro Rodrigues de Souza
R. Lotufo
Rodrigo Nogueira
23
1
0
22 Sep 2022
Layer or Representation Space: What makes BERT-based Evaluation Metrics Robust?
Doan Nam Long Vu
N. Moosavi
Steffen Eger
14
9
0
06 Sep 2022
CLOWER: A Pre-trained Language Model with Contrastive Learning over Word and Character Representations
Borun Chen
Hongyin Tang
Jiahao Bu
Kai Zhang
Jingang Wang
Qifan Wang
Haitao Zheng
Wei Yu Wu
Liqian Yu
VLM
20
1
0
23 Aug 2022
Language Modelling with Pixels
Phillip Rust
Jonas F. Lotz
Emanuele Bugliarello
Elizabeth Salesky
Miryam de Lhoneux
Desmond Elliott
VLM
30
46
0
14 Jul 2022
Lifting the Curse of Multilinguality by Pre-training Modular Transformers
Jonas Pfeiffer
Naman Goyal
Xi Victoria Lin
Xian Li
James Cross
Sebastian Riedel
Mikel Artetxe
LRM
40
138
0
12 May 2022
UL2: Unifying Language Learning Paradigms
Yi Tay
Mostafa Dehghani
Vinh Q. Tran
Xavier Garcia
Jason W. Wei
...
Tal Schuster
H. Zheng
Denny Zhou
N. Houlsby
Donald Metzler
AI4CE
57
294
0
10 May 2022
How Robust is Neural Machine Translation to Language Imbalance in Multilingual Tokenizer Training?
Shiyue Zhang
Vishrav Chaudhary
Naman Goyal
James Cross
Guillaume Wenzek
Mohit Bansal
Francisco Guzman
31
16
0
29 Apr 2022
Impact of Tokenization on Language Models: An Analysis for Turkish
Cagri Toraman
E. Yilmaz
Furkan Şahinuç
Oguzhan Ozcelik
30
74
0
19 Apr 2022
A Hierarchical N-Gram Framework for Zero-Shot Link Prediction
Mingchen Li
J. Chen
Samuel Mensah
Nikolaos Aletras
Xiulong Yang
Yang Ye
10
13
0
16 Apr 2022
GPT-NeoX-20B: An Open-Source Autoregressive Language Model
Sid Black
Stella Biderman
Eric Hallahan
Quentin G. Anthony
Leo Gao
...
Shivanshu Purohit
Laria Reynolds
J. Tow
Benqi Wang
Samuel Weinbach
61
800
0
14 Apr 2022
ByT5 model for massively multilingual grapheme-to-phoneme conversion
Jian Zhu
Cong Zhang
David Jurgens
11
36
0
06 Apr 2022
One Country, 700+ Languages: NLP Challenges for Underrepresented Languages and Dialects in Indonesia
Alham Fikri Aji
Genta Indra Winata
Fajri Koto
Samuel Cahyawijaya
Ade Romadhony
...
David Moeljadi
Radityo Eko Prasojo
Timothy Baldwin
Jey Han Lau
Sebastian Ruder
38
98
0
24 Mar 2022
IT5: Text-to-text Pretraining for Italian Language Understanding and Generation
Gabriele Sarti
Malvina Nissim
AILaw
8
42
0
07 Mar 2022
A New Generation of Perspective API: Efficient Multilingual Character-level Transformers
Alyssa Lees
Vinh Q. Tran
Yi Tay
Jeffrey Scott Sorensen
Jai Gupta
Donald Metzler
Lucy Vasserman
25
173
0
22 Feb 2022
Correcting diacritics and typos with a ByT5 transformer model
Lukas Stankevicius
M. Lukoševičius
J. Kapočiūtė-Dzikienė
Monika Briediene
Tomas Krilavičius
11
20
0
31 Jan 2022
Between words and characters: A Brief History of Open-Vocabulary Modeling and Tokenization in NLP
Sabrina J. Mielke
Zaid Alyafeai
Elizabeth Salesky
Colin Raffel
Manan Dey
...
Arun Raja
Chenglei Si
Wilson Y. Lee
Benoît Sagot
Samson Tan
23
140
0
20 Dec 2021
ÚFAL at MultiLexNorm 2021: Improving Multilingual Lexical Normalization by Fine-tuning ByT5
David Samuel
Milan Straka
10
15
0
28 Oct 2021
Deciphering the Language of Nature: A transformer-based language model for deleterious mutations in proteins
Theodore Jiang
Li Fang
Kai Wang
MedIm
25
17
0
27 Oct 2021
The Efficiency Misnomer
Mostafa Dehghani
Anurag Arnab
Lucas Beyer
Ashish Vaswani
Yi Tay
32
98
0
25 Oct 2021
Why don't people use character-level machine translation?
Jindrich Libovický
Helmut Schmid
Alexander M. Fraser
63
28
0
15 Oct 2021
Few-shot Controllable Style Transfer for Low-Resource Multilingual Settings
Kalpesh Krishna
Deepak Nathani
Xavier Garcia
Bidisha Samanta
Partha P. Talukdar
32
24
0
14 Oct 2021
BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese
Nguyen Luong Tran
Duong Minh Le
Dat Quoc Nguyen
19
51
0
20 Sep 2021
Single-Read Reconstruction for DNA Data Storage Using Transformers
Yotam Nahum
Eyar Ben-Tolila
Leon Anavy
66
5
0
12 Sep 2021
Revisiting the Primacy of English in Zero-shot Cross-lingual Transfer
Iulia Turc
Kenton Lee
Jacob Eisenstein
Ming-Wei Chang
Kristina Toutanova
24
58
0
30 Jun 2021
Evaluating Various Tokenizers for Arabic Text Classification
Zaid Alyafeai
Maged S. Al-Shaibani
Mustafa Ghaleb
Irfan Ahmad
15
41
0
14 Jun 2021
Which transformer architecture fits my data? A vocabulary bottleneck in self-attention
Noam Wies
Yoav Levine
Daniel Jannai
Amnon Shashua
40
20
0
09 May 2021
The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics
Sebastian Gehrmann
Tosin P. Adewumi
Karmanya Aggarwal
Pawan Sasanka Ammanamanchi
Aremu Anuoluwapo
...
Nishant Subramani
Wei-ping Xu
Diyi Yang
Akhila Yerukola
Jiawei Zhou
VLM
246
285
0
02 Feb 2021
CharacterBERT: Reconciling ELMo and BERT for Word-Level Open-Vocabulary Representations From Characters
Hicham El Boukkouri
Olivier Ferret
Thomas Lavergne
Hiroshi Noji
Pierre Zweigenbaum
Junichi Tsujii
66
156
0
20 Oct 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
226
4,460
0
23 Jan 2020
MLQA: Evaluating Cross-lingual Extractive Question Answering
Patrick Lewis
Barlas Oğuz
Ruty Rinott
Sebastian Riedel
Holger Schwenk
ELM
244
491
0
16 Oct 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,950
0
20 Apr 2018
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Z. Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
716
6,743
0
26 Sep 2016