1512.00103
Multilingual Language Processing From Bytes
1 December 2015
D. Gillick
Clifford Brunk
Oriol Vinyals
A. Subramanya
Papers citing "Multilingual Language Processing From Bytes"
50 / 95 papers shown
MorphBPE: A Morpho-Aware Tokenizer Bridging Linguistic Complexity for Efficient LLM Training Across Morphologies
Ehsaneddin Asgari
Yassine El Kheir
Mohammad Ali Sadraei Javaheri
58
0
0
02 Feb 2025
MrT5: Dynamic Token Merging for Efficient Byte-level Language Models
Julie Kallini
Shikhar Murty
Christopher D. Manning
Christopher Potts
Róbert Csordás
37
2
0
28 Oct 2024
Label Alignment and Reassignment with Generalist Large Language Model for Enhanced Cross-Domain Named Entity Recognition
Ke Bao
Chonghuan Yang
27
0
0
24 Jul 2024
Optimizing Byte-level Representation for End-to-end ASR
Roger Hsiao
Liuhui Deng
Erik McDermott
R. Travadi
Xiaodan Zhuang
21
0
0
14 Jun 2024
Multilingual Pixel Representations for Translation and Effective Cross-lingual Transfer
Elizabeth Salesky
Neha Verma
Philipp Koehn
Matt Post
19
14
0
23 May 2023
Language-universal phonetic encoder for low-resource speech recognition
Siyuan Feng
Ming Tu
Rui Xia
Chuanzeng Huang
Yuxuan Wang
31
2
0
19 May 2023
What is the best recipe for character-level encoder-only modelling?
Kris Cao
32
2
0
09 May 2023
TOE: A Grid-Tagging Discontinuous NER Model Enhanced by Embedding Tag/Word Relations and More Fine-Grained Tags
Jiang-Dong Liu
Donghong Ji
Jingye Li
Dongdong Xie
Chong Teng
Liang Zhao
Fei Li
19
15
0
01 Nov 2022
A multi-level interpretable sleep stage scoring system by infusing experts' knowledge into a deep network architecture
H. Niknazar
S. Mednick
16
4
0
11 Jul 2022
Bilingual End-to-End ASR with Byte-Level Subwords
Liuhui Deng
Roger Hsiao
Arnab Ghoshal
13
4
0
01 May 2022
One Country, 700+ Languages: NLP Challenges for Underrepresented Languages and Dialects in Indonesia
Alham Fikri Aji
Genta Indra Winata
Fajri Koto
Samuel Cahyawijaya
Ade Romadhony
...
David Moeljadi
Radityo Eko Prasojo
Timothy Baldwin
Jey Han Lau
Sebastian Ruder
40
98
0
24 Mar 2022
Unified Named Entity Recognition as Word-Word Relation Classification
Jingye Li
Hao Fei
Jiang-Dong Liu
Shengqiong Wu
Meishan Zhang
Chong Teng
Donghong Ji
Fei Li
23
241
0
19 Dec 2021
ML Based Lineage in Databases
Michael Leybovich
O. Shmueli
AI4TS
22
2
0
13 Sep 2021
Active Learning for Massively Parallel Translation of Constrained Text into Low Resource Languages
Zhong Zhou
A. Waibel
12
5
0
16 Aug 2021
byteSteady: Fast Classification Using Byte-Level n-Gram Embeddings
Xiang Zhang
Alexandre Drouin
Raymond Li
14
1
0
24 Jun 2021
Dutch Named Entity Recognition and De-identification Methods for the Human Resource Domain
C. V. Toledo
F. V. Dijk
Marco Spruit
11
3
0
04 Jun 2021
A Unified Generative Framework for Various NER Subtasks
Hang Yan
Tao Gui
Junqi Dai
Qipeng Guo
Zheng-Wei Zhang
Xipeng Qiu
6
288
0
02 Jun 2021
ByT5: Towards a token-free future with pre-trained byte-to-byte models
Linting Xue
Aditya Barua
Noah Constant
Rami Al-Rfou
Sharan Narang
Mihir Kale
Adam Roberts
Colin Raffel
27
464
0
28 May 2021
Towards A Multi-agent System for Online Hate Speech Detection
Gaurav Sahu
R. Cohen
Olga Vechtomova
11
9
0
03 May 2021
Family of Origin and Family of Choice: Massively Parallel Lexiconized Iterative Pretraining for Severely Low Resource Machine Translation
Zhong Zhou
Alexander Waibel
14
4
0
12 Apr 2021
CANINE: Pre-training an Efficient Tokenization-Free Encoder for Language Representation
J. Clark
Dan Garrette
Iulia Turc
John Wieting
27
210
0
11 Mar 2021
Recent Trends in Named Entity Recognition (NER)
Aryan Roy
23
37
0
25 Jan 2021
Training Multilingual Pre-trained Language Model with Byte-level Subwords
Junqiu Wei
Qun Liu
Yinpeng Guo
Xin Jiang
25
19
0
23 Jan 2021
Global Attention for Name Tagging
Boliang Zhang
Spencer Whitehead
Lifu Huang
Heng Ji
45
17
0
19 Oct 2020
Knowledge Efficient Deep Learning for Natural Language Processing
Hai Wang
12
2
0
28 Aug 2020
Composer Style Classification of Piano Sheet Music Images Using Language Model Pretraining
T. Tsai
Kevin Ji
VLM
16
17
0
29 Jul 2020
Sources of Transfer in Multilingual Named Entity Recognition
David Mueller
Nicholas Andrews
Mark Dredze
20
20
0
02 May 2020
Bootstrapping NLU Models with Multi-task Learning
Shubham Kapoor
C. Tirkaz
9
3
0
15 Nov 2019
Using Interlinear Glosses as Pivot in Low-Resource Multilingual Machine Translation
Zhong Zhou
Lori S. Levin
David R. Mortensen
A. Waibel
21
10
0
07 Nov 2019
Hierarchical Contextualized Representation for Named Entity Recognition
Ying Luo
Fengshun Xiao
Zhao Hai
27
129
0
06 Nov 2019
A Survey on Recent Advances in Named Entity Recognition from Deep Learning models
Vikas Yadav
Steven Bethard
3DV
11
588
0
25 Oct 2019
Improving Pre-Trained Multilingual Models with Vocabulary Expansion
Hai Wang
Dian Yu
Kai Sun
Jianshu Chen
Dong Yu
28
40
0
26 Sep 2019
Neural Correction Model for Open-Domain Named Entity Recognition
Mengdi Zhu
Zheye Deng
Wenhan Xiong
Mo Yu
Ming Zhang
William Yang Wang
21
6
0
13 Sep 2019
Neural Machine Translation with Byte-Level Subwords
Changhan Wang
Kyunghyun Cho
Jiatao Gu
18
172
0
07 Sep 2019
A Morpho-Syntactically Informed LSTM-CRF Model for Named Entity Recognition
L. Simeonova
K. Simov
P. Osenova
Preslav Nakov
12
8
0
27 Aug 2019
Neural Architectures for Nested NER through Linearization
Jana Straková
Milan Straka
Jan Hajic
8
246
0
19 Aug 2019
Massively Multilingual Neural Machine Translation in the Wild: Findings and Challenges
N. Arivazhagan
Ankur Bapna
Orhan Firat
Dmitry Lepikhin
Melvin Johnson
...
George F. Foster
Colin Cherry
Wolfgang Macherey
Z. Chen
Yonghui Wu
23
422
0
11 Jul 2019
Tabula nearly rasa: Probing the Linguistic Knowledge of Character-Level Neural Language Models Trained on Unsegmented Text
Michael Hahn
Marco Baroni
LMTD
22
15
0
17 Jun 2019
Converse Attention Knowledge Transfer for Low-Resource Named Entity Recognition
Shengfei Lyu
Linghao Sun
Huixiong Yi
Yong-jin Liu
Huanhuan Chen
Chun Miao
16
0
0
04 Jun 2019
Sentiment Tagging with Partial Labels using Modular Architectures
Xiao Zhang
Dan Goldwasser
12
9
0
03 Jun 2019
Effective Context and Fragment Feature Usage for Named Entity Recognition
Nargiza Nosirova
Mingbin Xu
Hui Jiang
14
0
0
05 Apr 2019
Measuring scheduling efficiency of RNNs for NLP applications
Urmish Thakker
Ganesh S. Dasika
Jesse G. Beu
Matthew Mattina
6
13
0
05 Apr 2019
A Multi-task Learning Approach for Named Entity Recognition using Local Detection
Nargiza Nosirova
Mingbin Xu
Hui Jiang
16
2
0
05 Apr 2019
COMIC: Towards A Compact Image Captioning Model with Attention
J. Tan
Chee Seng Chan
Joon Huang Chuah
VLM
20
40
0
04 Mar 2019
Bytes are All You Need: End-to-End Multilingual Speech Recognition and Synthesis with Bytes
Bo-wen Li
Yu Zhang
Tara N. Sainath
Yonghui Wu
William Chan
AuLLM
11
129
0
22 Nov 2018
Neural CRF transducers for sequence labeling
Kai-Mo Hu
Zhijian Ou
Min Hu
Junlan Feng
14
5
0
04 Nov 2018
Chargrid: Towards Understanding 2D Documents
Anoop R. Katti
C. Reisswig
Cordula Guder
Sebastian Brarda
S. Bickel
Johannes Höhne
Jean Baptiste Faddoul
20
191
0
24 Sep 2018
A Byte-sized Approach to Named Entity Recognition
Emily Sheng
Premkumar Natarajan
17
0
0
22 Sep 2018
Emo2Vec: Learning Generalized Emotion Representation by Multi-task Training
Peng-Tao Xu
Andrea Madotto
Chien-Sheng Wu
Ji Ho Park
Pascale Fung
16
68
0
12 Sep 2018
Paraphrases as Foreign Languages in Multilingual Neural Machine Translation
Zhong Zhou
Matthias Sperber
A. Waibel
LRM
17
19
0
25 Aug 2018