1808.06226
Cited By
SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing
19 August 2018
Taku Kudo
John Richardson
Github (10925★)
Papers citing
"SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing"
50 / 2,064 papers shown
AutoNMT: A Framework to Streamline the Research of Seq2Seq Models
Salvador Carrión
F. Casacuberta
80
3
0
09 Feb 2023
Measuring The Impact Of Programming Language Distribution
International Conference on Machine Learning (ICML), 2023
Gabriel Orlanski
Kefan Xiao
Xavier Garcia
Jeffrey Hui
Joshua Howland
J. Malmaud
Jacob Austin
Rishabh Singh
Michele Catasta
449
46
0
03 Feb 2023
The unreasonable effectiveness of few-shot learning for machine translation
International Conference on Machine Learning (ICML), 2023
Xavier Garcia
Yamini Bansal
Colin Cherry
George F. Foster
M. Krikun
Fan Feng
Melvin Johnson
Orhan Firat
320
125
0
02 Feb 2023
KNNs of Semantic Encodings for Rating Prediction
International Conference on Communications in Computing (ICCC), 2023
Leo Laugier
Raghuram Vadapalli
Thomas Bonald
Lucas Dixon
90
2
0
01 Feb 2023
Adaptive Machine Translation with Large Language Models
European Association for Machine Translation Conferences/Workshops (EAMT), 2023
Yasmin Moslem
Rejwanul Haque
John D. Kelleher
Andy Way
AI4CE
289
109
0
30 Jan 2023
Adaptive Computation with Elastic Input Sequence
International Conference on Machine Learning (ICML), 2023
Fuzhao Xue
Valerii Likhosherstov
Anurag Arnab
N. Houlsby
Mostafa Dehghani
Yang You
244
27
0
30 Jan 2023
Pre-training for Speech Translation: CTC Meets Optimal Transport
International Conference on Machine Learning (ICML), 2023
Hang Le
Hongyu Gong
Changhan Wang
J. Pino
Benjamin Lecouteux
D. Schwab
OT
379
30
0
27 Jan 2023
Open Problems in Applied Deep Learning
M. Raissi
AI4CE
233
3
0
26 Jan 2023
XLM-V: Overcoming the Vocabulary Bottleneck in Multilingual Masked Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Davis Liang
Hila Gonen
Yuning Mao
Rui Hou
Naman Goyal
Marjan Ghazvininejad
Luke Zettlemoyer
Madian Khabsa
268
101
0
25 Jan 2023
Ensemble Transfer Learning for Multilingual Coreference Resolution
T. Lai
Heng Ji
160
3
0
22 Jan 2023
REDAffectiveLM: Leveraging Affect Enriched Embedding and Transformer-based Neural Language Model for Readers' Emotion Detection
Knowledge and Information Systems (KAIS), 2023
Anoop Kadan
Deepak P
Manjary P. Gangan
Savitha Sam Abraham
Lajish V. L.
277
1
0
21 Jan 2023
Language Agnostic Data-Driven Inverse Text Normalization
Interspeech (Interspeech), 2023
Szu-Jui Chen
Debjyoti Paul
Yutong Pang
Peng Su
Xuedong Zhang
101
1
0
20 Jan 2023
BayesSpeech: A Bayesian Transformer Network for Automatic Speech Recognition
Will Rieger
BDL
UQCV
125
0
0
16 Jan 2023
Using External Off-Policy Speech-To-Text Mappings in Contextual End-To-End Automated Speech Recognition
David M. Chan
Shalini Ghosh
Ariya Rastrow
Björn Hoffmeister
OffRL
204
7
0
06 Jan 2023
HIT-SCIR at MMNLU-22: Consistency Regularization for Multilingual Spoken Language Understanding
Bo Zheng
Zhouyang Li
Fuxuan Wei
Qiguang Chen
Libo Qin
Wanxiang Che
153
4
0
05 Jan 2023
Audio-Visual Efficient Conformer for Robust Speech Recognition
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Maxime Burchi
Radu Timofte
VLM
214
50
0
04 Jan 2023
Cramming: Training a Language Model on a Single GPU in One Day
International Conference on Machine Learning (ICML), 2022
Jonas Geiping
Tom Goldstein
MoE
270
103
0
28 Dec 2022
Optimizing Deep Transformers for Chinese-Thai Low-Resource Translation
Wenjie Hao
Hongfei Xu
Lingling Mu
Hongying Zan
MoE
272
4
0
24 Dec 2022
Pushing the performances of ASR models on English and Spanish accents
Pooja Chitkara
M. Rivière
Jade Copet
Frank Zhang
Yatharth Saraf
209
0
0
22 Dec 2022
Uncontrolled Lexical Exposure Leads to Overestimation of Compositional Generalization in Pretrained Models
Najoung Kim
Tal Linzen
P. Smolensky
220
34
0
21 Dec 2022
ORCA: A Challenging Benchmark for Arabic Language Understanding
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
AbdelRahim Elmadany
El Moatez Billah Nagoudi
Muhammad Abdul-Mageed
ELM
298
60
0
21 Dec 2022
Beyond Contrastive Learning: A Variational Generative Model for Multilingual Retrieval
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
John Wieting
J. Clark
William W. Cohen
Graham Neubig
Taylor Berg-Kirkpatrick
284
6
0
21 Dec 2022
Mini-Model Adaptation: Efficiently Extending Pretrained Models to New Languages via Aligned Shallow Training
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Kelly Marchisio
Patrick Lewis
Yihong Chen
Mikel Artetxe
265
28
0
20 Dec 2022
ByGPT5: End-to-End Style-conditioned Poetry Generation with Token-free Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Jonas Belouadi
Steffen Eger
322
30
0
20 Dec 2022
Little Red Riding Hood Goes Around the Globe: Crosslingual Story Planning and Generation with Large Language Models
International Conference on Language Resources and Evaluation (LREC), 2022
E. Razumovskaia
Joshua Maynez
Annie Louis
Mirella Lapata
Shashi Narayan
LRM
220
5
0
20 Dec 2022
SeqDiffuSeq: Text Diffusion with Encoder-Decoder Transformers
Hongyi Yuan
Zheng Yuan
Chuanqi Tan
Fei Huang
Songfang Huang
DiffM
241
83
0
20 Dec 2022
GanLM: Encoder-Decoder Pre-training with an Auxiliary Discriminator
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Jian Yang
Shuming Ma
Li Dong
Shaohan Huang
Haoyang Huang
Yuwei Yin
Dongdong Zhang
Liqun Yang
Furu Wei
Zhoujun Li
SyDa
AI4CE
188
27
0
20 Dec 2022
A Survey on Pretrained Language Models for Neural Code Intelligence
Yichen Xu
Yanqiao Zhu
159
19
0
20 Dec 2022
Joint Speech Transcription and Translation: Pseudo-Labeling with Out-of-Distribution Data
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Mozhdeh Gheini
Tatiana Likhomanenko
Matthias Sperber
Hendra Setiawan
231
7
0
20 Dec 2022
Tokenization Consistency Matters for Generative Models on Extractive NLP Tasks
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Kaiser Sun
Peng Qi
Yuhao Zhang
Lan Liu
William Yang Wang
Zhiheng Huang
163
10
0
19 Dec 2022
Synthetic Pre-Training Tasks for Neural Machine Translation
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Zexue He
Graeme W. Blackwood
Yikang Shen
Julian McAuley
Rogerio Feris
248
6
0
19 Dec 2022
(Psycho-)Linguistic Features Meet Transformer Models for Improved Explainable and Controllable Text Simplification
Yu Qiao
Xiaofei Li
Daniel Wiechmann
E. Kerz
195
5
0
19 Dec 2022
SegAugment: Maximizing the Utility of Speech Translation Data with Segmentation-based Augmentations
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Ioannis Tsiamas
José A. R. Fonollosa
Marta R. Costa-jussá
288
6
0
19 Dec 2022
A Natural Bias for Language Generation Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Clara Meister
Wojciech Stokowiec
Tiago Pimentel
Lei Yu
Laura Rimell
A. Kuncoro
MILM
183
6
0
19 Dec 2022
Large Language Models Meet NL2Code: A Survey
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Daoguang Zan
B. Chen
Fengji Zhang
Di Lu
Bingchao Wu
Bei Guan
Yongji Wang
Jian-Guang Lou
ELM
ALM
240
239
0
19 Dec 2022
WACO: Word-Aligned Contrastive Learning for Speech Translation
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Siqi Ouyang
Rong Ye
Lei Li
336
34
0
19 Dec 2022
AdaTranS: Adapting with Boundary-based Shrinking for End-to-End Speech Translation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Xingshan Zeng
Liangyou Li
Qun Liu
157
6
0
17 Dec 2022
Controlling Styles in Neural Machine Translation with Activation Prompt
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Yifan Wang
Zewei Sun
Shanbo Cheng
Weiguo Zheng
Mingxuan Wang
240
10
0
17 Dec 2022
Planting and Mitigating Memorized Content in Predictive-Text Language Models
C.M. Downey
Wei Dai
Huseyin A. Inan
Kim Laine
Saurabh Naik
Tomasz Religa
PILM
92
2
0
16 Dec 2022
UnitY: Two-pass Direct Speech-to-speech Translation with Discrete Units
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Hirofumi Inaguma
Sravya Popuri
Ilia Kulikov
Peng-Jen Chen
Changhan Wang
Yu-An Chung
Yun Tang
Ann Lee
Shinji Watanabe
J. Pino
320
77
0
15 Dec 2022
CLIPPO: Image-and-Language Understanding from Pixels Only
Computer Vision and Pattern Recognition (CVPR), 2022
Michael Tschannen
Basil Mustafa
N. Houlsby
CLIP
VLM
343
72
0
15 Dec 2022
Advancing Multilingual Pre-training: TRIP Triangular Document-level Pre-training for Multilingual Language Models
Hongyuan Lu
Haoyang Huang
Shuming Ma
Dongdong Zhang
W. Lam
Furu Wei
205
4
0
15 Dec 2022
Fixing MoE Over-Fitting on Low-Resource Languages in Multilingual Machine Translation
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Maha Elbayad
Anna Y. Sun
Shruti Bhosale
MoE
167
15
0
15 Dec 2022
Causes and Cures for Interference in Multilingual Translation
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Uri Shaham
Maha Elbayad
Vedanuj Goswami
Omer Levy
Shruti Bhosale
308
32
0
14 Dec 2022
ERNIE-Code: Beyond English-Centric Cross-lingual Pretraining for Programming Languages
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Yekun Chai
Shuohuan Wang
Chao Pang
Yu Sun
Hao Tian
Hua Wu
238
42
0
13 Dec 2022
Jointly Learning Visual and Auditory Speech Representations from Raw Data
International Conference on Learning Representations (ICLR), 2022
A. Haliassos
Pingchuan Ma
Rodrigo Mira
Stavros Petridis
Maja Pantic
SSL
309
70
0
12 Dec 2022
M3ST: Mix at Three Levels for Speech Translation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Xuxin Cheng
Qianqian Dong
Fengpeng Yue
Tom Ko
Mingxuan Wang
Yuexian Zou
293
40
0
07 Dec 2022
Rethinking the Objectives of Vector-Quantized Tokenizers for Image Synthesis
Computer Vision and Pattern Recognition (CVPR), 2022
Yuchao Gu
Xintao Wang
Yixiao Ge
Ying Shan
Xiaohu Qie
Mike Zheng Shou
DiffM
219
30
0
06 Dec 2022
Document-Level Abstractive Summarization
Gonçalo Raposo
Afonso Raposo
Ana Sofia Carmo
126
3
0
06 Dec 2022
LMEC: Learnable Multiplicative Absolute Position Embedding Based Conformer for Speech Recognition
Yuguang Yang
Yu Pan
Jingjing Yin
Heng Lu
252
4
0
05 Dec 2022