SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing

19 August 2018
Taku Kudo
John Richardson
ArXiv (abs) | PDF | HTML | GitHub (10925★)

Papers citing "SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing"

50 / 2,061 papers shown
Controllable Sentence Simplification
International Conference on Language Resources and Evaluation (LREC), 2019
Louis Martin
Benoît Sagot
Eric Villemonte de la Clergerie
Antoine Bordes
187
158
0
07 Oct 2019
Improving Word Embedding Factorization for Compression Using Distilled Nonlinear Neural Decomposition
Vasileios Lioutas
Ahmad Rashid
Krtin Kumar
Md. Akmal Haidar
Mehdi Rezagholizadeh
201
9
0
02 Oct 2019
Simple and Effective Paraphrastic Similarity from Parallel Translations
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
John Wieting
Kevin Gimpel
Graham Neubig
Taylor Berg-Kirkpatrick
204
51
0
30 Sep 2019
The Source-Target Domain Mismatch Problem in Machine Translation
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2019
Jiajun Shen
Peng-Jen Chen
Matt Le
Junxian He
Jiatao Gu
Myle Ott
Michael Auli
Marc'Aurelio Ranzato
206
26
0
28 Sep 2019
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
International Conference on Learning Representations (ICLR), 2019
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
SSL AIMat
1.1K
7,069
0
26 Sep 2019
Portuguese Named Entity Recognition using BERT-CRF
Fábio Souza
Rodrigo Nogueira
R. Lotufo
233
277
0
23 Sep 2019
Self-Training for End-to-End Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019
Jacob Kahn
Ann Lee
Awni Y. Hannun
SSL
204
249
0
19 Sep 2019
Simple, Scalable Adaptation for Neural Machine Translation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
Ankur Bapna
N. Arivazhagan
Orhan Firat
AI4CE
294
440
0
18 Sep 2019
Harnessing Indirect Training Data for End-to-End Automatic Speech Translation: Tricks of the Trade
J. Pino
Liezl Puzon
Jiatao Gu
Xutai Ma
Arya D. McCarthy
D. Gopinath
115
3
0
14 Sep 2019
A Comparative Study on Transformer vs RNN in Speech Applications
Automatic Speech Recognition & Understanding (ASRU), 2019
Shigeki Karita
Nanxin Chen
Tomoki Hayashi
Takaaki Hori
Hirofumi Inaguma
...
Ryuichi Yamamoto
Xiao-fei Wang
Shinji Watanabe
Takenori Yoshimura
Wangyou Zhang
252
778
0
13 Sep 2019
Interactive Fiction Games: A Colossal Adventure
AAAI Conference on Artificial Intelligence (AAAI), 2019
Matthew J. Hausknecht
Prithviraj Ammanabrolu
Marc-Alexandre Côté
Xingdi Yuan
LLMAG LM&Ro AI4CE
322
222
0
11 Sep 2019
Transfer Learning Robustness in Multi-Class Categorization by Fine-Tuning Pre-Trained Contextualized Language Models
Xinyi Liu
A. Wangperawong
175
4
0
08 Sep 2019
Neural Machine Translation with Byte-Level Subwords
AAAI Conference on Artificial Intelligence (AAAI), 2019
Changhan Wang
Dong Wang
Jiatao Gu
189
200
0
07 Sep 2019
Investigating Multilingual NMT Representations at Scale
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
Sneha Kudugunta
Ankur Bapna
Isaac Caswell
N. Arivazhagan
Orhan Firat
LRM
332
130
0
05 Sep 2019
Subword Language Model for Query Auto-Completion
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
Gyuwan Kim
121
16
0
02 Sep 2019
Evaluating the Cross-Lingual Effectiveness of Massively Multilingual Neural Machine Translation
AAAI Conference on Artificial Intelligence (AAAI), 2019
Aditya Siddhant
Melvin Johnson
Henry Tsai
N. Arivazhagan
Jason Riesa
Ankur Bapna
Orhan Firat
Karthik Raman
148
73
0
01 Sep 2019
Repurposing Decoder-Transformer Language Models for Abstractive Summarization
Luke de Oliveira
Alfredo Láinez Rodrigo
86
5
0
01 Sep 2019
Differentiable Product Quantization for End-to-End Embedding Compression
International Conference on Machine Learning (ICML), 2019
Ting Chen
Lala Li
Luke Huan
MQ
186
75
0
26 Aug 2019
uniblock: Scoring and Filtering Corpus with Unicode Block Information
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
Yingbo Gao
Weiyue Wang
Hermann Ney
109
1
0
26 Aug 2019
Denoising based Sequence-to-Sequence Pre-training for Text Generation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
Liang Wang
Wei Zhao
Ruoyu Jia
Sujian Li
Jingming Liu
VLM AI4CE
155
40
0
22 Aug 2019
Prosodic Phrase Alignment for Machine Dubbing
Interspeech (Interspeech), 2019
A. Oktem
Mireia Farrús
Antonio Bonafonte
143
32
0
20 Aug 2019
Latent-Variable Non-Autoregressive Neural Machine Translation with Deterministic Inference Using a Delta Posterior
AAAI Conference on Artificial Intelligence (AAAI), 2019
Raphael Shu
Jason D. Lee
Hideki Nakayama
Dong Wang
BDL
391
122
0
20 Aug 2019
English-Czech Systems in WMT19: Document-Level Transformer
Conference on Machine Translation (WMT), 2019
Martin Popel
Dominik Machácek
Michal Auersperger
Ondrej Bojar
Pavel Pecina
122
22
0
30 Jul 2019
Leveraging Pre-trained Checkpoints for Sequence Generation Tasks
Transactions of the Association for Computational Linguistics (TACL), 2019
S. Rothe
Shashi Narayan
Aliaksei Severyn
SILM
306
457
0
29 Jul 2019
Supervised and Unsupervised Neural Approaches to Text Readability
International Conference on Computational Logic (ICCL), 2019
Matej Martinc
Senja Pollak
Marko Robnik-Šikonja
423
160
0
26 Jul 2019
Naver Labs Europe's Systems for the WMT19 Machine Translation Robustness Task
Conference on Machine Translation (WMT), 2019
Alexandre Berard
Ioan Calapodescu
Claude Roux
VLM
139
62
0
15 Jul 2019
Microsoft Translator at WMT 2019: Towards Large-Scale Document-Level Neural Machine Translation
Conference on Machine Translation (WMT), 2019
Marcin Junczys-Dowmunt
158
167
0
14 Jul 2019
The University of Edinburgh's Submissions to the WMT19 News Translation Task
Conference on Machine Translation (WMT), 2019
Rachel Bawden
Nikolay Bogoychev
Ulrich Germann
Roman Grundkiewicz
Faheem Kirefu
Antonio Valerio Miceli Barone
Alexandra Birch
109
33
0
12 Jul 2019
Massively Multilingual Neural Machine Translation in the Wild: Findings and Challenges
N. Arivazhagan
Ankur Bapna
Orhan Firat
Dmitry Lepikhin
Melvin Johnson
...
George F. Foster
Colin Cherry
Wolfgang Macherey
Zhiwen Chen
Yonghui Wu
219
447
0
11 Jul 2019
WikiMatrix: Mining 135M Parallel Sentences in 1620 Language Pairs from Wikipedia
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2019
Holger Schwenk
Vishrav Chaudhary
Shuo Sun
Hongyu Gong
Francisco Guzmán
CVBM
347
423
0
10 Jul 2019
ReQA: An Evaluation for End-to-End Answer Retrieval Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
Amin Ahmad
Noah Constant
Yinfei Yang
Daniel Cer
RALM
230
55
0
10 Jul 2019
Multilingual Universal Sentence Encoder for Semantic Retrieval
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
Yinfei Yang
Daniel Cer
Amin Ahmad
Mandy Guo
Jax Law
...
Steve Yuan
Chris Tar
Yun-hsuan Sung
B. Strope
R. Kurzweil
3DV
217
515
0
09 Jul 2019
NTT's Machine Translation Systems for WMT19 Robustness Task
Conference on Machine Translation (WMT), 2019
Soichiro Murakami
Makoto Morishita
Tsutomu Hirao
Masaaki Nagata
VLM
134
9
0
09 Jul 2019
Applying a Pre-trained Language Model to Spanish Twitter Humor Prediction
Bobak Farzin
Piotr Czapla
Jeremy Howard
88
7
0
06 Jul 2019
How we do things with words: Analyzing text as social and cultural data
Frontiers in Artificial Intelligence (FAI), 2019
D. Nguyen
Maria Liakata
Simon DeDeo
Jacob Eisenstein
David M. Mimno
Rebekah Tromble
J. Winters
167
96
0
02 Jul 2019
A Neural Grammatical Error Correction System Built On Better Pre-training and Sequential Transfer Learning
Yo Joong Choe
Jiyeon Ham
Kyubyong Park
Yeoil Yoon
122
88
0
02 Jul 2019
Findings of the First Shared Task on Machine Translation Robustness
Conference on Machine Translation (WMT), 2019
Xian Li
Paul Michel
Antonios Anastasopoulos
Yonatan Belinkov
Nadir Durrani
Philipp Koehn
Graham Neubig
J. Pino
Hassan Sajjad
191
64
0
27 Jun 2019
Conversational Response Re-ranking Based on Event Causality and Role Factored Tensor Event Embedding
Shohei Tanaka
Koichiro Yoshino
Katsuhito Sudoh
Satoshi Nakamura
139
4
0
24 Jun 2019
XLNet: Generalized Autoregressive Pretraining for Language Understanding
Neural Information Processing Systems (NeurIPS), 2019
Zhilin Yang
Zihang Dai
Yiming Yang
J. Carbonell
Ruslan Salakhutdinov
Quoc V. Le
AI4CE
869
9,044
0
19 Jun 2019
A Focus on Neural Machine Translation for African Languages
Laura Martinus
Jade Z. Abbott
139
41
0
11 Jun 2019
Word-level Speech Recognition with a Letter to Word Encoder
R. Collobert
Awni Y. Hannun
Gabriel Synnaeve
3DV
230
4
0
10 Jun 2019
The University of Helsinki submissions to the WMT19 news translation task
Conference on Machine Translation (WMT), 2019
Aarne Talman
U. Sulubacak
Ananda Sreenidhi
Yves Scherrer
Sami Virpioja
Alessandro Raganato
A. Hurskainen
Jörg Tiedemann
VLM
96
7
0
10 Jun 2019
Sequence Tagging with Contextual and Non-Contextual Subword Representations: A Multilingual Evaluation
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
Benjamin Heinzerling
Michael Strube
125
37
0
04 Jun 2019
Hierarchical Transformers for Multi-Document Summarization
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
Yang Liu
Mirella Lapata
245
309
0
30 May 2019
An Investigation of Transfer Learning-Based Sentiment Analysis in Japanese
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
Enkhbold Bataa
Joshua Wu
138
34
0
23 May 2019
Target Conditioned Sampling: Optimizing Data Selection for Multilingual Neural Machine Translation
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
Xinyi Wang
Graham Neubig
159
27
0
20 May 2019
Transformers with convolutional context for ASR
Abdel-rahman Mohamed
Dmytro Okhonko
Luke Zettlemoyer
184
172
0
26 Apr 2019
Importance of Copying Mechanism for News Headline Generation
I. Gusev
106
10
0
25 Apr 2019
Sequence-to-Sequence Speech Recognition with Time-Depth Separable Convolutions
Awni Y. Hannun
Ann Lee
Qiantong Xu
R. Collobert
147
104
0
04 Apr 2019
A Large-Scale Multi-Length Headline Corpus for Analyzing Length-Constrained Headline Generation Model Evaluation
Yuta Hitomi
Yuya Taguchi
Hideaki Tamori
Ko Kikuta
Jiro Nishitoba
Naoaki Okazaki
Kentaro Inui
Manabu Okumura
214
9
0
28 Mar 2019