SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing

19 August 2018
Taku Kudo
John Richardson

Papers citing "SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing"

50 / 1,923 papers shown
Title
Toward Joint Language Modeling for Speech Units and Text
Ju-Chieh Chou
Chung-Ming Chien
Wei-Ning Hsu
Karen Livescu
Arun Babu
Alexis Conneau
Alexei Baevski
Michael Auli
VLM
28
20
0
12 Oct 2023
InstructRetro: Instruction Tuning post Retrieval-Augmented Pretraining
Wei Ping
Ming-Yu Liu
Lawrence C. McAfee
Peng Xu
Bo Li
Mohammad Shoeybi
Bryan Catanzaro
RALM
21
47
0
11 Oct 2023
MatFormer: Nested Transformer for Elastic Inference
Devvrit
Sneha Kudugunta
Aditya Kusupati
Tim Dettmers
Kaifeng Chen
...
Yulia Tsvetkov
Hannaneh Hajishirzi
Sham Kakade
Ali Farhadi
Prateek Jain
44
23
0
11 Oct 2023
An Empirical Study of Instruction-tuning Large Language Models in Chinese
Q. Si
Tong Wang
Zheng Lin
Xu Zhang
Yanan Cao
Weiping Wang
ALM
74
16
0
11 Oct 2023
On the Impact of Cross-Domain Data on German Language Models
Amin Dada
Aokun Chen
C.A.I. Peng
Kaleb E. Smith
Ahmad Idrissi-Yaghir
...
Daniel Truhn
Jan Egger
Jiang Bian
Jens Kleesiek
Yonghui Wu
19
5
0
11 Oct 2023
BioT5: Enriching Cross-modal Integration in Biology with Chemical Knowledge and Natural Language Associations
Qizhi Pei
Wei Zhang
Jinhua Zhu
Kehan Wu
Kaiyuan Gao
Lijun Wu
Yingce Xia
Rui Yan
36
65
0
11 Oct 2023
Acoustic Model Fusion for End-to-end Speech Recognition
Zhihong Lei
Mingbin Xu
Shiyi Han
Leo Liu
Zhen Huang
...
Yuanyuan Zhang
Ernest Pusateri
Mirko Hannemann
Yaqiao Deng
Man-Hung Siu
29
5
0
10 Oct 2023
No Pitch Left Behind: Addressing Gender Unbalance in Automatic Speech Recognition through Pitch Manipulation
Dennis Fucci
Marco Gaido
Matteo Negri
Mauro Cettolo
L. Bentivogli
34
5
0
10 Oct 2023
Task-Adaptive Tokenization: Enhancing Long-Form Text Generation Efficacy in Mental Health and Beyond
Siyang Liu
Naihao Deng
Sahand Sabour
Yilin Jia
Minlie Huang
Rada Mihalcea
38
18
0
09 Oct 2023
Neural Language Model Pruning for Automatic Speech Recognition
Leonardo Emili
Thiago Fraga-Silva
Ernest Pusateri
M. Nußbaum-Thom
Youssef Oualil
41
1
0
05 Oct 2023
Kosmos-G: Generating Images in Context with Multimodal Large Language Models
Xichen Pan
Li Dong
Shaohan Huang
Zhiliang Peng
Wenhu Chen
Furu Wei
VLM
11
62
0
04 Oct 2023
ResidualTransformer: Residual Low-Rank Learning with Weight-Sharing for Transformer Layers
Yiming Wang
Jinyu Li
20
4
0
03 Oct 2023
Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns
Brian DuSell
David Chiang
33
12
0
03 Oct 2023
One model to rule them all ? Towards End-to-End Joint Speaker Diarization and Speech Recognition
Samuele Cornell
Jee-weon Jung
Shinji Watanabe
S. Squartini
VLM
32
16
0
02 Oct 2023
Unlikelihood Tuning on Negative Samples Amazingly Improves Zero-Shot Translation
Junjie Yang
Liang Ding
Li Shen
Matthieu Labeau
Yibing Zhan
Weifeng Liu
Dacheng Tao
VLM
43
4
0
28 Sep 2023
Transformer-VQ: Linear-Time Transformers via Vector Quantization
Albert Mohwald
36
15
0
28 Sep 2023
Zero-Shot and Few-Shot Video Question Answering with Multi-Modal Prompts
Bipin Rajendran
Bashir M. Al-Hashimi
MLLM
VLM
37
2
0
27 Sep 2023
Cross-Modal Multi-Tasking for Speech-to-Text Translation via Hard Parameter Sharing
B. Grimstad
Xuankai Chang
Antonios Anastasopoulos
Yuya Fujita
Shinji Watanabe
32
2
0
27 Sep 2023
Enhancing End-to-End Conversational Speech Translation Through Target Language Context Utilization
A. Hussein
Brian Yan
Antonios Anastasopoulos
Shinji Watanabe
Sanjeev Khudanpur
46
3
0
27 Sep 2023
Speech collage: code-switched audio generation by collaging monolingual corpora
A. Hussein
Dorsa Zeinali
Ondřej Klejch
Sanjeev Khudanpur
Brian Yan
Shammur A. Chowdhury
Ahmed M. Ali
Shinji Watanabe
Sanjeev Khudanpur
27
1
0
27 Sep 2023
Direct Models for Simultaneous Translation and Automatic Subtitling: FBK@IWSLT2023
Sara Papi
Marco Gaido
Matteo Negri
48
7
0
27 Sep 2023
Segmentation-Free Streaming Machine Translation
Javier Iranzo-Sánchez
Jorge Iranzo-Sánchez
Adria Giménez
Jorge Civera Saiz
Alfons Juan
VOS
31
1
0
26 Sep 2023
Small-scale proxies for large-scale Transformer training instabilities
Mitchell Wortsman
Peter J. Liu
Lechao Xiao
Katie Everett
A. Alemi
...
Jascha Narain Sohl-Dickstein
Kelvin Xu
Jaehoon Lee
Justin Gilmer
Simon Kornblith
40
86
0
25 Sep 2023
Reproducing Whisper-Style Training Using an Open-Source Toolkit and Publicly Available Data
Yifan Peng
Jinchuan Tian
Brian Yan
Dan Berrebbi
Xuankai Chang
...
Yui Sudo
Muhammad Shakeel
Jee-weon Jung
Soumi Maiti
Shinji Watanabe
VLM
44
36
0
25 Sep 2023
Importance of Smoothness Induced by Optimizers in FL4ASR: Towards Understanding Federated Learning for End-to-End ASR
Sheikh Shams Azam
Tatiana Likhomanenko
Martin Pelikan
Jan Honza Silovsky
34
6
0
22 Sep 2023
Domain Adaptation for Arabic Machine Translation: The Case of Financial Texts
Emad A. Alghamdi
Jezia Zakraoui
Fares A. Abanmy
36
1
0
22 Sep 2023
JCoLA: Japanese Corpus of Linguistic Acceptability
Taiga Someya
Yushi Sugimoto
Yohei Oseki
37
5
0
22 Sep 2023
Exploring the Impact of Training Data Distribution and Subword Tokenization on Gender Bias in Machine Translation
Bar Iluz
Tomasz Limisiewicz
Gabriel Stanovsky
David Mareček
40
3
0
21 Sep 2023
Kosmos-2.5: A Multimodal Literate Model
Tengchao Lv
Yupan Huang
Jingye Chen
Lei Cui
Shuming Ma
...
Weiyao Luo
Shaoxiang Wu
Guoxin Wang
Cha Zhang
Furu Wei
VLM
MLLM
39
64
0
20 Sep 2023
Sequence-to-Sequence Spanish Pre-trained Language Models
Vladimir Araujo
Maria Mihaela Truşcǎ
Rodrigo Tufino
Marie-Francine Moens
37
2
0
20 Sep 2023
The Languini Kitchen: Enabling Language Modelling Research at Different Scales of Compute
Aleksandar Stanić
Dylan R. Ashley
Oleg Serikov
Louis Kirsch
Francesco Faccio
Jürgen Schmidhuber
Thomas Hofmann
Imanol Schlag
MoE
48
9
0
20 Sep 2023
MBR and QE Finetuning: Training-time Distillation of the Best and Most Expensive Decoding Methods
M. Finkelstein
Subhajit Naskar
Mehdi Mirzazadeh
Apurva Shah
Markus Freitag
53
26
0
19 Sep 2023
A Family of Pretrained Transformer Language Models for Russian
Dmitry Zmitrovich
Alexander Abramov
Andrey Kalmykov
Maria Tikhonova
Ekaterina Taktasheva
...
Vitalii Kadulin
Sergey Markov
Tatiana Shavrina
Vladislav Mikhailov
Alena Fenogenova
33
26
0
19 Sep 2023
Discrete Audio Representation as an Alternative to Mel-Spectrograms for Speaker and Speech Recognition
Krishna C. Puvvada
Nithin Rao Koluguri
Kunal Dhawan
Jagadeesh Balam
Boris Ginsburg
39
13
0
19 Sep 2023
Language Modeling Is Compression
Grégoire Delétang
Anian Ruoss
Paul-Ambroise Duquenne
Elliot Catt
Tim Genewein
...
Wenliang Kevin Li
Matthew Aitchison
Laurent Orseau
Marcus Hutter
J. Veness
AI4CE
53
133
0
19 Sep 2023
Nebula: Self-Attention for Dynamic Malware Analysis
Dmitrijs Trizna
Christian Scano
Battista Biggio
Fabio Roli
24
13
0
19 Sep 2023
Baichuan 2: Open Large-scale Language Models
Ai Ming Yang
Bin Xiao
Bingning Wang
Borong Zhang
Ce Bian
...
Youxin Jiang
Yuchen Gao
Yupeng Zhang
Zenan Zhou
Zhiying Wu
ELM
LRM
77
712
0
19 Sep 2023
Adapting Large Language Models via Reading Comprehension
Daixuan Cheng
Shaohan Huang
Furu Wei
CLL
SyDa
AI4CE
32
34
0
18 Sep 2023
Improved Factorized Neural Transducer Model For text-only Domain Adaptation
Jing Liu
Jianwei Yu
Xie Chen
48
1
0
18 Sep 2023
How Transferable are Attribute Controllers on Pretrained Multilingual Translation Models?
Danni Liu
Jan Niehues
18
3
0
15 Sep 2023
Visual Speech Recognition for Languages with Limited Labeled Data using Automatic Labels from Whisper
Jeong Hun Yeo
Minsu Kim
Shinji Watanabe
Y. Ro
VLM
34
12
0
15 Sep 2023
CoCA: Fusing Position Embedding with Collinear Constrained Attention in Transformers for Long Context Window Extending
Shiyi Zhu
Jingting Ye
Wei Jiang
Siqiao Xue
Qi Zhang
Yifan Wu
Jianguo Li
32
4
0
15 Sep 2023
Folding Attention: Memory and Power Optimization for On-Device Transformer-based Streaming Speech Recognition
Yang Li
Liangzhen Lai
Shangguan Yuan
Forrest N. Iandola
Zhaoheng Ni
Ernie Chang
Yangyang Shi
Vikas Chandra
34
2
0
14 Sep 2023
Incorporating Class-based Language Model for Named Entity Recognition in Factorized Neural Transducer
Peng Wang
Yifan Yang
Zheng Liang
Tian Tan
Shiliang Zhang
Xie Chen
23
0
0
14 Sep 2023
Voxtlm: unified decoder-only models for consolidating speech recognition/synthesis and speech/text continuation tasks
Soumi Maiti
Yifan Peng
Shukjae Choi
Jee-weon Jung
Xuankai Chang
Shinji Watanabe
VLM
AuLLM
29
58
0
14 Sep 2023
The first step is the hardest: Pitfalls of Representing and Tokenizing Temporal Data for Large Language Models
Dimitris Spathis
F. Kawsar
AI4TS
41
18
0
12 Sep 2023
AstroLLaMA: Towards Specialized Foundation Models in Astronomy
Tuan Dung Nguyen
Yuan-Sen Ting
I. Ciucă
Charlie O'Neill
Ze-Chang Sun
...
Alberto Accomazzi
J. P. Naiman
Jesse Cranney
Kevin Schawinski
UniverseTBD
16
20
0
12 Sep 2023
LeBenchmark 2.0: a Standardized, Replicable and Enhanced Framework for Self-supervised Representations of French Speech
Titouan Parcollet
H. Nguyen
Solène Evain
Marcely Zanon Boito
Adrien Pupier
...
François Portet
Solange Rossato
F. Ringeval
D. Schwab
Laurent Besacier
45
15
0
11 Sep 2023
MADLAD-400: A Multilingual And Document-Level Large Audited Dataset
Sneha Kudugunta
Isaac Caswell
Biao Zhang
Xavier Garcia
Christopher A. Choquette-Choo
...
Derrick Xin
Aditya Kusupati
Romi Stella
Ankur Bapna
Orhan Firat
75
120
0
09 Sep 2023
Mask-CTC-based Encoder Pre-training for Streaming End-to-End Speech Recognition
Huaibo Zhao
Yosuke Higuchi
Yusuke Kida
Tetsuji Ogawa
Tetsunori Kobayashi
28
1
0
09 Sep 2023