1910.09932
Improving Transformer-based Speech Recognition Using Unsupervised Pre-training
22 October 2019
Dongwei Jiang
Xiaoning Lei
Wubo Li
Ne Luo
Yuxuan Hu
Wei Zou
Xiangang Li
Papers citing
"Improving Transformer-based Speech Recognition Using Unsupervised Pre-training"
50 / 56 papers shown
Title
Contrastive Augmentation: An Unsupervised Learning Approach for Keyword Spotting in Speech Technology
Weinan Dai
Yifeng Jiang
Yuanjing Liu
Jinkun Chen
Xin Sun
Jinglei Tao
SSL
132
1
0
31 Aug 2024
Homophone Disambiguation Reveals Patterns of Context Mixing in Speech Transformers
Hosein Mohebbi
Grzegorz Chrupała
Willem H. Zuidema
Afra Alishahi
144
19
0
15 Oct 2023
Indonesian Automatic Speech Recognition with XLSR-53
Social Science Research Network (SSRN), 2022
Panji Arisaputra
Amalia Zahra
100
10
0
20 Aug 2023
On-Device Constrained Self-Supervised Speech Representation Learning for Keyword Spotting via Knowledge Distillation
Interspeech (Interspeech), 2023
Gene-Ping Yang
Yue Gu
Qingming Tang
Dongsu Du
Yuzong Liu
145
6
0
06 Jul 2023
SAR: Self-Supervised Anti-Distortion Representation for End-To-End Speech Model
IEEE International Joint Conference on Neural Networks (IJCNN), 2023
Jianzong Wang
Xulong Zhang
Haobin Tang
Aolan Sun
Ning Cheng
Jing Xiao
188
1
0
23 Apr 2023
Wav2code: Restore Clean Speech Representations via Codebook Lookup for Noise-Robust ASR
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Yuchen Hu
Cheng Chen
Qiu-shi Zhu
Eng Siong Chng
257
17
0
11 Apr 2023
Self-supervised speech representation learning for keyword-spotting with light-weight transformers
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Chenyang Gao
Yue Gu
Francesco Calivá
Yuzong Liu
OffRL
140
6
0
07 Mar 2023
Dual Learning for Large Vocabulary On-Device ASR
Spoken Language Technology Workshop (SLT), 2023
Cal Peyser
Ronny Huang
Tara N. Sainath
Rohit Prabhavalkar
M. Picheny
K. Cho
SSL
129
1
0
11 Jan 2023
PulseImpute: A Novel Benchmark Task for Pulsative Physiological Signal Imputation
Neural Information Processing Systems (NeurIPS), 2022
Maxwell A. Xu
Alexander Moreno
Supriya Nagesh
V. Aydemir
D. Wetter
Santosh Kumar
James M. Rehg
AI4TS
137
10
0
14 Dec 2022
MelHuBERT: A simplified HuBERT on Mel spectrograms
Automatic Speech Recognition & Understanding (ASRU), 2022
Tzu-Quan Lin
Hung-yi Lee
Hao Tang
SSL
196
18
0
17 Nov 2022
Improving Speech Representation Learning via Speech-level and Phoneme-level Masking Approach
International Conference on Mobile Ad-hoc and Sensor Networks (MSN), 2022
Xulong Zhang
Jianzong Wang
Ning Cheng
Kexin Zhu
Jing Xiao
118
1
0
25 Oct 2022
SUPERB @ SLT 2022: Challenge on Generalization and Efficiency of Self-Supervised Speech Representation Learning
Spoken Language Technology Workshop (SLT), 2022
Tzu-hsun Feng
Annie Dong
Ching-Feng Yeh
Shu-Wen Yang
Tzu-Quan Lin
...
Xuankai Chang
Shinji Watanabe
Abdel-rahman Mohamed
Shang-Wen Li
Hung-yi Lee
ELM
SSL
196
38
0
16 Oct 2022
Transfer Learning of wav2vec 2.0 for Automatic Lyric Transcription
International Society for Music Information Retrieval Conference (ISMIR), 2022
Longshen Ou
Xiangming Gu
Ye Wang
157
24
0
20 Jul 2022
MET: Masked Encoding for Tabular Data
Kushal Majmundar
Sachin Goyal
Praneeth Netrapalli
Prateek Jain
LMTD
110
0
0
17 Jun 2022
Speaker Identification using Speech Recognition
Syeda Rabia Arshad
Syed Mujtaba Haider
Abdul Basit Mughal
96
1
0
29 May 2022
Speech Augmentation Based Unsupervised Learning for Keyword Spotting
IEEE International Joint Conference on Neural Networks (IJCNN), 2022
Jian Luo
Jianzong Wang
Ning Cheng
Haobin Tang
Jing Xiao
SSL
136
2
0
28 May 2022
Adaptive multilingual speech recognition with pretrained models
Interspeech (Interspeech), 2022
Ngoc-Quan Pham
A. Waibel
Jan Niehues
VLM
141
24
0
24 May 2022
Self-Supervised Speech Representation Learning: A Review
IEEE Journal on Selected Topics in Signal Processing (IEEE JSTSP), 2022
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
574
433
0
21 May 2022
Audio Self-supervised Learning: A Survey
Patterns (Patterns), 2022
Shuo Liu
Adria Mallol-Ragolta
Emilia Parada-Cabaleiro
Kun Qian
Xingshuo Jing
Alexander Kathan
Bin Hu
Bjoern W. Schuller
SSL
206
125
0
02 Mar 2022
A Brief Overview of Unsupervised Neural Speech Representation Learning
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
Lars Maaløe
Christian Igel
BDL
AI4TS
SSL
207
13
0
01 Mar 2022
A Noise-Robust Self-supervised Pre-training Model Based Speech Representation Learning for Automatic Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Qiu-shi Zhu
Jie Zhang
Zi-qiang Zhang
Ming Wu
Xin Fang
Lirong Dai
261
51
0
22 Jan 2022
Lacuna Reconstruction: Self-supervised Pre-training for Low-Resource Historical Document Transcription
Nikolai Vogler
J. Allen
M. Miller
Taylor Berg-Kirkpatrick
89
5
0
16 Dec 2021
SLUE: New Benchmark Tasks for Spoken Language Understanding Evaluation on Natural Speech
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Suwon Shon
Ankita Pasad
Felix Wu
Pablo Brusco
Yoav Artzi
Karen Livescu
Kyu Jeong Han
AuLLM
ELM
219
90
0
19 Nov 2021
SCaLa: Supervised Contrastive Learning for End-to-End Speech Recognition
Interspeech (Interspeech), 2021
Li Fu
Xiaoxiao Li
Runyu Wang
Lu Fan
Zhengchen Zhang
Meng Chen
Youzheng Wu
Xiaodong He
SSL
140
3
0
08 Oct 2021
CTAL: Pre-training Cross-modal Transformer for Audio-and-Language Representations
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Hang Li
Yunxing Kang
Tianqiao Liu
Wenbiao Ding
Zitao Liu
142
20
0
01 Sep 2021
CLSRIL-23: Cross Lingual Speech Representations for Indic Languages
Anirudh Gupta
Harveen Singh Chadha
Priyanshi Shah
Neeraj Chhimwal
Ankur Dhuriya
Rishabh Gaur
Vivek Raghavan
122
41
0
15 Jul 2021
Dropout Regularization for Self-Supervised Learning of Transformer Encoder Speech Representation
Interspeech (Interspeech), 2021
Jian Luo
Jianzong Wang
Ning Cheng
Jing Xiao
SSL
135
6
0
09 Jul 2021
Low Resource German ASR with Untranscribed Data Spoken by Non-native Children -- INTERSPEECH 2021 Shared Task SPAPL System
Interspeech (Interspeech), 2021
Jinhan Wang
Yunzheng Zhu
Ruchao Fan
Wei Chu
Abeer Alwan
90
8
0
18 Jun 2021
Speech BERT Embedding For Improving Prosody in Neural TTS
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Liping Chen
Yan Deng
Xi Wang
Frank Soong
Lei He
185
25
0
08 Jun 2021
Unsupervised Speech Recognition
Neural Information Processing Systems (NeurIPS), 2021
Alexei Baevski
Wei-Ning Hsu
Alexis Conneau
Michael Auli
SSL
337
292
0
24 May 2021
Improving speech recognition models with small samples for air traffic control systems
Neurocomputing (Neurocomputing), 2021
Yi Lin
Qin Li
Bo Yang
Zhen Yan
Huachun Tan
Zhengmao Chen
166
33
0
16 Feb 2021
Bi-APC: Bidirectional Autoregressive Predictive Coding for Unsupervised Pre-training and Its Application to Children's ASR
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Ruchao Fan
Amber Afshan
Abeer Alwan
124
14
0
12 Feb 2021
Efficiently Fusing Pretrained Acoustic and Linguistic Encoders for Low-resource Speech Recognition
IEEE Signal Processing Letters (IEEE SPL), 2021
Cheng Yi
Shiyu Zhou
Bo Xu
152
44
0
17 Jan 2021
Applying Wav2vec2.0 to Speech Recognition in Various Low-resource Languages
Cheng Yi
Jianzhong Wang
Ning Cheng
Shiyu Zhou
Bo Xu
SSL
VLM
147
87
0
22 Dec 2020
Sequence-to-Sequence Contrastive Learning for Text Recognition
Computer Vision and Pattern Recognition (CVPR), 2020
Aviad Aberdam
Ron Litman
Shahar Tsiper
Oron Anschel
Ron Slossberg
Shai Mazor
R. Manmatha
Pietro Perona
202
121
0
20 Dec 2020
DeCoAR 2.0: Deep Contextualized Acoustic Representations with Vector Quantization
Shaoshi Ling
Yuzong Liu
140
112
0
11 Dec 2020
Exploring wav2vec 2.0 on speaker verification and language identification
Interspeech (Interspeech), 2020
Zhiyun Fan
Meng Li
Shiyu Zhou
Bo Xu
224
223
0
11 Dec 2020
Towards Semi-Supervised Semantics Understanding from Speech
Cheng-I Jeff Lai
Jin Cao
S. Bodapati
Shang-Wen Li
SSL
165
7
0
11 Nov 2020
Learning Contextual Tag Embeddings for Cross-Modal Alignment of Audio and Tags
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
Xavier Favory
Konstantinos Drossos
Maria Sandsten
Xavier Serra
195
16
0
27 Oct 2020
Speech SIMCLR: Combining Contrastive and Reconstruction Objective for Self-supervised Speech Representation Learning
Interspeech (Interspeech), 2020
Dongwei Jiang
Wubo Li
Miao Cao
Wei Zou
Xiangang Li
SSL
254
71
0
27 Oct 2020
Similarity Analysis of Self-Supervised Speech Representations
Yu-An Chung
Yonatan Belinkov
James R. Glass
SSL
330
44
0
22 Oct 2020
Self-training and Pre-training are Complementary for Speech Recognition
Qiantong Xu
Alexei Baevski
Tatiana Likhomanenko
Paden Tomasello
Alexis Conneau
R. Collobert
Gabriel Synnaeve
Michael Auli
SSL
VLM
243
176
0
22 Oct 2020
A Transformer Based Pitch Sequence Autoencoder with MIDI Augmentation
Mingshuo Ding
Yi Ma
123
1
0
15 Oct 2020
Representation Learning for Sequence Data with Deep Autoencoding Predictive Components
Junwen Bai
Weiran Wang
Yingbo Zhou
Caiming Xiong
SSL
AI4TS
169
12
0
07 Oct 2020
Transformer with Bidirectional Decoder for Speech Recognition
Interspeech (Interspeech), 2020
Xi Chen
Songyang Zhang
Dandan Song
P. Ouyang
Shouyi Yin
111
15
0
11 Aug 2020
Transformer based unsupervised pre-training for acoustic representation learning
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
Ruixiong Zhang
Haiwei Wu
Wubo Li
Dongwei Jiang
Wei Zou
Xiangang Li
SSL
ViT
223
30
0
29 Jul 2020
TERA: Self-Supervised Learning of Transformer Encoder Representation for Speech
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2020
Andy T. Liu
Shang-Wen Li
Hung-yi Lee
SSL
484
389
0
12 Jul 2020
Unsupervised Cross-lingual Representation Learning for Speech Recognition
Interspeech (Interspeech), 2020
Alexis Conneau
Alexei Baevski
R. Collobert
Abdel-rahman Mohamed
Michael Auli
SSL
301
900
0
24 Jun 2020
Embodied Self-supervised Learning by Coordinated Sampling and Training
Yifan Sun
Xihong Wu
SSL
116
9
0
20 Jun 2020
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations
Alexei Baevski
Henry Zhou
Abdel-rahman Mohamed
Michael Auli
SSL
1.1K
7,195
0
20 Jun 2020