1904.03240
An Unsupervised Autoregressive Model for Speech Representation Learning
5 April 2019
Yu-An Chung
Wei-Ning Hsu
Hao Tang
James R. Glass
SSL
Papers citing
"An Unsupervised Autoregressive Model for Speech Representation Learning"
50 / 269 papers shown
Title
Boosting Cross-Domain Speech Recognition with Self-Supervision
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Hanjing Zhu
Gaofeng Cheng
Yongfeng Zhang
Wenxin Hou
Pengyuan Zhang
Yonghong Yan
262
21
0
20 Jun 2022
DRAFT: A Novel Framework to Reduce Domain Shifting in Self-supervised Learning and Its Application to Children's ASR
Interspeech (Interspeech), 2022
Ruchao Fan
Abeer Alwan
169
37
0
16 Jun 2022
A Comprehensive Survey on Deep Clustering: Taxonomy, Challenges, and Future Directions
ACM Computing Surveys (ACM CSUR), 2022
Sheng Zhou
Hongjia Xu
Zhuonan Zheng
Jiawei Chen
Zhao Li
Jiajun Bu
Jia Wu
Xin Eric Wang
Wenwu Zhu
Martin Ester
208
148
0
15 Jun 2022
Investigation of Ensemble features of Self-Supervised Pretrained Models for Automatic Speech Recognition
Interspeech (Interspeech), 2022
Anjana Arunkumar
Vrunda N. Sukhadia
S. Umesh
146
17
0
11 Jun 2022
Speak Like a Dog: Human to Non-human creature Voice Conversion
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2022
Kohei Suzuki
Shoki Sakamoto
T. Taniguchi
Hirokazu Kameoka
107
4
0
09 Jun 2022
Joint Encoder-Decoder Self-Supervised Pre-training for ASR
Interspeech (Interspeech), 2022
Arunkumar A
S. Umesh
SSL
112
9
0
09 Jun 2022
Speech Augmentation Based Unsupervised Learning for Keyword Spotting
IEEE International Joint Conference on Neural Networks (IJCNN), 2022
Jian Luo
Jianzong Wang
Ning Cheng
Haobin Tang
Jing Xiao
SSL
140
2
0
28 May 2022
Self-supervised models of audio effectively explain human cortical responses to speech
International Conference on Machine Learning (ICML), 2022
Aditya R. Vaidya
Shailee Jain
Alexander G. Huth
177
68
0
27 May 2022
Contrastive Siamese Network for Semi-supervised Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
S. Khorram
Jaeyoung Kim
Anshuman Tripathi
Han Lu
Qian Zhang
Hasim Sak
SSL
180
16
0
27 May 2022
Joint Training of Speech Enhancement and Self-supervised Model for Noise-robust ASR
Qiu-shi Zhu
Jie Zhang
Zitian Zhang
Lirong Dai
170
18
0
26 May 2022
Self-Supervised Speech Representation Learning: A Review
IEEE Journal on Selected Topics in Signal Processing (IEEE JSTSP), 2022
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
578
435
0
21 May 2022
Deploying self-supervised learning in the wild for hybrid automatic speech recognition
Mostafa Karimi
Changliang Liu
K. Kumatani
Yao Qian
Tianyu Wu
Jian Wu
121
3
0
17 May 2022
Silence is Sweeter Than Speech: Self-Supervised Model Using Silence to Store Speaker Information
Chiyu Feng
Po-Chun Hsu
Hung-yi Lee
SSL
142
9
0
08 May 2022
Sound Localization by Self-Supervised Time Delay Estimation
European Conference on Computer Vision (ECCV), 2022
Ziyang Chen
David Fouhey
Andrew Owens
SSL
198
23
0
26 Apr 2022
On-demand compute reduction with stochastic wav2vec 2.0
Interspeech (Interspeech), 2022
Apoorv Vyas
Wei-Ning Hsu
Michael Auli
Alexei Baevski
150
13
0
25 Apr 2022
Cross-stitched Multi-modal Encoders
Karan Singla
Daniel Pressel
Ryan Price
Bhargav Srinivas Chinnari
Yeon-Jun Kim
S. Bangalore
135
0
0
20 Apr 2022
ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers
International Conference on Machine Learning (ICML), 2022
Kaizhi Qian
Yang Zhang
Heting Gao
Junrui Ni
Cheng-I Jeff Lai
David D. Cox
M. Hasegawa-Johnson
Shiyu Chang
DRL
161
140
0
20 Apr 2022
DDOS: A MOS Prediction Framework utilizing Domain Adaptive Pre-training and Distribution of Opinion Scores
Interspeech (Interspeech), 2022
Wei-Cheng Tseng
Wei-Tsung Kao
Hung-yi Lee
277
24
0
07 Apr 2022
User-Level Differential Privacy against Attribute Inference Attack of Speech Emotion Recognition in Federated Learning
Interspeech (Interspeech), 2022
Tiantian Feng
Raghuveer Peri
Shrikanth Narayanan
FedML
174
36
0
05 Apr 2022
Repeat after me: Self-supervised learning of acoustic-to-articulatory mapping by vocal imitation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Marc-Antoine Georges
Julien Diard
Laurent Girin
J. Schwartz
Thomas Hueber
85
7
0
05 Apr 2022
Autoregressive Co-Training for Learning Discrete Speech Representations
Interspeech (Interspeech), 2022
Sung-Lin Yeh
Hao Tang
SSL
163
6
0
29 Mar 2022
Investigating Self-supervised Pretraining Frameworks for Pathological Speech Recognition
Interspeech (Interspeech), 2022
Lester Phillip Violeta
Wen-Chin Huang
Tomoki Toda
209
44
0
29 Mar 2022
Federated Self-Supervised Learning for Acoustic Event Classification
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Meng Feng
Chieh-Chi Kao
Qingming Tang
Ming Sun
Viktor Rozgic
Spyros Matsoukas
Chao Wang
146
14
0
22 Mar 2022
Semi-FedSER: Semi-supervised Learning for Speech Emotion Recognition On Federated Learning using Multiview Pseudo-Labeling
Interspeech (Interspeech), 2022
Tiantian Feng
Shrikanth Narayanan
121
23
0
15 Mar 2022
SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Hsiang-Sheng Tsai
Heng-Jui Chang
Wen-Chin Huang
Zili Huang
Kushal Lakhotia
...
Hsuan-Jui Chen
Shang-Wen Li
Shinji Watanabe
Abdel-rahman Mohamed
Hung-yi Lee
243
122
0
14 Mar 2022
Language Adaptive Cross-lingual Speech Representation Learning with Sparse Sharing Sub-networks
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Yizhou Lu
Mingkun Huang
Xinghua Qu
Pengfei Wei
Zejun Ma
182
21
0
09 Mar 2022
Audio Self-supervised Learning: A Survey
Patterns (Patterns), 2022
Shuo Liu
Adria Mallol-Ragolta
Emilia Parada-Cabaleiro
Kun Qian
Xingshuo Jing
Alexander Kathan
Bin Hu
Bjoern W. Schuller
SSL
210
127
0
02 Mar 2022
Measuring the Impact of Individual Domain Factors in Self-Supervised Pre-Training
Ramon Sanabria
Wei-Ning Hsu
Alexei Baevski
Michael Auli
195
8
0
01 Mar 2022
A Brief Overview of Unsupervised Neural Speech Representation Learning
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
Lars Maaløe
Christian Igel
BDL
AI4TS
SSL
207
13
0
01 Mar 2022
Word Segmentation on Discovered Phone Units with Dynamic Programming and Self-Supervised Scoring
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Herman Kamper
231
31
0
24 Feb 2022
Assessing the State of Self-Supervised Human Activity Recognition using Wearables
Proceedings of the ACM on Interactive Mobile Wearable and Ubiquitous Technologies (IMWUT), 2022
H. Haresamudram
Irfan Essa
Thomas Plötz
SSL
296
112
0
22 Feb 2022
Improving Automatic Speech Recognition for Non-Native English with Transfer Learning and Language Model Decoding
Peter Sullivan
Toshiko Shibano
Muhammad Abdul-Mageed
143
11
0
10 Feb 2022
data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language
International Conference on Machine Learning (ICML), 2022
Alexei Baevski
Wei-Ning Hsu
Qiantong Xu
Arun Babu
Jiatao Gu
Michael Auli
SSL
VLM
ViT
424
1,017
0
07 Feb 2022
Self-Supervised Representation Learning for Speech Using Visual Grounding and Masked Language Modeling
Puyuan Peng
David Harwath
SSL
184
28
0
07 Feb 2022
Efficient Adapter Transfer of Self-Supervised Speech Models for Automatic Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Bethan Thomas
Samuel Kessler
S. Karout
131
82
0
07 Feb 2022
Speaker Normalization for Self-supervised Speech Emotion Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Itai Gat
Hagai Aronowitz
Weizhong Zhu
E. Morais
R. Hoory
198
60
0
02 Feb 2022
Supervised and Self-supervised Pretraining Based COVID-19 Detection Using Acoustic Breathing/Cough/Speech Signals
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Xing-Yu Chen
Qiu-shi Zhu
Jie Zhang
Lirong Dai
163
16
0
22 Jan 2022
A Noise-Robust Self-supervised Pre-training Model Based Speech Representation Learning for Automatic Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Qiu-shi Zhu
Jie Zhang
Zi-qiang Zhang
Ming Wu
Xin Fang
Lirong Dai
277
51
0
22 Jan 2022
Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction
International Conference on Learning Representations (ICLR), 2022
Bowen Shi
Wei-Ning Hsu
Kushal Lakhotia
Abdel-rahman Mohamed
SSL
292
406
0
05 Jan 2022
Discrete and continuous representations and processing in deep learning: Looking forward
AI Open (AO), 2022
Ruben Cartuyvels
Graham Spinks
Marie-Francine Moens
OCL
277
28
0
04 Jan 2022
Attribute Inference Attack of Speech Emotion Recognition in Federated Learning Settings
Tiantian Feng
H. Hashemi
Rajat Hebbar
M. Annavaram
Shrikanth S. Narayanan
295
29
0
26 Dec 2021
Self-Supervised Learning for speech recognition with Intermediate layer supervision
Chengyi Wang
Yu-Huan Wu
Sanyuan Chen
Shujie Liu
Jinyu Li
Yao Qian
Zhenglu Yang
SSL
161
33
0
16 Dec 2021
SLUE: New Benchmark Tasks for Spoken Language Understanding Evaluation on Natural Speech
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Suwon Shon
Ankita Pasad
Felix Wu
Pablo Brusco
Yoav Artzi
Karen Livescu
Kyu Jeong Han
AuLLM
ELM
219
90
0
19 Nov 2021
Membership Inference Attacks Against Self-supervised Speech Models
Interspeech (Interspeech), 2021
Wei-Cheng Tseng
Wei-Tsung Kao
Hung-yi Lee
319
17
0
09 Nov 2021
A Fine-tuned Wav2vec 2.0/HuBERT Benchmark For Speech Emotion Recognition, Speaker Verification and Spoken Language Understanding
Yingzhi Wang
Abdelmoumene Boumadane
A. Heba
223
181
0
04 Nov 2021
Recent Advances in End-to-End Automatic Speech Recognition
APSIPA Transactions on Signal and Information Processing (TASIP), 2021
Jinyu Li
VLM
350
422
0
02 Nov 2021
Combining Unsupervised and Text Augmented Semi-Supervised Learning for Low Resourced Autoregressive Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Chak-Fai Li
Francis Keith
William Hartmann
M. Snover
SSL
113
2
0
29 Oct 2021
Improving Noise Robustness of Contrastive Speech Representation Learning with Speech Reconstruction
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Heming Wang
Yao Qian
Xiaofei Wang
Yiming Wang
Chengyi Wang
Shujie Liu
Takuya Yoshioka
Jinyu Li
DeLiang Wang
203
33
0
28 Oct 2021
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing
Sanyuan Chen
Chengyi Wang
Zhengyang Chen
Yu-Huan Wu
Shujie Liu
...
Yao Qian
Jian Wu
Michael Zeng
Xiangzhan Yu
Furu Wei
SSL
735
2,571
0
26 Oct 2021
SSAST: Self-Supervised Audio Spectrogram Transformer
Yuan Gong
Cheng-I Jeff Lai
Yu-An Chung
James R. Glass
ViT
275
350
0
19 Oct 2021