1904.03240
Cited By
v1
v2 (latest)
An Unsupervised Autoregressive Model for Speech Representation Learning
5 April 2019
Yu-An Chung
Wei-Ning Hsu
Hao Tang
James R. Glass
SSL
Papers citing
"An Unsupervised Autoregressive Model for Speech Representation Learning"
50 / 269 papers shown
Title
Weakly-Supervised Speech Pre-training: A Case Study on Target Speech Recognition
Interspeech (Interspeech), 2023
Wangyou Zhang
Y. Qian
187
12
0
25 May 2023
Can Self-Supervised Neural Representations Pre-Trained on Human Speech distinguish Animal Callers?
Interspeech (Interspeech), 2023
Eklavya Sarkar
Mathew Magimai.-Doss
184
17
0
23 May 2023
Self-supervised Predictive Coding Models Encode Speaker and Phonetic Information in Orthogonal Subspaces
Interspeech (Interspeech), 2023
Oli Danyi Liu
Hao Tang
Sharon Goldwater
SSL
164
14
0
21 May 2023
DualVC: Dual-mode Voice Conversion using Intra-model Knowledge Distillation and Hybrid Predictive Coding
Interspeech (Interspeech), 2023
Ziqian Ning
Yuepeng Jiang
Pengcheng Zhu
Jixun Yao
Shuai Wang
Linfu Xie
Mengxiao Bi
172
13
0
21 May 2023
TrustSER: On the Trustworthiness of Fine-tuning Pre-trained Speech Embeddings For Speech Emotion Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Tiantian Feng
Rajat Hebbar
Shrikanth Narayanan
147
9
0
18 May 2023
Speech Separation based on Contrastive Learning and Deep Modularization
Peter Ochieng
SSL
176
0
0
18 May 2023
DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning
Neural Information Processing Systems (NeurIPS), 2023
Alexander H. Liu
Heng-Jui Chang
Michael Auli
Wei-Ning Hsu
James R. Glass
410
33
0
17 May 2023
Cocktail HuBERT: Generalized Self-Supervised Pre-training for Mixture and Single-Source Speech
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Maryam Fazel-Zarandi
Wei-Ning Hsu
SSL
121
12
0
20 Mar 2023
IRGen: Generative Modeling for Image Retrieval
European Conference on Computer Vision (ECCV), 2023
Yidan Zhang
Ting Zhang
Dong Chen
Yujing Wang
Qi Chen
...
Tao Gui
Fan Yang
Mao Yang
Q. Liao
B. Guo
3DV
VLM
283
21
0
17 Mar 2023
Analysing the Masked predictive coding training criterion for pre-training a Speech Representation Model
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Hemant Yadav
Sunayana Sitaram
R. Shah
SSL
187
4
0
13 Mar 2023
Accommodating Audio Modality in CLIP for Multimodal Processing
AAAI Conference on Artificial Intelligence (AAAI), 2023
Ludan Ruan
Anwen Hu
Yuqing Song
Liang Zhang
S. Zheng
Qin Jin
VLM
164
16
0
12 Mar 2023
Improving Self-Supervised Learning for Audio Representations by Feature Diversity and Decorrelation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Bac Nguyen
Stefan Uhlich
Fabien Cardinaux
SSL
198
4
0
07 Mar 2023
Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages
Yu Zhang
Wei Han
James Qin
Yongqiang Wang
Ankur Bapna
...
Pedro J. Moreno
Chung-Cheng Chiu
J. Schalkwyk
Françoise Beaufays
Yonghui Wu
VLM
318
342
0
02 Mar 2023
deHuBERT: Disentangling Noise in a Self-supervised Model for Robust Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Dianwen Ng
Ruixi Zhang
J. Yip
Zhao Yang
Jinjie Ni
Chong Zhang
Yukun Ma
Chongjia Ni
Eng Siong Chng
B. Ma
220
18
0
28 Feb 2023
A low latency attention module for streaming self-supervised speech representation learning
Jianbo Ma
Siqi Pan
Deepak Chandran
A. Fanelli
Richard Cartwright
152
0
0
27 Feb 2023
BayesSpeech: A Bayesian Transformer Network for Automatic Speech Recognition
Will Rieger
BDL
UQCV
112
0
0
16 Jan 2023
Context-aware Fine-tuning of Self-supervised Speech Models
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Suwon Shon
Felix Wu
Kwangyoun Kim
Prashant Sridhar
Karen Livescu
Shinji Watanabe
130
10
0
16 Dec 2022
Deep neural network techniques for monaural speech enhancement: state of the art analysis
Artificial Intelligence Review (Artif Intell Rev), 2022
P. Ochieng
217
33
0
01 Dec 2022
CHAPTER: Exploiting Convolutional Neural Network Adapters for Self-supervised Speech Models
Zih-Ching Chen
Yu-Shun Sung
Hung-yi Lee
173
20
0
01 Dec 2022
VATLM: Visual-Audio-Text Pre-Training with Unified Masked Prediction for Speech Representation Learning
IEEE transactions on multimedia (IEEE TMM), 2022
Qiu-shi Zhu
Long Zhou
Zi-Hua Zhang
Shujie Liu
Binxing Jiao
Jie Zhang
Lirong Dai
Daxin Jiang
Jinyu Li
Furu Wei
211
48
0
21 Nov 2022
MelHuBERT: A simplified HuBERT on Mel spectrograms
Automatic Speech Recognition & Understanding (ASRU), 2022
Tzu-Quan Lin
Hung-yi Lee
Hao Tang
SSL
196
18
0
17 Nov 2022
Is Smaller Always Faster? Tradeoffs in Compressing Self-Supervised Speech Transformers
Tzu-Quan Lin
Tsung-Huan Yang
Chun-Yao Chang
Kuang-Ming Chen
Tzu-hsun Feng
Hung-yi Lee
Hao Tang
234
6
0
17 Nov 2022
Introducing Semantics into Speech Encoders
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Derek Xu
Shuyan Dong
Changhan Wang
Suyoun Kim
Mohammad Kachuee
...
Alexei Baevski
Guan-Ting Lin
Hung-yi Lee
Luke Huan
Wei Wang
SSL
148
3
0
15 Nov 2022
Improving Children's Speech Recognition by Fine-tuning Self-supervised Adult Speech Representations
Renée Lu
M. Shahin
Beena Ahmed
157
7
0
14 Nov 2022
MT4SSL: Boosting Self-Supervised Speech Representation Learning by Integrating Multiple Targets
Interspeech (Interspeech), 2022
Ziyang Ma
Zhisheng Zheng
Changli Tang
Yujin Wang
Xie Chen
294
21
0
14 Nov 2022
Investigating Enhancements to Contrastive Predictive Coding for Human Activity Recognition
Annual IEEE International Conference on Pervasive Computing and Communications (PerCom), 2022
H. Haresamudram
Irfan Essa
Thomas Ploetz
AI4TS
270
20
0
11 Nov 2022
Self-supervised learning with bi-label masked speech prediction for streaming multi-talker speech recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Zili Huang
Zhuo Chen
Naoyuki Kanda
Jian Wu
Yiming Wang
Jinyu Li
Takuya Yoshioka
Xiaofei Wang
Peidong Wang
165
4
0
10 Nov 2022
Biased Self-supervised learning for ASR
Interspeech (Interspeech), 2022
Florian Kreyssig
Yangyang Shi
Jinxi Guo
Leda Sari
Abdel-rahman Mohamed
P. Woodland
SSL
143
4
0
04 Nov 2022
Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Speech Processing
Neural Information Processing Systems (NeurIPS), 2022
Yonggan Fu
Yang Zhang
Kaizhi Qian
Zhifan Ye
Zhongzhi Yu
Cheng-I Jeff Lai
Yingyan Lin
316
10
0
02 Nov 2022
Speech-text based multi-modal training with bidirectional attention for improved speech recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Yuhang Yang
Haihua Xu
Hao-Ming Huang
Eng Siong Chng
Sheng Li
160
7
0
01 Nov 2022
Improved acoustic-to-articulatory inversion using representations from pretrained self-supervised learning models
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Sathvik Udupa
Siddarth C
P. Ghosh
155
10
0
30 Oct 2022
Learning Dependencies of Discrete Speech Representations with Neural Hidden Markov Models
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Sung-Lin Yeh
Hao Tang
SSL
BDL
112
1
0
29 Oct 2022
Application of Knowledge Distillation to Multi-task Speech Representation Learning
Interspeech (Interspeech), 2022
Mine Kerpicci
V. Nguyen
Shuhua Zhang
Erik M. Visser
156
0
0
29 Oct 2022
FedAudio: A Federated Learning Benchmark for Audio Tasks
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Tuo Zhang
Tiantian Feng
Samiul Alam
Sunwoo Lee
Mi Zhang
Shrikanth S. Narayanan
Salman Avestimehr
FedML
227
30
0
27 Oct 2022
Improving Speech Representation Learning via Speech-level and Phoneme-level Masking Approach
International Conference on Mobile Ad-hoc and Sensor Networks (MSN), 2022
Xulong Zhang
Jianzong Wang
Ning Cheng
Kexin Zhu
Jing Xiao
118
1
0
25 Oct 2022
Guided contrastive self-supervised pre-training for automatic speech recognition
Spoken Language Technology Workshop (SLT), 2022
Aparna Khare
Minhua Wu
Saurabhchand Bhati
J. Droppo
Roland Maas
SSL
148
0
0
22 Oct 2022
SUPERB @ SLT 2022: Challenge on Generalization and Efficiency of Self-Supervised Speech Representation Learning
Spoken Language Technology Workshop (SLT), 2022
Tzu-hsun Feng
Annie Dong
Ching-Feng Yeh
Shu-Wen Yang
Tzu-Quan Lin
...
Xuankai Chang
Shinji Watanabe
Abdel-rahman Mohamed
Shang-Wen Li
Hung-yi Lee
ELM
SSL
196
38
0
16 Oct 2022
CTCBERT: Advancing Hidden-unit BERT with CTC Objectives
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Ruchao Fan
Yiming Wang
Yashesh Gaur
Jinyu Li
244
8
0
16 Oct 2022
On the Utility of Self-supervised Models for Prosody-related Tasks
Spoken Language Technology Workshop (SLT), 2022
Guan-Ting Lin
Chiyu Feng
Wei-Ping Huang
Yuan Tseng
Tzu-Han Lin
Chen-An Li
Hung-yi Lee
Nigel G. Ward
125
59
0
13 Oct 2022
Exploration of A Self-Supervised Speech Model: A Study on Emotional Corpora
Spoken Language Technology Workshop (SLT), 2022
Yuanchao Li
Yumnah Mohamied
P. Bell
Catherine Lai
SSL
290
51
0
05 Oct 2022
SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model
Spoken Language Technology Workshop (SLT), 2022
Yi-Jen Shih
Hsuan-Fu Wang
Heng-Jui Chang
Layne Berry
Hung-yi Lee
David Harwath
VLM
CLIP
335
39
0
03 Oct 2022
Augmentation Invariant Discrete Representation for Generative Spoken Language Modeling
International Workshop on Spoken Language Translation (IWSLT), 2022
Itai Gat
Felix Kreuk
Tu Nguyen
Ann Lee
Jade Copet
Gabriel Synnaeve
Emmanuel Dupoux
Yossi Adi
179
14
0
30 Sep 2022
SpeechLM: Enhanced Speech Pre-Training with Unpaired Textual Data
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Zi-Hua Zhang
Sanyuan Chen
Long Zhou
Yu Wu
Shuo Ren
...
Zhuoyuan Yao
Xun Gong
Lirong Dai
Jinyu Li
Furu Wei
241
64
0
30 Sep 2022
The Efficacy of Self-Supervised Speech Models for Audio Representations
Tung-Yu Wu
Chen-An Li
Tzu-Han Lin
Tsung-Yuan Hsu
Hung-yi Lee
178
6
0
26 Sep 2022
Non-Contrastive Self-supervised Learning for Utterance-Level Information Extraction from Speech
IEEE Journal on Selected Topics in Signal Processing (IEEE JSTSP), 2022
Jaejin Cho
Jesús Villalba
Laureano Moro-Velazquez
Najim Dehak
SSL
164
22
0
10 Aug 2022
Transfer Learning of wav2vec 2.0 for Automatic Lyric Transcription
International Society for Music Information Retrieval Conference (ISMIR), 2022
Longshen Ou
Xiangming Gu
Ye Wang
157
24
0
20 Jul 2022
Deep versus Wide: An Analysis of Student Architectures for Task-Agnostic Knowledge Distillation of Self-Supervised Speech Models
Interspeech (Interspeech), 2022
Takanori Ashihara
Takafumi Moriya
Kohei Matsuura
Tomohiro Tanaka
143
34
0
14 Jul 2022
FeaRLESS: Feature Refinement Loss for Ensembling Self-Supervised Learning Features in Robust End-to-end Speech Recognition
Interspeech (Interspeech), 2022
Szu-Jui Chen
Jiamin Xie
John H. L. Hansen
170
9
0
30 Jun 2022
Wav2Vec-Aug: Improved self-supervised training with limited data
Interspeech (Interspeech), 2022
Anuroop Sriram
Michael Auli
Alexei Baevski
SSL
VLM
153
16
0
27 Jun 2022
Predicting within and across language phoneme recognition performance of self-supervised learning speech pre-trained models
Han Ji
T. Patel
O. Scharenborg
200
9
0
24 Jun 2022