An Unsupervised Autoregressive Model for Speech Representation Learning

5 April 2019

Hao Tang

Papers citing "An Unsupervised Autoregressive Model for Speech Representation Learning"

50 / 269 papers shown

Title
Boosting Cross-Domain Speech Recognition with Self-Supervision. IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022 Hanjing Zhu Gaofeng Cheng Yongfeng Zhang Wenxin Hou Pengyuan Zhang Yonghong Yan 262 21 0 20 Jun 2022
DRAFT: A Novel Framework to Reduce Domain Shifting in Self-supervised Learning and Its Application to Children's ASR. Interspeech (Interspeech), 2022 Ruchao Fan Abeer Alwan 169 37 0 16 Jun 2022
A Comprehensive Survey on Deep Clustering: Taxonomy, Challenges, and Future Directions. ACM Computing Surveys (ACM CSUR), 2022 Sheng Zhou Hongjia Xu Zhuonan Zheng Jiawei Chen Zhao Li Jiajun Bu Jia Wu Xin Eric Wang Wenwu Zhu Martin Ester 208 148 0 15 Jun 2022
Investigation of Ensemble features of Self-Supervised Pretrained Models for Automatic Speech Recognition. Interspeech (Interspeech), 2022 Anjana Arunkumar Vrunda N. Sukhadia S. Umesh 146 17 0 11 Jun 2022
Speak Like a Dog: Human to Non-human creature Voice Conversion. Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2022 Kohei Suzuki Shoki Sakamoto T. Taniguchi Hirokazu Kameoka 107 4 0 09 Jun 2022
Joint Encoder-Decoder Self-Supervised Pre-training for ASR. Interspeech (Interspeech), 2022 Arunkumar A S. Umesh SSL 112 9 0 09 Jun 2022
Speech Augmentation Based Unsupervised Learning for Keyword Spotting. IEEE International Joint Conference on Neural Network (IJCNN), 2022 Jian Luo Jianzong Wang Ning Cheng Haobin Tang Jing Xiao SSL 140 2 0 28 May 2022
Self-supervised models of audio effectively explain human cortical responses to speech. International Conference on Machine Learning (ICML), 2022 Aditya R. Vaidya Shailee Jain Alexander G. Huth 177 68 0 27 May 2022
Contrastive Siamese Network for Semi-supervised Speech Recognition. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022 S. Khorram Jaeyoung Kim Anshuman Tripathi Han Lu Qian Zhang Hasim Sak SSL 180 16 0 27 May 2022
Joint Training of Speech Enhancement and Self-supervised Model for Noise-robust ASR Qiu-shi Zhu Jie Zhang Zitian Zhang Lirong Dai 170 18 0 26 May 2022
Self-Supervised Speech Representation Learning: A Review. IEEE Journal on Selected Topics in Signal Processing (IEEE JSTSP), 2022 Abdel-rahman Mohamed Hung-yi Lee Lasse Borgholt Jakob Drachmann Havtorn Joakim Edin ... Shang-Wen Li Karen Livescu Lars Maaløe Tara N. Sainath Shinji Watanabe SSL AI4TS 578 435 0 21 May 2022
Deploying self-supervised learning in the wild for hybrid automatic speech recognition Mostafa Karimi Changliang Liu K. Kumatani Yao Qian Tianyu Wu Jian Wu 121 3 0 17 May 2022
Silence is Sweeter Than Speech: Self-Supervised Model Using Silence to Store Speaker Information Chiyu Feng Po-Chun Hsu Hung-yi Lee SSL 142 9 0 08 May 2022
Sound Localization by Self-Supervised Time Delay Estimation. European Conference on Computer Vision (ECCV), 2022 Ziyang Chen David Fouhey Andrew Owens SSL 198 23 0 26 Apr 2022
On-demand compute reduction with stochastic wav2vec 2.0. Interspeech (Interspeech), 2022 Apoorv Vyas Wei-Ning Hsu Michael Auli Alexei Baevski 150 13 0 25 Apr 2022
Cross-stitched Multi-modal Encoders Karan Singla Daniel Pressel Ryan Price Bhargav Srinivas Chinnari Yeon-Jun Kim S. Bangalore 135 0 0 20 Apr 2022
ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers. International Conference on Machine Learning (ICML), 2022 Kaizhi Qian Yang Zhang Heting Gao Junrui Ni Cheng-I Jeff Lai David D. Cox M. Hasegawa-Johnson Shiyu Chang DRL 161 140 0 20 Apr 2022
DDOS: A MOS Prediction Framework utilizing Domain Adaptive Pre-training and Distribution of Opinion Scores. Interspeech (Interspeech), 2022 Wei-Cheng Tseng Wei-Tsung Kao Hung-yi Lee 277 24 0 07 Apr 2022
User-Level Differential Privacy against Attribute Inference Attack of Speech Emotion Recognition in Federated Learning. Interspeech (Interspeech), 2022 Tiantian Feng Raghuveer Peri Shrikanth Narayanan FedML 174 36 0 05 Apr 2022
Repeat after me: Self-supervised learning of acoustic-to-articulatory mapping by vocal imitation. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022 Marc-Antoine Georges Julien Diard Laurent Girin J. Schwartz Thomas Hueber 85 7 0 05 Apr 2022
Autoregressive Co-Training for Learning Discrete Speech Representations. Interspeech (Interspeech), 2022 Sung-Lin Yeh Hao Tang SSL 163 6 0 29 Mar 2022
Investigating Self-supervised Pretraining Frameworks for Pathological Speech Recognition. Interspeech (Interspeech), 2022 Lester Phillip Violeta Wen-Chin Huang Tomoki Toda 209 44 0 29 Mar 2022
Federated Self-Supervised Learning for Acoustic Event Classification. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022 Meng Feng Chieh-Chi Kao Qingming Tang Ming Sun Viktor Rozgic Spyros Matsoukas Chao Wang 146 14 0 22 Mar 2022
Semi-FedSER: Semi-supervised Learning for Speech Emotion Recognition On Federated Learning using Multiview Pseudo-Labeling. Interspeech (Interspeech), 2022 Tiantian Feng Shrikanth Narayanan 121 23 0 15 Mar 2022
SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities. Annual Meeting of the Association for Computational Linguistics (ACL), 2022 Hsiang-Sheng Tsai Heng-Jui Chang Wen-Chin Huang Zili Huang Kushal Lakhotia ... Hsuan-Jui Chen Shang-Wen Li Shinji Watanabe Abdel-rahman Mohamed Hung-yi Lee 243 122 0 14 Mar 2022
Language Adaptive Cross-lingual Speech Representation Learning with Sparse Sharing Sub-networks. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022 Yizhou Lu Mingkun Huang Xinghua Qu Pengfei Wei Zejun Ma 182 21 0 09 Mar 2022
Audio Self-supervised Learning: A Survey. Patterns (Patterns), 2022 Shuo Liu Adria Mallol-Ragolta Emilia Parada-Cabeleiro Kun Qian Xingshuo Jing Alexander Kathan Bin Hu Bjoern W. Schuller SSL 210 127 0 02 Mar 2022
Measuring the Impact of Individual Domain Factors in Self-Supervised Pre-Training Ramon Sanabria Wei-Ning Hsu Alexei Baevski Michael Auli 195 8 0 01 Mar 2022
A Brief Overview of Unsupervised Neural Speech Representation Learning Lasse Borgholt Jakob Drachmann Havtorn Joakim Edin Lars Maaløe Christian Igel BDL AI4TS SSL 207 13 0 01 Mar 2022
Word Segmentation on Discovered Phone Units with Dynamic Programming and Self-Supervised Scoring. IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022 Herman Kamper 231 31 0 24 Feb 2022
Assessing the State of Self-Supervised Human Activity Recognition using Wearables. Proceedings of the ACM on Interactive Mobile Wearable and Ubiquitous Technologies (IMWUT), 2022 H. Haresamudram Irfan Essa Thomas Plötz SSL 296 112 0 22 Feb 2022
Improving Automatic Speech Recognition for Non-Native English with Transfer Learning and Language Model Decoding Peter Sullivan Toshiko Shibano Muhammad Abdul-Mageed 143 11 0 10 Feb 2022
data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language. International Conference on Machine Learning (ICML), 2022 Alexei Baevski Wei-Ning Hsu Qiantong Xu Arun Babu Jiatao Gu Michael Auli SSL VLM ViT 424 1,017 0 07 Feb 2022
Self-Supervised Representation Learning for Speech Using Visual Grounding and Masked Language Modeling Puyuan Peng David Harwath SSL 184 28 0 07 Feb 2022
Efficient Adapter Transfer of Self-Supervised Speech Models for Automatic Speech Recognition. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022 Bethan Thomas Samuel Kessler S. Karout 131 82 0 07 Feb 2022
Speaker Normalization for Self-supervised Speech Emotion Recognition. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022 Itai Gat Hagai Aronowitz Weizhong Zhu E. Morais R. Hoory 198 60 0 02 Feb 2022
Supervised and Self-supervised Pretraining Based COVID-19 Detection Using Acoustic Breathing/Cough/Speech Signals. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022 Xing-Yu Chen Qiu-shi Zhu Jie Zhang Lirong Dai 163 16 0 22 Jan 2022
A Noise-Robust Self-supervised Pre-training Model Based Speech Representation Learning for Automatic Speech Recognition. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022 Qiu-shi Zhu Jie Zhang Zi-qiang Zhang Ming Wu Xin Fang Lirong Dai 277 51 0 22 Jan 2022
Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction. International Conference on Learning Representations (ICLR), 2022 Bowen Shi Wei-Ning Hsu Kushal Lakhotia Abdel-rahman Mohamed SSL 292 406 0 05 Jan 2022
Discrete and continuous representations and processing in deep learning: Looking forward. AI Open (AO), 2022 Ruben Cartuyvels Graham Spinks Marie-Francine Moens OCL 277 28 0 04 Jan 2022
Attribute Inference Attack of Speech Emotion Recognition in Federated Learning Settings Tiantian Feng H. Hashemi Rajat Hebbar M. Annavaram Shrikanth S. Narayanan 295 29 0 26 Dec 2021
Self-Supervised Learning for speech recognition with Intermediate layer supervision Chengyi Wang Yu-Huan Wu Sanyuan Chen Shujie Liu Jinyu Li Yao Qian Zhenglu Yang SSL 161 33 0 16 Dec 2021
SLUE: New Benchmark Tasks for Spoken Language Understanding Evaluation on Natural Speech. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021 Suwon Shon Ankita Pasad Felix Wu Pablo Brusco Yoav Artzi Karen Livescu Kyu Jeong Han AuLLM ELM 219 90 0 19 Nov 2021
Membership Inference Attacks Against Self-supervised Speech Models. Interspeech (Interspeech), 2021 Wei-Cheng Tseng Wei-Tsung Kao Hung-yi Lee 319 17 0 09 Nov 2021
A Fine-tuned Wav2vec 2.0/HuBERT Benchmark For Speech Emotion Recognition, Speaker Verification and Spoken Language Understanding Yingzhi Wang Abdelmoumene Boumadane A. Heba 223 181 0 04 Nov 2021
Recent Advances in End-to-End Automatic Speech Recognition. APSIPA Transactions on Signal and Information Processing (TASIP), 2021 Jinyu Li VLM 350 422 0 02 Nov 2021
Combining Unsupervised and Text Augmented Semi-Supervised Learning for Low Resourced Autoregressive Speech Recognition. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021 Chak-Fai Li Francis Keith William Hartmann M. Snover SSL 113 2 0 29 Oct 2021
Improving Noise Robustness of Contrastive Speech Representation Learning with Speech Reconstruction. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021 Heming Wang Yao Qian Xiaofei Wang Yiming Wang Chengyi Wang Shujie Liu Takuya Yoshioka Jinyu Li DeLiang Wang 203 33 0 28 Oct 2021
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing Sanyuan Chen Chengyi Wang Zhengyang Chen Yu-Huan Wu Shujie Liu ... Yao Qian Jian Wu Michael Zeng Xiangzhan Yu Furu Wei SSL 735 2,571 0 26 Oct 2021
SSAST: Self-Supervised Audio Spectrogram Transformer Yuan Gong Cheng-I Jeff Lai Yu-An Chung James R. Glass ViT 275 350 0 19 Oct 2021