An Unsupervised Autoregressive Model for Speech Representation Learning

5 April 2019
Yu-An Chung
Wei-Ning Hsu
Hao Tang
James R. Glass
    SSL
arXiv:1904.03240

Papers citing "An Unsupervised Autoregressive Model for Speech Representation Learning"

50 / 269 papers shown
Boosting Cross-Domain Speech Recognition with Self-Supervision
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Hanjing Zhu
Gaofeng Cheng
Yongfeng Zhang
Wenxin Hou
Pengyuan Zhang
Yonghong Yan
262
21
0
20 Jun 2022
DRAFT: A Novel Framework to Reduce Domain Shifting in Self-supervised Learning and Its Application to Children's ASR
Interspeech (Interspeech), 2022
Ruchao Fan
Abeer Alwan
169
37
0
16 Jun 2022
A Comprehensive Survey on Deep Clustering: Taxonomy, Challenges, and Future Directions
ACM Computing Surveys (ACM CSUR), 2022
Sheng Zhou
Hongjia Xu
Zhuonan Zheng
Jiawei Chen
Zhao Li
Jiajun Bu
Jia Wu
Xin Eric Wang
Wenwu Zhu
Martin Ester
208
148
0
15 Jun 2022
Investigation of Ensemble features of Self-Supervised Pretrained Models for Automatic Speech Recognition
Interspeech (Interspeech), 2022
Anjana Arunkumar
Vrunda N. Sukhadia
S. Umesh
146
17
0
11 Jun 2022
Speak Like a Dog: Human to Non-human creature Voice Conversion
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2022
Kohei Suzuki
Shoki Sakamoto
T. Taniguchi
Hirokazu Kameoka
107
4
0
09 Jun 2022
Joint Encoder-Decoder Self-Supervised Pre-training for ASR
Interspeech (Interspeech), 2022
Arunkumar A
S. Umesh
SSL
112
9
0
09 Jun 2022
Speech Augmentation Based Unsupervised Learning for Keyword Spotting
IEEE International Joint Conference on Neural Network (IJCNN), 2022
Jian Luo
Jianzong Wang
Ning Cheng
Haobin Tang
Jing Xiao
SSL
140
2
0
28 May 2022
Self-supervised models of audio effectively explain human cortical responses to speech
International Conference on Machine Learning (ICML), 2022
Aditya R. Vaidya
Shailee Jain
Alexander G. Huth
177
68
0
27 May 2022
Contrastive Siamese Network for Semi-supervised Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
S. Khorram
Jaeyoung Kim
Anshuman Tripathi
Han Lu
Qian Zhang
Hasim Sak
SSL
180
16
0
27 May 2022
Joint Training of Speech Enhancement and Self-supervised Model for Noise-robust ASR
Qiu-shi Zhu
Jie Zhang
Zitian Zhang
Lirong Dai
170
18
0
26 May 2022
Self-Supervised Speech Representation Learning: A Review
IEEE Journal on Selected Topics in Signal Processing (IEEE JSTSP), 2022
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL AI4TS
578
435
0
21 May 2022
Deploying self-supervised learning in the wild for hybrid automatic speech recognition
Mostafa Karimi
Changliang Liu
K. Kumatani
Yao Qian
Tianyu Wu
Jian Wu
121
3
0
17 May 2022
Silence is Sweeter Than Speech: Self-Supervised Model Using Silence to Store Speaker Information
Chiyu Feng
Po-Chun Hsu
Hung-yi Lee
SSL
142
9
0
08 May 2022
Sound Localization by Self-Supervised Time Delay Estimation
European Conference on Computer Vision (ECCV), 2022
Ziyang Chen
David Fouhey
Andrew Owens
SSL
198
23
0
26 Apr 2022
On-demand compute reduction with stochastic wav2vec 2.0
Interspeech (Interspeech), 2022
Apoorv Vyas
Wei-Ning Hsu
Michael Auli
Alexei Baevski
150
13
0
25 Apr 2022
Cross-stitched Multi-modal Encoders
Karan Singla
Daniel Pressel
Ryan Price
Bhargav Srinivas Chinnari
Yeon-Jun Kim
S. Bangalore
135
0
0
20 Apr 2022
ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers
International Conference on Machine Learning (ICML), 2022
Kaizhi Qian
Yang Zhang
Heting Gao
Junrui Ni
Cheng-I Jeff Lai
David D. Cox
M. Hasegawa-Johnson
Shiyu Chang
DRL
161
140
0
20 Apr 2022
DDOS: A MOS Prediction Framework utilizing Domain Adaptive Pre-training and Distribution of Opinion Scores
Interspeech (Interspeech), 2022
Wei-Cheng Tseng
Wei-Tsung Kao
Hung-yi Lee
277
24
0
07 Apr 2022
User-Level Differential Privacy against Attribute Inference Attack of Speech Emotion Recognition in Federated Learning
Interspeech (Interspeech), 2022
Tiantian Feng
Raghuveer Peri
Shrikanth Narayanan
FedML
174
36
0
05 Apr 2022
Repeat after me: Self-supervised learning of acoustic-to-articulatory mapping by vocal imitation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Marc-Antoine Georges
Julien Diard
Laurent Girin
J. Schwartz
Thomas Hueber
85
7
0
05 Apr 2022
Autoregressive Co-Training for Learning Discrete Speech Representations
Interspeech (Interspeech), 2022
Sung-Lin Yeh
Hao Tang
SSL
163
6
0
29 Mar 2022
Investigating Self-supervised Pretraining Frameworks for Pathological Speech Recognition
Interspeech (Interspeech), 2022
Lester Phillip Violeta
Wen-Chin Huang
Tomoki Toda
209
44
0
29 Mar 2022
Federated Self-Supervised Learning for Acoustic Event Classification
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Meng Feng
Chieh-Chi Kao
Qingming Tang
Ming Sun
Viktor Rozgic
Spyros Matsoukas
Chao Wang
146
14
0
22 Mar 2022
Semi-FedSER: Semi-supervised Learning for Speech Emotion Recognition On Federated Learning using Multiview Pseudo-Labeling
Interspeech (Interspeech), 2022
Tiantian Feng
Shrikanth Narayanan
121
23
0
15 Mar 2022
SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Hsiang-Sheng Tsai
Heng-Jui Chang
Wen-Chin Huang
Zili Huang
Kushal Lakhotia
...
Hsuan-Jui Chen
Shang-Wen Li
Shinji Watanabe
Abdel-rahman Mohamed
Hung-yi Lee
243
122
0
14 Mar 2022
Language Adaptive Cross-lingual Speech Representation Learning with Sparse Sharing Sub-networks
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Yizhou Lu
Mingkun Huang
Xinghua Qu
Pengfei Wei
Zejun Ma
182
21
0
09 Mar 2022
Audio Self-supervised Learning: A Survey
Patterns (Patterns), 2022
Shuo Liu
Adria Mallol-Ragolta
Emilia Parada-Cabeleiro
Kun Qian
Xingshuo Jing
Alexander Kathan
Bin Hu
Bjoern W. Schuller
SSL
210
127
0
02 Mar 2022
Measuring the Impact of Individual Domain Factors in Self-Supervised Pre-Training
Ramon Sanabria
Wei-Ning Hsu
Alexei Baevski
Michael Auli
195
8
0
01 Mar 2022
A Brief Overview of Unsupervised Neural Speech Representation Learning
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
Lars Maaløe
Christian Igel
BDL AI4TS SSL
207
13
0
01 Mar 2022
Word Segmentation on Discovered Phone Units with Dynamic Programming and Self-Supervised Scoring
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Herman Kamper
231
31
0
24 Feb 2022
Assessing the State of Self-Supervised Human Activity Recognition using Wearables
Proceedings of the ACM on Interactive Mobile Wearable and Ubiquitous Technologies (IMWUT), 2022
H. Haresamudram
Irfan Essa
Thomas Plötz
SSL
296
112
0
22 Feb 2022
Improving Automatic Speech Recognition for Non-Native English with Transfer Learning and Language Model Decoding
Peter Sullivan
Toshiko Shibano
Muhammad Abdul-Mageed
143
11
0
10 Feb 2022
data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language
International Conference on Machine Learning (ICML), 2022
Alexei Baevski
Wei-Ning Hsu
Qiantong Xu
Arun Babu
Jiatao Gu
Michael Auli
SSL VLM ViT
424
1,017
0
07 Feb 2022
Self-Supervised Representation Learning for Speech Using Visual Grounding and Masked Language Modeling
Puyuan Peng
David Harwath
SSL
184
28
0
07 Feb 2022
Efficient Adapter Transfer of Self-Supervised Speech Models for Automatic Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Bethan Thomas
Samuel Kessler
S. Karout
131
82
0
07 Feb 2022
Speaker Normalization for Self-supervised Speech Emotion Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Itai Gat
Hagai Aronowitz
Weizhong Zhu
E. Morais
R. Hoory
198
60
0
02 Feb 2022
Supervised and Self-supervised Pretraining Based COVID-19 Detection Using Acoustic Breathing/Cough/Speech Signals
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Xing-Yu Chen
Qiu-shi Zhu
Jie Zhang
Lirong Dai
163
16
0
22 Jan 2022
A Noise-Robust Self-supervised Pre-training Model Based Speech Representation Learning for Automatic Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Qiu-shi Zhu
Jie Zhang
Zi-qiang Zhang
Ming Wu
Xin Fang
Lirong Dai
277
51
0
22 Jan 2022
Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction
International Conference on Learning Representations (ICLR), 2022
Bowen Shi
Wei-Ning Hsu
Kushal Lakhotia
Abdel-rahman Mohamed
SSL
292
406
0
05 Jan 2022
Discrete and continuous representations and processing in deep learning: Looking forward
AI Open (AO), 2022
Ruben Cartuyvels
Graham Spinks
Marie-Francine Moens
OCL
277
28
0
04 Jan 2022
Attribute Inference Attack of Speech Emotion Recognition in Federated Learning Settings
Tiantian Feng
H. Hashemi
Rajat Hebbar
M. Annavaram
Shrikanth S. Narayanan
295
29
0
26 Dec 2021
Self-Supervised Learning for speech recognition with Intermediate layer supervision
Chengyi Wang
Yu-Huan Wu
Sanyuan Chen
Shujie Liu
Jinyu Li
Yao Qian
Zhenglu Yang
SSL
161
33
0
16 Dec 2021
SLUE: New Benchmark Tasks for Spoken Language Understanding Evaluation on Natural Speech
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Suwon Shon
Ankita Pasad
Felix Wu
Pablo Brusco
Yoav Artzi
Karen Livescu
Kyu Jeong Han
AuLLM ELM
219
90
0
19 Nov 2021
Membership Inference Attacks Against Self-supervised Speech Models
Interspeech (Interspeech), 2021
Wei-Cheng Tseng
Wei-Tsung Kao
Hung-yi Lee
319
17
0
09 Nov 2021
A Fine-tuned Wav2vec 2.0/HuBERT Benchmark For Speech Emotion Recognition, Speaker Verification and Spoken Language Understanding
Yingzhi Wang
Abdelmoumene Boumadane
A. Heba
223
181
0
04 Nov 2021
Recent Advances in End-to-End Automatic Speech Recognition
APSIPA Transactions on Signal and Information Processing (TASIP), 2021
Jinyu Li
VLM
350
422
0
02 Nov 2021
Combining Unsupervised and Text Augmented Semi-Supervised Learning for Low Resourced Autoregressive Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Chak-Fai Li
Francis Keith
William Hartmann
M. Snover
SSL
113
2
0
29 Oct 2021
Improving Noise Robustness of Contrastive Speech Representation Learning with Speech Reconstruction
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Heming Wang
Yao Qian
Xiaofei Wang
Yiming Wang
Chengyi Wang
Shujie Liu
Takuya Yoshioka
Jinyu Li
DeLiang Wang
203
33
0
28 Oct 2021
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing
Sanyuan Chen
Chengyi Wang
Zhengyang Chen
Yu-Huan Wu
Shujie Liu
...
Yao Qian
Jian Wu
Micheal Zeng
Xiangzhan Yu
Furu Wei
SSL
735
2,571
0
26 Oct 2021
SSAST: Self-Supervised Audio Spectrogram Transformer
Yuan Gong
Cheng-I Jeff Lai
Yu-An Chung
James R. Glass
ViT
275
350
0
19 Oct 2021