An Unsupervised Autoregressive Model for Speech Representation Learning


5 April 2019
Yu-An Chung
Wei-Ning Hsu
Hao Tang
James R. Glass
    SSL
arXiv: 1904.03240

Papers citing "An Unsupervised Autoregressive Model for Speech Representation Learning"

50 / 102 papers shown
Title
How do Multimodal Foundation Models Encode Text and Speech? An Analysis of Cross-Lingual and Cross-Modal Representations
Hyunji Lee
Danni Liu
Supriti Sinhamahapatra
Jan Niehues
106
0
0
21 Feb 2025
Efficient Training of Self-Supervised Speech Foundation Models on a Compute Budget
Andy T. Liu
Yi-Cheng Lin
Haibin Wu
Stefan Winkler
Hung-yi Lee
31
1
0
09 Sep 2024
Speech Representation Learning Revisited: The Necessity of Separate Learnable Parameters and Robust Data Augmentation
Hemant Yadav
Sunayana Sitaram
R. Shah
SSL
47
0
0
20 Aug 2024
SSAMBA: Self-Supervised Audio Representation Learning with Mamba State Space Model
Siavash Shams
Sukru Samet Dindar
Xilin Jiang
N. Mesgarani
Mamba
64
18
0
20 May 2024
A Large-Scale Evaluation of Speech Foundation Models
Shu-Wen Yang
Heng-Jui Chang
Zili Huang
Andy T. Liu
Cheng-I Jeff Lai
...
Kushal Lakhotia
Shang-Wen Li
Abdelrahman Mohamed
Shinji Watanabe
Hung-yi Lee
38
19
0
15 Apr 2024
StreamVoice: Streamable Context-Aware Language Modeling for Real-time Zero-Shot Voice Conversion
Zhichao Wang
Yuan-Jui Chen
Xinsheng Wang
Lei Xie
Yuping Wang
22
6
0
19 Jan 2024
Self-Supervised Learning for Audio-Based Emotion Recognition
Peranut Nimitsurachat
Peter Washington
25
3
0
23 Jul 2023
On-Device Constrained Self-Supervised Speech Representation Learning for Keyword Spotting via Knowledge Distillation
Gene-Ping Yang
Yue Gu
Qingming Tang
Dongsu Du
Yuzong Liu
20
5
0
06 Jul 2023
Self-supervised Predictive Coding Models Encode Speaker and Phonetic Information in Orthogonal Subspaces
Oli Danyi Liu
Hao Tang
Sharon Goldwater
SSL
25
12
0
21 May 2023
DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning
Alexander H. Liu
Heng-Jui Chang
Michael Auli
Wei-Ning Hsu
James R. Glass
22
24
0
17 May 2023
Accommodating Audio Modality in CLIP for Multimodal Processing
Ludan Ruan
Anwen Hu
Yuqing Song
Liang Zhang
S. Zheng
Qin Jin
VLM
18
10
0
12 Mar 2023
Improving Self-Supervised Learning for Audio Representations by Feature Diversity and Decorrelation
Bac Nguyen
Stefan Uhlich
Fabien Cardinaux
SSL
42
3
0
07 Mar 2023
BayesSpeech: A Bayesian Transformer Network for Automatic Speech Recognition
Will Rieger
BDL
UQCV
13
0
0
16 Jan 2023
Context-aware Fine-tuning of Self-supervised Speech Models
Suwon Shon
Felix Wu
Kwangyoun Kim
Prashant Sridhar
Karen Livescu
Shinji Watanabe
25
7
0
16 Dec 2022
Deep neural network techniques for monaural speech enhancement: state of the art analysis
P. Ochieng
28
21
0
01 Dec 2022
CHAPTER: Exploiting Convolutional Neural Network Adapters for Self-supervised Speech Models
Zih-Ching Chen
Yu-Shun Sung
Hung-yi Lee
21
16
0
01 Dec 2022
Compressing Transformer-based self-supervised models for speech processing
Tzu-Quan Lin
Tsung-Huan Yang
Chun-Yao Chang
Kuang-Ming Chen
Tzu-hsun Feng
Hung-yi Lee
Hao Tang
34
6
0
17 Nov 2022
MelHuBERT: A simplified HuBERT on Mel spectrograms
Tzu-Quan Lin
Hung-yi Lee
Hao Tang
SSL
32
13
0
17 Nov 2022
Improving Children's Speech Recognition by Fine-tuning Self-supervised Adult Speech Representations
Renée Lu
M. Shahin
Beena Ahmed
27
4
0
14 Nov 2022
Investigating Enhancements to Contrastive Predictive Coding for Human Activity Recognition
H. Haresamudram
Irfan Essa
Thomas Ploetz
AI4TS
30
15
0
11 Nov 2022
Improved acoustic-to-articulatory inversion using representations from pretrained self-supervised learning models
Sathvik Udupa
Siddarth C
P. Ghosh
19
7
0
30 Oct 2022
Learning Dependencies of Discrete Speech Representations with Neural Hidden Markov Models
Sung-Lin Yeh
Hao Tang
SSL
BDL
22
1
0
29 Oct 2022
FedAudio: A Federated Learning Benchmark for Audio Tasks
Tuo Zhang
Tiantian Feng
Samiul Alam
Sunwoo Lee
Mi Zhang
Shrikanth S. Narayanan
Salman Avestimehr
FedML
25
23
0
27 Oct 2022
SUPERB @ SLT 2022: Challenge on Generalization and Efficiency of Self-Supervised Speech Representation Learning
Tzu-hsun Feng
Annie Dong
Ching-Feng Yeh
Shu-Wen Yang
Tzu-Quan Lin
...
Xuankai Chang
Shinji Watanabe
Abdel-rahman Mohamed
Shang-Wen Li
Hung-yi Lee
ELM
SSL
26
33
0
16 Oct 2022
CTCBERT: Advancing Hidden-unit BERT with CTC Objectives
Ruchao Fan
Yiming Wang
Yashesh Gaur
Jinyu Li
36
7
0
16 Oct 2022
On the Utility of Self-supervised Models for Prosody-related Tasks
Guan-Ting Lin
Chiyu Feng
Wei-Ping Huang
Yuan Tseng
Tzu-Han Lin
Chen An Li
Hung-yi Lee
Nigel G. Ward
21
47
0
13 Oct 2022
Transfer Learning of wav2vec 2.0 for Automatic Lyric Transcription
Longshen Ou
Xiangming Gu
Ye Wang
25
21
0
20 Jul 2022
FeaRLESS: Feature Refinement Loss for Ensembling Self-Supervised Learning Features in Robust End-to-end Speech Recognition
Szu-Jui Chen
Jiamin Xie
John H. L. Hansen
35
8
0
30 Jun 2022
Predicting within and across language phoneme recognition performance of self-supervised learning speech pre-trained models
Han Ji
T. Patel
O. Scharenborg
29
7
0
24 Jun 2022
Investigation of Ensemble features of Self-Supervised Pretrained Models for Automatic Speech Recognition
Anjana Arunkumar
Vrunda N. Sukhadia
S. Umesh
25
10
0
11 Jun 2022
Speak Like a Dog: Human to Non-human creature Voice Conversion
Kohei Suzuki
Shoki Sakamoto
T. Taniguchi
Hirokazu Kameoka
19
2
0
09 Jun 2022
Joint Encoder-Decoder Self-Supervised Pre-training for ASR
Arunkumar A
S. Umesh
SSL
34
8
0
09 Jun 2022
Self-supervised models of audio effectively explain human cortical responses to speech
Aditya R. Vaidya
Shailee Jain
Alexander G. Huth
25
42
0
27 May 2022
Contrastive Siamese Network for Semi-supervised Speech Recognition
S. Khorram
Jaeyoung Kim
Anshuman Tripathi
Han Lu
Qian Zhang
Hasim Sak
SSL
8
11
0
27 May 2022
Joint Training of Speech Enhancement and Self-supervised Model for Noise-robust ASR
Qiu-shi Zhu
Jie M. Zhang
Zitian Zhang
Lirong Dai
35
15
0
26 May 2022
Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
128
349
0
21 May 2022
Silence is Sweeter Than Speech: Self-Supervised Model Using Silence to Store Speaker Information
Chiyu Feng
Po-Chun Hsu
Hung-yi Lee
SSL
20
8
0
08 May 2022
Sound Localization by Self-Supervised Time Delay Estimation
Ziyang Chen
David Fouhey
Andrew Owens
SSL
19
19
0
26 Apr 2022
On-demand compute reduction with stochastic wav2vec 2.0
Apoorv Vyas
Wei-Ning Hsu
Michael Auli
Alexei Baevski
24
13
0
25 Apr 2022
ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers
Kaizhi Qian
Yang Zhang
Heting Gao
Junrui Ni
Cheng-I Jeff Lai
David D. Cox
M. Hasegawa-Johnson
Shiyu Chang
DRL
21
110
0
20 Apr 2022
User-Level Differential Privacy against Attribute Inference Attack of Speech Emotion Recognition in Federated Learning
Tiantian Feng
Raghuveer Peri
Shrikanth Narayanan
FedML
13
28
0
05 Apr 2022
Repeat after me: Self-supervised learning of acoustic-to-articulatory mapping by vocal imitation
Marc-Antoine Georges
Julien Diard
Laurent Girin
J. Schwartz
Thomas Hueber
6
7
0
05 Apr 2022
Autoregressive Co-Training for Learning Discrete Speech Representations
Sung-Lin Yeh
Hao Tang
SSL
16
6
0
29 Mar 2022
Federated Self-Supervised Learning for Acoustic Event Classification
Meng Feng
Chieh-Chi Kao
Qingming Tang
Ming Sun
Viktor Rozgic
Spyros Matsoukas
Chao Wang
26
11
0
22 Mar 2022
Language Adaptive Cross-lingual Speech Representation Learning with Sparse Sharing Sub-networks
Yizhou Lu
Mingkun Huang
Xinghua Qu
Pengfei Wei
Zejun Ma
21
19
0
09 Mar 2022
Audio Self-supervised Learning: A Survey
Shuo Liu
Adria Mallol-Ragolta
Emilia Parada-Cabeleiro
Kun Qian
Xingshuo Jing
Alexander Kathan
Bin Hu
Bjoern W. Schuller
SSL
35
106
0
02 Mar 2022
Measuring the Impact of Individual Domain Factors in Self-Supervised Pre-Training
Ramon Sanabria
Wei-Ning Hsu
Alexei Baevski
Michael Auli
19
7
0
01 Mar 2022
A Brief Overview of Unsupervised Neural Speech Representation Learning
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
Lars Maaløe
Christian Igel
BDL
AI4TS
SSL
19
11
0
01 Mar 2022
Word Segmentation on Discovered Phone Units with Dynamic Programming and Self-Supervised Scoring
Herman Kamper
21
25
0
24 Feb 2022
Assessing the State of Self-Supervised Human Activity Recognition using Wearables
H. Haresamudram
Irfan Essa
Thomas Plötz
SSL
34
85
0
22 Feb 2022