2005.08392
Cited By
Vector-Quantized Autoregressive Predictive Coding
17 May 2020
Yu-An Chung
Hao Tang
James R. Glass
SSL
Papers citing
"Vector-Quantized Autoregressive Predictive Coding"
50 / 79 papers shown
Title
Text-Queried Audio Source Separation via Hierarchical Modeling
Xinlei Yin
Xiulian Peng
Xue Jiang
Zhiwei Xiong
Yan Lu
46
0
0
27 May 2025
Causal Self-supervised Pretrained Frontend with Predictive Code for Speech Separation
Wupeng Wang
Zexu Pan
Xianrui Li
Shuai Wang
Haizhou Li
AI4TS
75
0
0
03 Apr 2025
Deep Neural Networks and Brain Alignment: Brain Encoding and Decoding (Survey)
Subba Reddy Oota
Zijiao Chen
Manish Gupta
R. Bapi
G. Jobard
F. Alexandre
X. Hinaut
3DV
AI4CE
145
15
0
31 Dec 2024
Representation Collapsing Problems in Vector Quantization
Wenhao Zhao
Qiran Zou
Rushi Shah
Dianbo Liu
109
2
0
25 Nov 2024
Speech Separation with Pretrained Frontend to Minimize Domain Mismatch
Wupeng Wang
Zexu Pan
Xianrui Li
Shuai Wang
Haizhou Li
78
4
0
05 Nov 2024
Stimulus Modality Matters: Impact of Perceptual Evaluations from Different Modalities on Speech Emotion Recognition System Performance
Huang-Cheng Chou
Haibin Wu
Chi-Chun Lee
91
2
0
16 Sep 2024
Estimating the Completeness of Discrete Speech Units
Sung-Lin Yeh
Hao Tang
105
2
0
09 Sep 2024
Efficient Training of Self-Supervised Speech Foundation Models on a Compute Budget
Andy T. Liu
Yi-Cheng Lin
Haibin Wu
Stefan Winkler
Hung-yi Lee
119
3
0
09 Sep 2024
Refining Self-Supervised Learnt Speech Representation using Brain Activations
Hengyu Li
Kangdi Mei
Zhaoci Liu
Yang Ai
Liping Chen
Jie Zhang
Zhenhua Ling
SSL
87
1
0
12 Jun 2024
A Large-Scale Evaluation of Speech Foundation Models
Shu-Wen Yang
Heng-Jui Chang
Zili Huang
Andy T. Liu
Cheng-I Jeff Lai
...
Kushal Lakhotia
Shang-Wen Li
Abdelrahman Mohamed
Shinji Watanabe
Hung-yi Lee
99
27
0
15 Apr 2024
EMO-SUPERB: An In-depth Look at Speech Emotion Recognition
Haibin Wu
Huang-Cheng Chou
Kai-Wei Chang
Lucas Goncalves
Jiawei Du
Jyh-Shing Roger Jang
Chi-Chun Lee
Hung-Yi Lee
91
15
0
20 Feb 2024
On the Transferability of Large-Scale Self-Supervision to Few-Shot Audio Classification
Calum Heggan
S. Budgett
Timothy M. Hospedales
Mehrdad Yaghoobi
SSL
77
1
0
02 Feb 2024
Revisiting Self-supervised Learning of Speech Representation from a Mutual Information Perspective
Alexander H. Liu
Sung-Lin Yeh
James R. Glass
SSL
66
3
0
16 Jan 2024
A Quantitative Approach to Understand Self-Supervised Models as Cross-lingual Feature Extractors
Shuyue Stella Li
Beining Xu
Xiangyu Zhang
Hexin Liu
Wen-Han Chao
Leibny Paola García
SSL
56
4
0
27 Nov 2023
Speech language models lack important brain-relevant semantics
Subba Reddy Oota
Emin Çelik
Fatma Deniz
Mariya Toneva
69
11
0
08 Nov 2023
Reduce, Reuse, Recycle: Is Perturbed Data better than Other Language augmentation for Low Resource Self-Supervised Speech Models
Asad Ullah
Alessandro Ragano
Andrew Hines
156
1
0
22 Sep 2023
Knowledge Distillation from Non-streaming to Streaming ASR Encoder using Auxiliary Non-streaming Layer
Kyuhong Shim
Jinkyu Lee
Simyoung Chang
Kyuwoong Hwang
78
3
0
31 Aug 2023
Pushing the Limits of Unsupervised Unit Discovery for SSL Speech Representation
Ziyang Ma
Zhisheng Zheng
Guanrou Yang
Yu Wang
Chuxu Zhang
Xie Chen
SSL
72
9
0
15 Jun 2023
Towards Learning Discrete Representations via Self-Supervision for Wearables-Based Human Activity Recognition
H. Haresamudram
Irfan Essa
Thomas Ploetz
98
8
0
01 Jun 2023
Masked Autoencoders with Multi-Window Local-Global Attention Are Better Audio Learners
Sarthak Yadav
Sergios Theodoridis
Lars Kai Hansen
Zheng-Hua Tan
92
9
0
01 Jun 2023
Can Self-Supervised Neural Representations Pre-Trained on Human Speech distinguish Animal Callers?
Eklavya Sarkar
Mathew Magimai.-Doss
49
12
0
23 May 2023
Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clustering
Heng-Jui Chang
Alexander H. Liu
James R. Glass
SSL
84
21
0
18 May 2023
Speech Separation based on Contrastive Learning and Deep Modularization
Peter Ochieng
SSL
85
0
0
18 May 2023
DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning
Alexander H. Liu
Heng-Jui Chang
Michael Auli
Wei-Ning Hsu
James R. Glass
75
26
0
17 May 2023
Straightening Out the Straight-Through Estimator: Overcoming Optimization Challenges in Vector Quantized Networks
Minyoung Huh
Brian Cheung
Pulkit Agrawal
Phillip Isola
MQ
54
54
0
15 May 2023
AV-data2vec: Self-supervised Learning of Audio-Visual Speech Representations with Contextualized Target Representations
Jiachen Lian
Alexei Baevski
Wei-Ning Hsu
Michael Auli
SSL
150
34
0
10 Feb 2023
Dual Learning for Large Vocabulary On-Device ASR
Cal Peyser
Ronny Huang
Tara N. Sainath
Rohit Prabhavalkar
M. Picheny
K. Cho
SSL
49
1
0
11 Jan 2023
Deep neural network techniques for monaural speech enhancement: state of the art analysis
P. Ochieng
119
23
0
01 Dec 2022
MelHuBERT: A simplified HuBERT on Mel spectrograms
Tzu-Quan Lin
Hung-yi Lee
Hao Tang
SSL
92
16
0
17 Nov 2022
Introducing Semantics into Speech Encoders
Derek Xu
Shuyan Dong
Changhan Wang
Suyoun Kim
Zhaojiang Lin
...
Alexei Baevski
Guan-Ting Lin
Hung-yi Lee
Yizhou Sun
Wei Wang
SSL
103
3
0
15 Nov 2022
Improving Children's Speech Recognition by Fine-tuning Self-supervised Adult Speech Representations
Renée Lu
M. Shahin
Beena Ahmed
57
4
0
14 Nov 2022
MT4SSL: Boosting Self-Supervised Speech Representation Learning by Integrating Multiple Targets
Ziyang Ma
Zhisheng Zheng
Changli Tang
Yujin Wang
Xie Chen
122
20
0
14 Nov 2022
A Study on the Integration of Pre-trained SSL, ASR, LM and SLU Models for Spoken Language Understanding
Yifan Peng
Siddhant Arora
Yosuke Higuchi
Yushi Ueda
Sujay S. Kumar
Karthik Ganesan
Siddharth Dalmia
Xuankai Chang
Shinji Watanabe
75
21
0
10 Nov 2022
Learning Dependencies of Discrete Speech Representations with Neural Hidden Markov Models
Sung-Lin Yeh
Hao Tang
SSL
BDL
57
1
0
29 Oct 2022
SUPERB @ SLT 2022: Challenge on Generalization and Efficiency of Self-Supervised Speech Representation Learning
Tzu-hsun Feng
Annie Dong
Ching-Feng Yeh
Shu-Wen Yang
Tzu-Quan Lin
...
Xuankai Chang
Shinji Watanabe
Abdel-rahman Mohamed
Shang-Wen Li
Hung-yi Lee
ELM
SSL
102
35
0
16 Oct 2022
CTCBERT: Advancing Hidden-unit BERT with CTC Objectives
Ruchao Fan
Yiming Wang
Yashesh Gaur
Jinyu Li
103
8
0
16 Oct 2022
On the Utility of Self-supervised Models for Prosody-related Tasks
Guan-Ting Lin
Chiyu Feng
Wei-Ping Huang
Yuan Tseng
Tzu-Han Lin
Chen-An Li
Hung-yi Lee
Nigel G. Ward
63
51
0
13 Oct 2022
Exploration of A Self-Supervised Speech Model: A Study on Emotional Corpora
Yuanchao Li
Yumnah Mohamied
P. Bell
Catherine Lai
SSL
116
47
0
05 Oct 2022
SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model
Yi-Jen Shih
Hsuan-Fu Wang
Heng-Jui Chang
Layne Berry
Hung-yi Lee
David Harwath
VLM
CLIP
137
32
0
03 Oct 2022
SpeechLM: Enhanced Speech Pre-Training with Unpaired Textual Data
Zi-Hua Zhang
Sanyuan Chen
Long Zhou
Yu Wu
Shuo Ren
...
Zhuoyuan Yao
Xun Gong
Lirong Dai
Jinyu Li
Furu Wei
79
57
0
30 Sep 2022
The Efficacy of Self-Supervised Speech Models for Audio Representations
Tung-Yu Wu
Chen-An Li
Tzu-Han Lin
Tsung-Yuan Hsu
Hung-yi Lee
64
5
0
26 Sep 2022
End-to-End Lyrics Recognition with Self-supervised Learning
Xiangyu Zhang
Shuyue Stella Li
Zhanhong He
R. Togneri
Leibny Paola García
50
0
0
26 Sep 2022
A Comparative Study of Self-supervised Speech Representation Based Voice Conversion
Wen-Chin Huang
Shu-Wen Yang
Tomoki Hayashi
Tomoki Toda
66
17
0
10 Jul 2022
Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
287
368
0
21 May 2022
Silence is Sweeter Than Speech: Self-Supervised Model Using Silence to Store Speaker Information
Chiyu Feng
Po-Chun Hsu
Hung-yi Lee
SSL
86
8
0
08 May 2022
MAESTRO: Matched Speech Text Representations through Modality Matching
Zhehuai Chen
Yu Zhang
Andrew Rosenberg
Bhuvana Ramabhadran
Pedro J. Moreno
Ankur Bapna
Heiga Zen
94
108
0
07 Apr 2022
Autoregressive Co-Training for Learning Discrete Speech Representations
Sung-Lin Yeh
Hao Tang
SSL
87
6
0
29 Mar 2022
SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities
Hsiang-Sheng Tsai
Heng-Jui Chang
Wen-Chin Huang
Zili Huang
Kushal Lakhotia
...
Hsuan-Jui Chen
Shang-Wen Li
Shinji Watanabe
Abdel-rahman Mohamed
Hung-yi Lee
91
110
0
14 Mar 2022
Audio Self-supervised Learning: A Survey
Shuo Liu
Adria Mallol-Ragolta
Emilia Parada-Cabaleiro
Kun Qian
Xingshuo Jing
Alexander Kathan
Bin Hu
Bjoern W. Schuller
SSL
102
109
0
02 Mar 2022
A Brief Overview of Unsupervised Neural Speech Representation Learning
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
Lars Maaløe
Christian Igel
BDL
AI4TS
SSL
92
11
0
01 Mar 2022