2001.10603
Cited By
Unsupervised Pre-training of Bidirectional Speech Encoders via Masked Reconstruction
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
28 January 2020
Weiran Wang
Qingming Tang
Karen Livescu
SSL
Papers citing "Unsupervised Pre-training of Bidirectional Speech Encoders via Masked Reconstruction"
50 / 64 papers shown
Towards the Next Frontier in Speech Representation Learning Using Disentanglement
Varun Krishna
Sriram Ganapathy
SSL
383
2
0
02 Jul 2024
CochCeps-Augment: A Novel Self-Supervised Contrastive Learning Using Cochlear Cepstrum-based Masking for Speech Emotion Recognition
Ioannis Ziogas
Hessa Alfalahi
A. Khandoker
L. Hadjileontiadis
155
1
0
10 Feb 2024
Homophone Disambiguation Reveals Patterns of Context Mixing in Speech Transformers
Hosein Mohebbi
Grzegorz Chrupała
Willem H. Zuidema
Afra Alishahi
262
19
0
15 Oct 2023
On-Device Constrained Self-Supervised Speech Representation Learning for Keyword Spotting via Knowledge Distillation
Interspeech (Interspeech), 2023
Gene-Ping Yang
Yue Gu
Qingming Tang
Dongsu Du
Yuzong Liu
201
7
0
06 Jul 2023
A Cookbook of Self-Supervised Learning
Randall Balestriero
Mark Ibrahim
Vlad Sobal
Ari S. Morcos
Shashank Shekhar
...
Pierre Fernandez
Amir Bar
Hamed Pirsiavash
Yann LeCun
Micah Goldblum
SyDa
FedML
SSL
530
382
0
24 Apr 2023
Resource-Efficient Transfer Learning From Speech Foundation Model Using Hierarchical Feature Fusion
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Zhouyuan Huo
K. Sim
Yue Liu
DongSeon Hwang
Tara N. Sainath
Trevor Strohman
214
8
0
04 Nov 2022
Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Speech Processing
Neural Information Processing Systems (NeurIPS), 2022
Yonggan Fu
Yang Zhang
Kaizhi Qian
Zhifan Ye
Zhongzhi Yu
Cheng-I Jeff Lai
Yingyan Lin
464
10
0
02 Nov 2022
Improving Speech Representation Learning via Speech-level and Phoneme-level Masking Approach
International Conference on Mobile Ad-hoc and Sensor Networks (MSN), 2022
Xulong Zhang
Jianzong Wang
Ning Cheng
Kexin Zhu
Jing Xiao
255
1
0
25 Oct 2022
DRAFT: A Novel Framework to Reduce Domain Shifting in Self-supervised Learning and Its Application to Children's ASR
Interspeech (Interspeech), 2022
Ruchao Fan
Abeer Alwan
281
39
0
16 Jun 2022
Self-Supervised Speech Representation Learning: A Review
IEEE Journal on Selected Topics in Signal Processing (IEEE JSTSP), 2022
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
781
471
0
21 May 2022
On-demand compute reduction with stochastic wav2vec 2.0
Interspeech (Interspeech), 2022
Apoorv Vyas
Wei-Ning Hsu
Michael Auli
Alexei Baevski
260
13
0
25 Apr 2022
ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers
International Conference on Machine Learning (ICML), 2022
Kaizhi Qian
Yang Zhang
Heting Gao
Junrui Ni
Cheng-I Jeff Lai
David D. Cox
M. Hasegawa-Johnson
Shiyu Chang
DRL
247
153
0
20 Apr 2022
Pre-Training Transformer Decoder for End-to-End ASR Model with Unpaired Speech Data
Interspeech (Interspeech), 2022
Junyi Ao
Zi-Hua Zhang
Long Zhou
Shujie Liu
Haizhou Li
Tom Ko
Lirong Dai
Jinyu Li
Yao Qian
Furu Wei
SSL
216
20
0
31 Mar 2022
Language Adaptive Cross-lingual Speech Representation Learning with Sparse Sharing Sub-networks
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Yizhou Lu
Mingkun Huang
Xinghua Qu
Pengfei Wei
Zejun Ma
304
21
0
09 Mar 2022
Compressed Predictive Information Coding
Rui Meng
Tianyi Luo
K. Bouchard
220
3
0
03 Mar 2022
A Brief Overview of Unsupervised Neural Speech Representation Learning
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
Lars Maaløe
Christian Igel
BDL
AI4TS
SSL
265
13
0
01 Mar 2022
Assessing the State of Self-Supervised Human Activity Recognition using Wearables
Proceedings of the ACM on Interactive Mobile Wearable and Ubiquitous Technologies (IMWUT), 2022
H. Haresamudram
Irfan Essa
Thomas Plötz
SSL
475
125
0
22 Feb 2022
RemixIT: Continual self-training of speech enhancement models via bootstrapped remixing
IEEE Journal on Selected Topics in Signal Processing (IEEE JSTSP), 2022
Efthymios Tzinis
Yossi Adi
V. Ithapu
Buye Xu
Paris Smaragdis
Anurag Kumar
CLL
320
71
0
17 Feb 2022
SPIRAL: Self-supervised Perturbation-Invariant Representation Learning for Speech Pre-Training
International Conference on Learning Representations (ICLR), 2022
Wenyong Huang
Zhenhe Zhang
Y. Yeung
Xin Jiang
Qun Liu
327
29
0
25 Jan 2022
A Noise-Robust Self-supervised Pre-training Model Based Speech Representation Learning for Automatic Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Qiu-shi Zhu
Jie Zhang
Zi-qiang Zhang
Ming Wu
Xin Fang
Lirong Dai
364
52
0
22 Jan 2022
Self-Supervised Learning for speech recognition with Intermediate layer supervision
Chengyi Wang
Yu-Huan Wu
Sanyuan Chen
Shujie Liu
Jinyu Li
Yao Qian
Zhenglu Yang
SSL
234
37
0
16 Dec 2021
Lacuna Reconstruction: Self-supervised Pre-training for Low-Resource Historical Document Transcription
Nikolai Vogler
J. Allen
M. Miller
Taylor Berg-Kirkpatrick
153
6
0
16 Dec 2021
SLUE: New Benchmark Tasks for Spoken Language Understanding Evaluation on Natural Speech
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Suwon Shon
Ankita Pasad
Felix Wu
Pablo Brusco
Yoav Artzi
Karen Livescu
Kyu Jeong Han
AuLLM
ELM
328
93
0
19 Nov 2021
Joint Unsupervised and Supervised Training for Multilingual ASR
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Junwen Bai
Yue Liu
Yu Zhang
Ankur Bapna
Nikhil Siddhartha
K. Sim
Tara N. Sainath
332
64
0
15 Nov 2021
Textless Speech Emotion Conversion using Discrete and Decomposed Representations
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Felix Kreuk
Adam Polyak
Jade Copet
Eugene Kharitonov
Tu Nguyen
M. Rivière
Wei-Ning Hsu
Abdel-rahman Mohamed
Emmanuel Dupoux
Yossi Adi
389
47
0
14 Nov 2021
TUNet: A Block-online Bandwidth Extension Model based on Transformers and Self-supervised Pretraining
Viet-Anh Nguyen
Anh H. T. Nguyen
Andy W. H. Khong
252
29
0
26 Oct 2021
Contrastively Disentangled Sequential Variational Autoencoder
Neural Information Processing Systems (NeurIPS), 2021
M. Kiener
Weiran Wang
Michael Gerndt
CoGe
DRL
257
56
0
22 Oct 2021
W2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training
Automatic Speech Recognition & Understanding (ASRU), 2021
Yu-An Chung
Yu Zhang
Wei Han
Chung-Cheng Chiu
James Qin
Ruoming Pang
Yonghui Wu
SSL
VLM
343
522
0
07 Aug 2021
Analyzing Speaker Information in Self-Supervised Models to Improve Zero-Resource Speech Processing
Benjamin van Niekerk
Leanne Nortje
Matthew Baas
Herman Kamper
SSL
311
34
0
02 Aug 2021
Layer-wise Analysis of a Self-supervised Speech Representation Model
Automatic Speech Recognition & Understanding (ASRU), 2021
Ankita Pasad
Ju-Chieh Chou
Karen Livescu
SSL
544
414
0
10 Jul 2021
Dropout Regularization for Self-Supervised Learning of Transformer Encoder Speech Representation
Interspeech (Interspeech), 2021
Jian Luo
Jianzong Wang
Ning Cheng
Jing Xiao
SSL
224
6
0
09 Jul 2021
What do End-to-End Speech Models Learn about Speaker, Language and Channel Information? A Layer-wise and Neuron-level Analysis
Shammur A. Chowdhury
Nadir Durrani
Ahmed M. Ali
431
21
0
01 Jul 2021
Low Resource German ASR with Untranscribed Data Spoken by Non-native Children -- INTERSPEECH 2021 Shared Task SPAPL System
Interspeech (Interspeech), 2021
Jinhan Wang
Yunzheng Zhu
Ruchao Fan
Wei Chu
Abeer Alwan
159
8
0
18 Jun 2021
HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2021
Wei-Ning Hsu
Benjamin Bolte
Yifan Hao
Kushal Lakhotia
Ruslan Salakhutdinov
Abdel-rahman Mohamed
SSL
740
4,354
0
14 Jun 2021
PARP: Prune, Adjust and Re-Prune for Self-Supervised Speech Recognition
Neural Information Processing Systems (NeurIPS), 2021
Cheng-I Jeff Lai
Yang Zhang
Alexander H. Liu
Shiyu Chang
Yi-Lun Liao
Yung-Sung Chuang
Kaizhi Qian
Sameer Khurana
David D. Cox
James R. Glass
VLM
332
87
0
10 Jun 2021
Unsupervised Word Segmentation from Discrete Speech Units in Low-Resource Settings
Marcely Zanon Boito
Bolaji Yusuf
Lucas Ondel
Aline Villavicencio
Laurent Besacier
184
4
0
08 Jun 2021
Comparing CTC and LFMMI for out-of-domain adaptation of wav2vec 2.0 acoustic model
Interspeech (Interspeech), 2021
Apoorv Vyas
S. Madikeri
H. Bourlard
175
16
0
06 Apr 2021
Acoustic word embeddings for zero-resource languages using self-supervised contrastive learning and multilingual adaptation
Spoken Language Technology Workshop (SLT), 2021
C. Jacobs
Yevgen Matusevych
Herman Kamper
328
25
0
19 Mar 2021
Improving speech recognition models with small samples for air traffic control systems
Neurocomputing (Neurocomputing), 2021
Yi Lin
Qin Li
Bo Yang
Zhen Yan
Huachun Tan
Zhengmao Chen
249
33
0
16 Feb 2021
Bi-APC: Bidirectional Autoregressive Predictive Coding for Unsupervised Pre-training and Its Application to Children's ASR
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Ruchao Fan
Amber Afshan
Abeer Alwan
202
14
0
12 Feb 2021
Generative Spoken Language Modeling from Raw Audio
Transactions of the Association for Computational Linguistics (TACL), 2021
Kushal Lakhotia
Evgeny Kharitonov
Wei-Ning Hsu
Yossi Adi
Adam Polyak
...
Tu Nguyen
Jade Copet
Alexei Baevski
A. Mohamed
Emmanuel Dupoux
AuLLM
774
458
0
01 Feb 2021
On Scaling Contrastive Representations for Low-Resource Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Lasse Borgholt
T. M. S. Tax
Jakob Drachmann Havtorn
Lars Maaløe
Christian Igel
SSL
198
5
0
01 Feb 2021
Text-Free Image-to-Speech Synthesis Using Learned Segmental Units
Annual Meeting of the Association for Computational Linguistics (ACL), 2020
Wei-Ning Hsu
David Harwath
Christopher Song
James R. Glass
CLIP
220
74
0
31 Dec 2020
Lattice-Free MMI Adaptation Of Self-Supervised Pretrained Acoustic Models
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
Apoorv Vyas
S. Madikeri
H. Bourlard
194
14
0
28 Dec 2020
Sequence-to-Sequence Contrastive Learning for Text Recognition
Computer Vision and Pattern Recognition (CVPR), 2020
Aviad Aberdam
Ron Litman
Shahar Tsiper
Oron Anschel
Ron Slossberg
Shai Mazor
R. Manmatha
Pietro Perona
383
131
0
20 Dec 2020
DeCoAR 2.0: Deep Contextualized Acoustic Representations with Vector Quantization
Shaoshi Ling
Yuzong Liu
208
114
0
11 Dec 2020
Contrastive Predictive Coding for Human Activity Recognition
Proceedings of the ACM on Interactive Mobile Wearable and Ubiquitous Technologies (IMWUT), 2020
H. Haresamudram
Irfan Essa
Thomas Ploetz
436
149
0
09 Dec 2020
The Zero Resource Speech Benchmark 2021: Metrics and baselines for unsupervised spoken language modeling
Tu Nguyen
Maureen de Seyssel
Patricia Roze
M. Rivière
Evgeny Kharitonov
Alexei Baevski
Ewan Dunbar
Emmanuel Dupoux
SSL
462
132
0
23 Nov 2020
Non-Autoregressive Predictive Coding for Learning Speech Representations from Local Dependencies
Interspeech (Interspeech), 2020
Alexander H. Liu
Yu-An Chung
James R. Glass
SSL
277
94
0
01 Nov 2020
Speech SIMCLR: Combining Contrastive and Reconstruction Objective for Self-supervised Speech Representation Learning
Interspeech (Interspeech), 2020
Dongwei Jiang
Wubo Li
Miao Cao
Wei Zou
Xiangang Li
SSL
411
74
0
27 Oct 2020