
Unsupervised Pre-training of Bidirectional Speech Encoders via Masked Reconstruction

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020

28 January 2020

Papers citing "Unsupervised Pre-training of Bidirectional Speech Encoders via Masked Reconstruction"

50 / 64 papers shown

Towards the Next Frontier in Speech Representation Learning Using Disentanglement

Varun Krishna

Sriram Ganapathy

SSL

383

02 Jul 2024

CochCeps-Augment: A Novel Self-Supervised Contrastive Learning Using Cochlear Cepstrum-based Masking for Speech Emotion Recognition

155

10 Feb 2024

Homophone Disambiguation Reveals Patterns of Context Mixing in Speech Transformers

262

15 Oct 2023

On-Device Constrained Self-Supervised Speech Representation Learning for Keyword Spotting via Knowledge Distillation

Interspeech (Interspeech), 2023

201

06 Jul 2023

A Cookbook of Self-Supervised Learning

...

Pierre Fernandez

530

382

24 Apr 2023

Resource-Efficient Transfer Learning From Speech Foundation Model Using Hierarchical Feature Fusion

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

214

04 Nov 2022

Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Speech Processing

Neural Information Processing Systems (NeurIPS), 2022

Kaizhi Qian

464

02 Nov 2022

Improving Speech Representation Learning via Speech-level and Phoneme-level Masking Approach

International Conference on Mobile Ad-hoc and Sensor Networks (MSN), 2022

255

25 Oct 2022

DRAFT: A Novel Framework to Reduce Domain Shifting in Self-supervised Learning and Its Application to Children's ASR

Interspeech (Interspeech), 2022

Ruchao Fan

Abeer Alwan

281

16 Jun 2022

Self-Supervised Speech Representation Learning: A Review

IEEE Journal on Selected Topics in Signal Processing (IEEE JSTSP), 2022

Abdel-rahman Mohamed

Hung-yi Lee

Lasse Borgholt

Jakob Drachmann Havtorn

...

781

471

21 May 2022

On-demand compute reduction with stochastic wav2vec 2.0

Interspeech (Interspeech), 2022

260

25 Apr 2022

ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers

International Conference on Machine Learning (ICML), 2022

Kaizhi Qian

247

153

20 Apr 2022

Pre-Training Transformer Decoder for End-to-End ASR Model with Unpaired Speech Data

Interspeech (Interspeech), 2022

Haizhou Li

216

31 Mar 2022

Language Adaptive Cross-lingual Speech Representation Learning with Sparse Sharing Sub-networks

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

304

09 Mar 2022

Compressed Predictive Information Coding

Rui Meng

Tianyi Luo

K. Bouchard

220

03 Mar 2022

A Brief Overview of Unsupervised Neural Speech Representation Learning

Lasse Borgholt

Jakob Drachmann Havtorn

265

01 Mar 2022

Assessing the State of Self-Supervised Human Activity Recognition using Wearables

Proceedings of the ACM on Interactive Mobile Wearable and Ubiquitous Technologies (IMWUT), 2022

475

125

22 Feb 2022

RemixIT: Continual self-training of speech enhancement models via bootstrapped remixing

IEEE Journal on Selected Topics in Signal Processing (IEEE JSTSP), 2022

Yossi Adi

320

17 Feb 2022

SPIRAL: Self-supervised Perturbation-Invariant Representation Learning for Speech Pre-Training

International Conference on Learning Representations (ICLR), 2022

Wenyong Huang

Zhenhe Zhang

Y. Yeung

Xin Jiang

Qun Liu

327

25 Jan 2022

A Noise-Robust Self-supervised Pre-training Model Based Speech Representation Learning for Automatic Speech Recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

364

22 Jan 2022

Self-Supervised Learning for speech recognition with Intermediate layer supervision

234

16 Dec 2021

Lacuna Reconstruction: Self-supervised Pre-training for Low-Resource Historical Document Transcription

Nikolai Vogler

J. Allen

M. Miller

Taylor Berg-Kirkpatrick

153

16 Dec 2021

SLUE: New Benchmark Tasks for Spoken Language Understanding Evaluation on Natural Speech

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021

328

19 Nov 2021

Joint Unsupervised and Supervised Training for Multilingual ASR

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021

332

15 Nov 2021

Textless Speech Emotion Conversion using Discrete and Decomposed Representations

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021

Yossi Adi

389

14 Nov 2021

TUNet: A Block-online Bandwidth Extension Model based on Transformers and Self-supervised Pretraining

Viet-Anh Nguyen

Anh H. T. Nguyen

Andy W. H. Khong

252

26 Oct 2021

Contrastively Disentangled Sequential Variational Autoencoder

Neural Information Processing Systems (NeurIPS), 2021

257

22 Oct 2021

W2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training

Automatic Speech Recognition & Understanding (ASRU), 2021

343

522

07 Aug 2021

Analyzing Speaker Information in Self-Supervised Models to Improve Zero-Resource Speech Processing

311

02 Aug 2021

Layer-wise Analysis of a Self-supervised Speech Representation Model

Automatic Speech Recognition & Understanding (ASRU), 2021

544

414

10 Jul 2021

Dropout Regularization for Self-Supervised Learning of Transformer Encoder Speech Representation

Interspeech (Interspeech), 2021

224

09 Jul 2021

What do End-to-End Speech Models Learn about Speaker, Language and Channel Information? A Layer-wise and Neuron-level Analysis

Shammur A. Chowdhury

Nadir Durrani

Ahmed M. Ali

431

01 Jul 2021

Low Resource German ASR with Untranscribed Data Spoken by Non-native Children -- INTERSPEECH 2021 Shared Task SPAPL System

Interspeech (Interspeech), 2021

159

18 Jun 2021

HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2021

740

4,354

14 Jun 2021

PARP: Prune, Adjust and Re-Prune for Self-Supervised Speech Recognition

Neural Information Processing Systems (NeurIPS), 2021

Kaizhi Qian

332

10 Jun 2021

Unsupervised Word Segmentation from Discrete Speech Units in Low-Resource Settings

184

08 Jun 2021

Comparing CTC and LFMMI for out-of-domain adaptation of wav2vec 2.0 acoustic model

Interspeech (Interspeech), 2021

Apoorv Vyas

S. Madikeri

H. Bourlard

175

06 Apr 2021

Acoustic word embeddings for zero-resource languages using self-supervised contrastive learning and multilingual adaptation

Spoken Language Technology Workshop (SLT), 2021

C. Jacobs

Yevgen Matusevych

Herman Kamper

328

19 Mar 2021

Improving speech recognition models with small samples for air traffic control systems

Neurocomputing (Neurocomputing), 2021

249

16 Feb 2021

Bi-APC: Bidirectional Autoregressive Predictive Coding for Unsupervised Pre-training and Its Application to Children's ASR

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021

Ruchao Fan

Amber Afshan

Abeer Alwan

202

12 Feb 2021

Generative Spoken Language Modeling from Raw Audio

Transactions of the Association for Computational Linguistics (TACL), 2021

Yossi Adi

...

774

458

01 Feb 2021

On Scaling Contrastive Representations for Low-Resource Speech Recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021

Lasse Borgholt

T. M. S. Tax

Jakob Drachmann Havtorn

Lars Maaløe

Christian Igel

SSL

198

01 Feb 2021

Text-Free Image-to-Speech Synthesis Using Learned Segmental Units

Annual Meeting of the Association for Computational Linguistics (ACL), 2020

220

31 Dec 2020

Lattice-Free MMI Adaptation Of Self-Supervised Pretrained Acoustic Models

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020

Apoorv Vyas

S. Madikeri

H. Bourlard

194

28 Dec 2020

Sequence-to-Sequence Contrastive Learning for Text Recognition

Computer Vision and Pattern Recognition (CVPR), 2020

383

131

20 Dec 2020

DeCoAR 2.0: Deep Contextualized Acoustic Representations with Vector Quantization

Shaoshi Ling

Yuzong Liu

208

114

11 Dec 2020

Contrastive Predictive Coding for Human Activity Recognition

Proceedings of the ACM on Interactive Mobile Wearable and Ubiquitous Technologies (IMWUT), 2020

H. Haresamudram

Irfan Essa

Thomas Ploetz

436

149

09 Dec 2020

The Zero Resource Speech Benchmark 2021: Metrics and baselines for unsupervised spoken language modeling

462

132

23 Nov 2020

Non-Autoregressive Predictive Coding for Learning Speech Representations from Local Dependencies

Interspeech (Interspeech), 2020

277

01 Nov 2020

Speech SIMCLR: Combining Contrastive and Reconstruction Objective for Self-supervised Speech Representation Learning

Interspeech (Interspeech), 2020

Dongwei Jiang

Wubo Li

Miao Cao

Wei Zou

Xiangang Li

SSL

411

27 Oct 2020