
Unsupervised Pre-training of Bidirectional Speech Encoders via Masked Reconstruction

28 January 2020
Weiran Wang
Qingming Tang
Karen Livescu
    SSL

Papers citing "Unsupervised Pre-training of Bidirectional Speech Encoders via Masked Reconstruction"

50 / 64 papers shown
Towards the Next Frontier in Speech Representation Learning Using Disentanglement
Varun Krishna
Sriram Ganapathy
SSL
60
1
0
02 Jul 2024
CochCeps-Augment: A Novel Self-Supervised Contrastive Learning Using Cochlear Cepstrum-based Masking for Speech Emotion Recognition
Ioannis Ziogas
Hessa Alfalahi
A. Khandoker
L. Hadjileontiadis
54
0
0
10 Feb 2024
Homophone Disambiguation Reveals Patterns of Context Mixing in Speech Transformers
Hosein Mohebbi
Grzegorz Chrupała
Willem H. Zuidema
Afra Alishahi
76
13
0
15 Oct 2023
On-Device Constrained Self-Supervised Speech Representation Learning for Keyword Spotting via Knowledge Distillation
Gene-Ping Yang
Yue Gu
Qingming Tang
Dongsu Du
Yuzong Liu
55
5
0
06 Jul 2023
A Cookbook of Self-Supervised Learning
Randall Balestriero
Mark Ibrahim
Vlad Sobal
Ari S. Morcos
Shashank Shekhar
...
Pierre Fernandez
Amir Bar
Hamed Pirsiavash
Yann LeCun
Micah Goldblum
SyDa, FedML, SSL
154
284
0
24 Apr 2023
Resource-Efficient Transfer Learning From Speech Foundation Model Using Hierarchical Feature Fusion
Zhouyuan Huo
K. Sim
Yue Liu
DongSeon Hwang
Tara N. Sainath
Trevor Strohman
67
6
0
04 Nov 2022
Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Speech Processing
Yonggan Fu
Yang Zhang
Kaizhi Qian
Zhifan Ye
Zhongzhi Yu
Cheng-I Jeff Lai
Yingyan Lin
168
9
0
02 Nov 2022
Improving Speech Representation Learning via Speech-level and Phoneme-level Masking Approach
Xulong Zhang
Jianzong Wang
Ning Cheng
Kexin Zhu
Jing Xiao
58
1
0
25 Oct 2022
DRAFT: A Novel Framework to Reduce Domain Shifting in Self-supervised Learning and Its Application to Children's ASR
Ruchao Fan
Abeer Alwan
88
30
0
16 Jun 2022
Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL, AI4TS
285
368
0
21 May 2022
On-demand compute reduction with stochastic wav2vec 2.0
Apoorv Vyas
Wei-Ning Hsu
Michael Auli
Alexei Baevski
66
13
0
25 Apr 2022
ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers
Kaizhi Qian
Yang Zhang
Heting Gao
Junrui Ni
Cheng-I Jeff Lai
David D. Cox
M. Hasegawa-Johnson
Shiyu Chang
DRL
68
113
0
20 Apr 2022
Pre-Training Transformer Decoder for End-to-End ASR Model with Unpaired Speech Data
Junyi Ao
Zi-Hua Zhang
Long Zhou
Shujie Liu
Haizhou Li
Tom Ko
Lirong Dai
Jinyu Li
Yao Qian
Furu Wei
SSL
77
19
0
31 Mar 2022
Language Adaptive Cross-lingual Speech Representation Learning with Sparse Sharing Sub-networks
Yizhou Lu
Mingkun Huang
Xinghua Qu
Pengfei Wei
Zejun Ma
84
19
0
09 Mar 2022
Compressed Predictive Information Coding
Rui Meng
Tianyi Luo
K. Bouchard
49
2
0
03 Mar 2022
A Brief Overview of Unsupervised Neural Speech Representation Learning
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
Lars Maaløe
Christian Igel
BDL, AI4TS, SSL
92
11
0
01 Mar 2022
Assessing the State of Self-Supervised Human Activity Recognition using Wearables
H. Haresamudram
Irfan Essa
Thomas Plötz
SSL
95
94
0
22 Feb 2022
RemixIT: Continual self-training of speech enhancement models via bootstrapped remixing
Efthymios Tzinis
Yossi Adi
V. Ithapu
Buye Xu
Paris Smaragdis
Anurag Kumar
CLL
86
55
0
17 Feb 2022
SPIRAL: Self-supervised Perturbation-Invariant Representation Learning for Speech Pre-Training
Wenyong Huang
Zhenhe Zhang
Y. Yeung
Xin Jiang
Qun Liu
111
23
0
25 Jan 2022
A Noise-Robust Self-supervised Pre-training Model Based Speech Representation Learning for Automatic Speech Recognition
Qiu-shi Zhu
Jie Zhang
Zi-qiang Zhang
Ming Wu
Xin Fang
Lirong Dai
175
41
0
22 Jan 2022
Self-Supervised Learning for speech recognition with Intermediate layer supervision
Chengyi Wang
Yu-Huan Wu
Sanyuan Chen
Shujie Liu
Jinyu Li
Yao Qian
Zhenglu Yang
SSL
70
29
0
16 Dec 2021
Lacuna Reconstruction: Self-supervised Pre-training for Low-Resource Historical Document Transcription
Nikolai Vogler
J. Allen
M. Miller
Taylor Berg-Kirkpatrick
62
5
0
16 Dec 2021
SLUE: New Benchmark Tasks for Spoken Language Understanding Evaluation on Natural Speech
Suwon Shon
Ankita Pasad
Felix Wu
Pablo Brusco
Yoav Artzi
Karen Livescu
Kyu Jeong Han
AuLLM, ELM
106
76
0
19 Nov 2021
Joint Unsupervised and Supervised Training for Multilingual ASR
Junwen Bai
Yue Liu
Yu Zhang
Ankur Bapna
Nikhil Siddhartha
K. Sim
Tara N. Sainath
81
59
0
15 Nov 2021
Textless Speech Emotion Conversion using Discrete and Decomposed Representations
Felix Kreuk
Adam Polyak
Jade Copet
Eugene Kharitonov
Tu Nguyen
M. Rivière
Wei-Ning Hsu
Abdel-rahman Mohamed
Emmanuel Dupoux
Yossi Adi
99
34
0
14 Nov 2021
TUNet: A Block-online Bandwidth Extension Model based on Transformers and Self-supervised Pretraining
Viet-Anh Nguyen
Anh H. T. Nguyen
Andy W. H. Khong
60
22
0
26 Oct 2021
Contrastively Disentangled Sequential Variational Autoencoder
M. Kiener
Weiran Wang
Michael Gerndt
CoGe, DRL
110
42
0
22 Oct 2021
W2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training
Yu-An Chung
Yu Zhang
Wei Han
Chung-Cheng Chiu
James Qin
Ruoming Pang
Yonghui Wu
SSL, VLM
88
429
0
07 Aug 2021
Analyzing Speaker Information in Self-Supervised Models to Improve Zero-Resource Speech Processing
Benjamin van Niekerk
Leanne Nortje
Matthew Baas
Herman Kamper
SSL
140
32
0
02 Aug 2021
Layer-wise Analysis of a Self-supervised Speech Representation Model
Ankita Pasad
Ju-Chieh Chou
Karen Livescu
SSL
118
308
0
10 Jul 2021
Dropout Regularization for Self-Supervised Learning of Transformer Encoder Speech Representation
Jian Luo
Jianzong Wang
Ning Cheng
Jing Xiao
SSL
69
6
0
09 Jul 2021
What do End-to-End Speech Models Learn about Speaker, Language and Channel Information? A Layer-wise and Neuron-level Analysis
Shammur A. Chowdhury
Nadir Durrani
Ahmed M. Ali
108
16
0
01 Jul 2021
Low Resource German ASR with Untranscribed Data Spoken by Non-native Children -- INTERSPEECH 2021 Shared Task SPAPL System
Jinhan Wang
Yunzheng Zhu
Ruchao Fan
Wei Chu
Abeer Alwan
59
8
0
18 Jun 2021
HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units
Wei-Ning Hsu
Benjamin Bolte
Yao-Hung Hubert Tsai
Kushal Lakhotia
Ruslan Salakhutdinov
Abdel-rahman Mohamed
SSL
190
3,013
0
14 Jun 2021
PARP: Prune, Adjust and Re-Prune for Self-Supervised Speech Recognition
Cheng-I Jeff Lai
Yang Zhang
Alexander H. Liu
Shiyu Chang
Yi-Lun Liao
Yung-Sung Chuang
Kaizhi Qian
Sameer Khurana
David D. Cox
James R. Glass
VLM
162
78
0
10 Jun 2021
Unsupervised Word Segmentation from Discrete Speech Units in Low-Resource Settings
Marcely Zanon Boito
Bolaji Yusuf
Lucas Ondel
Aline Villavicencio
Laurent Besacier
57
3
0
08 Jun 2021
Comparing CTC and LFMMI for out-of-domain adaptation of wav2vec 2.0 acoustic model
Apoorv Vyas
S. Madikeri
H. Bourlard
34
15
0
06 Apr 2021
Acoustic word embeddings for zero-resource languages using self-supervised contrastive learning and multilingual adaptation
C. Jacobs
Yevgen Matusevych
Herman Kamper
66
21
0
19 Mar 2021
Improving speech recognition models with small samples for air traffic control systems
Yi Lin
Qin Li
Bo Yang
Zhen Yan
Huachun Tan
Zhengmao Chen
104
32
0
16 Feb 2021
Bi-APC: Bidirectional Autoregressive Predictive Coding for Unsupervised Pre-training and Its Application to Children's ASR
Ruchao Fan
Amber Afshan
Abeer Alwan
68
14
0
12 Feb 2021
Generative Spoken Language Modeling from Raw Audio
Kushal Lakhotia
Evgeny Kharitonov
Wei-Ning Hsu
Yossi Adi
Adam Polyak
...
Tu Nguyen
Jade Copet
Alexei Baevski
A. Mohamed
Emmanuel Dupoux
AuLLM
290
366
0
01 Feb 2021
On Scaling Contrastive Representations for Low-Resource Speech Recognition
Lasse Borgholt
T. M. S. Tax
Jakob Drachmann Havtorn
Lars Maaløe
Christian Igel
SSL
59
5
0
01 Feb 2021
Text-Free Image-to-Speech Synthesis Using Learned Segmental Units
Wei-Ning Hsu
David Harwath
Christopher Song
James R. Glass
CLIP
90
67
0
31 Dec 2020
Lattice-Free MMI Adaptation Of Self-Supervised Pretrained Acoustic Models
Apoorv Vyas
S. Madikeri
H. Bourlard
26
13
0
28 Dec 2020
Sequence-to-Sequence Contrastive Learning for Text Recognition
Aviad Aberdam
Ron Litman
Shahar Tsiper
Oron Anschel
Ron Slossberg
Shai Mazor
R. Manmatha
Pietro Perona
97
109
0
20 Dec 2020
DeCoAR 2.0: Deep Contextualized Acoustic Representations with Vector Quantization
Shaoshi Ling
Yuzong Liu
75
107
0
11 Dec 2020
Contrastive Predictive Coding for Human Activity Recognition
H. Haresamudram
Irfan Essa
Thomas Ploetz
104
122
0
09 Dec 2020
The Zero Resource Speech Benchmark 2021: Metrics and baselines for unsupervised spoken language modeling
Tu Nguyen
Maureen de Seyssel
Patricia Roze
M. Rivière
Evgeny Kharitonov
Alexei Baevski
Ewan Dunbar
Emmanuel Dupoux
SSL
151
108
0
23 Nov 2020
Non-Autoregressive Predictive Coding for Learning Speech Representations from Local Dependencies
Alexander H. Liu
Yu-An Chung
James R. Glass
SSL
89
88
0
01 Nov 2020
Speech SIMCLR: Combining Contrastive and Reconstruction Objective for Self-supervised Speech Representation Learning
Dongwei Jiang
Wubo Li
Miao Cao
Wei Zou
Xiangang Li
SSL
84
65
0
27 Oct 2020