Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1910.12638
Cited By
Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders
25 October 2019
Andy T. Liu
Shu-Wen Yang
Po-Han Chi
Po-Chun Hsu
Hung-yi Lee
SSL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders"
50 / 85 papers shown
Title
Exploring Prediction Targets in Masked Pre-Training for Speech Foundation Models
Li-Wei Chen
Takuya Higuchi
He Bai
Ahmed Hussen Abdelaziz
Alexander Rudnicky
Shinji Watanabe
Tatiana Likhomanenko
B. Theobald
Zakaria Aldeneh
44
0
0
16 Sep 2024
Self-supervised Learning for Acoustic Few-Shot Classification
Jingyong Liang
Bernd Meyer
Issac Ning Lee
Thanh-Toan Do
SSL
52
0
0
15 Sep 2024
Efficient Training of Self-Supervised Speech Foundation Models on a Compute Budget
Andy T. Liu
Yi-Cheng Lin
Haibin Wu
Stefan Winkler
Hung-yi Lee
31
1
0
09 Sep 2024
Speech Representation Learning Revisited: The Necessity of Separate Learnable Parameters and Robust Data Augmentation
Hemant Yadav
Sunayana Sitaram
R. Shah
SSL
47
0
0
20 Aug 2024
Emotion-Aware Speech Self-Supervised Representation Learning with Intensity Knowledge
Rui Liu
Zening Ma
SSL
36
1
0
10 Jun 2024
On the social bias of speech self-supervised models
Yi-Cheng Lin
T. Lin
Hsi-Che Lin
Andy T. Liu
Hung-yi Lee
39
3
0
07 Jun 2024
A Large-Scale Evaluation of Speech Foundation Models
Shu-Wen Yang
Heng-Jui Chang
Zili Huang
Andy T. Liu
Cheng-I Jeff Lai
...
Kushal Lakhotia
Shang-Wen Li
Abdelrahman Mohamed
Shinji Watanabe
Hung-yi Lee
38
19
0
15 Apr 2024
Saturn Platform: Foundation Model Operations and Generative AI for Financial Services
Antonio Busson
Rennan Gaio
Rafael H. Rocha
Francisco Evangelista
Bruno Rizzi
Luan Carvalho
Rafael Miceli
Marcos Rabaioli
David Favaro
23
1
0
12 Dec 2023
A-JEPA: Joint-Embedding Predictive Architecture Can Listen
Zhengcong Fei
Mingyuan Fan
Junshi Huang
25
17
0
27 Nov 2023
Test-Time Training for Speech
Sri Harsha Dumpala
Chandramouli Shama Sastry
Sageev Oore
39
1
0
19 Sep 2023
MiniSUPERB: Lightweight Benchmark for Self-supervised Speech Models
Yu-Hsiang Wang
Huan Chen
Kai-Wei Chang
Winston H. Hsu
Hung-yi Lee
19
6
0
30 May 2023
Recycle-and-Distill: Universal Compression Strategy for Transformer-based Speech SSL Models with Attention Map Reusing and Masking Distillation
Kangwook Jang
Sungnyun Kim
Se-Young Yun
Hoi-Rim Kim
24
5
0
19 May 2023
DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning
Alexander H. Liu
Heng-Jui Chang
Michael Auli
Wei-Ning Hsu
James R. Glass
22
24
0
17 May 2023
Looking Similar, Sounding Different: Leveraging Counterfactual Cross-Modal Pairs for Audiovisual Representation Learning
Nikhil Singh
Chih-Wei Wu
Iroro Orife
Mahdi M. Kalayeh
23
2
0
12 Apr 2023
Transformers in Speech Processing: A Survey
S. Latif
Aun Zaidi
Heriberto Cuayáhuitl
Fahad Shamshad
Moazzam Shoukat
Junaid Qadir
42
47
0
21 Mar 2023
Phone and speaker spatial organization in self-supervised speech representations
Pablo Riera
M. Cerdeiro
L. Pepino
Luciana Ferrer
SSL
16
1
0
24 Feb 2023
Context-aware Fine-tuning of Self-supervised Speech Models
Suwon Shon
Felix Wu
Kwangyoun Kim
Prashant Sridhar
Karen Livescu
Shinji Watanabe
25
7
0
16 Dec 2022
Deep neural network techniques for monaural speech enhancement: state of the art analysis
P. Ochieng
28
21
0
01 Dec 2022
MelHuBERT: A simplified HuBERT on Mel spectrograms
Tzu-Quan Lin
Hung-yi Lee
Hao Tang
SSL
29
13
0
17 Nov 2022
Improving Children's Speech Recognition by Fine-tuning Self-supervised Adult Speech Representations
Renée Lu
M. Shahin
Beena Ahmed
27
4
0
14 Nov 2022
Improved acoustic-to-articulatory inversion using representations from pretrained self-supervised learning models
Sathvik Udupa
Siddarth C
P. Ghosh
19
7
0
30 Oct 2022
Audio MFCC-gram Transformers for respiratory insufficiency detection in COVID-19
M. Gauy
Marcelo Finger
16
7
0
25 Oct 2022
SUPERB @ SLT 2022: Challenge on Generalization and Efficiency of Self-Supervised Speech Representation Learning
Tzu-hsun Feng
Annie Dong
Ching-Feng Yeh
Shu-Wen Yang
Tzu-Quan Lin
...
Xuankai Chang
Shinji Watanabe
Abdel-rahman Mohamed
Shang-Wen Li
Hung-yi Lee
ELM
SSL
26
33
0
16 Oct 2022
CTCBERT: Advancing Hidden-unit BERT with CTC Objectives
Ruchao Fan
Yiming Wang
Yashesh Gaur
Jinyu Li
33
7
0
16 Oct 2022
Learning Invariant Representation and Risk Minimized for Unsupervised Accent Domain Adaptation
Chendong Zhao
Jianzong Wang
Xiaoyang Qu
Haoqian Wang
Jing Xiao
SSL
30
1
0
15 Oct 2022
JOIST: A Joint Speech and Text Streaming Model For ASR
Tara N. Sainath
Rohit Prabhavalkar
Ankur Bapna
Yu Zhang
Zhouyuan Huo
Zhehuai Chen
Bo-wen Li
Weiran Wang
Trevor Strohman
RALM
AuLLM
38
35
0
13 Oct 2022
On the Utility of Self-supervised Models for Prosody-related Tasks
Guan-Ting Lin
Chiyu Feng
Wei-Ping Huang
Yuan Tseng
Tzu-Han Lin
Chen An Li
Hung-yi Lee
Nigel G. Ward
21
47
0
13 Oct 2022
Masked Autoencoders that Listen
Po-Yao (Bernie) Huang
Hu Xu
Juncheng Billy Li
Alexei Baevski
Michael Auli
Wojciech Galuba
Florian Metze
Christoph Feichtenhofer
13
268
0
13 Jul 2022
Investigation of Ensemble features of Self-Supervised Pretrained Models for Automatic Speech Recognition
Anjana Arunkumar
Vrunda N. Sukhadia
S. Umesh
25
10
0
11 Jun 2022
Joint Encoder-Decoder Self-Supervised Pre-training for ASR
Arunkumar A
S. Umesh
SSL
32
8
0
09 Jun 2022
Contrastive Siamese Network for Semi-supervised Speech Recognition
S. Khorram
Jaeyoung Kim
Anshuman Tripathi
Han Lu
Qian Zhang
Hasim Sak
SSL
8
11
0
27 May 2022
Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
128
349
0
21 May 2022
Silence is Sweeter Than Speech: Self-Supervised Model Using Silence to Store Speaker Information
Chiyu Feng
Po-Chun Hsu
Hung-yi Lee
SSL
20
8
0
08 May 2022
Masked Spectrogram Modeling using Masked Autoencoders for Learning General-purpose Audio Representation
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
N. Harada
K. Kashino
29
65
0
26 Apr 2022
ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers
Kaizhi Qian
Yang Zhang
Heting Gao
Junrui Ni
Cheng-I Jeff Lai
David D. Cox
M. Hasegawa-Johnson
Shiyu Chang
DRL
19
110
0
20 Apr 2022
BYOL for Audio: Exploring Pre-trained General-purpose Audio Representations
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
N. Harada
K. Kashino
SSL
34
53
0
15 Apr 2022
Automatic Pronunciation Assessment using Self-Supervised Speech Representation Learning
Eesung Kim
J. Jeon
Hyeji Seo
Ho-Young Kim
SSL
21
37
0
08 Apr 2022
Federated Self-supervised Speech Representations: Are We There Yet?
Yan Gao
Javier Fernandez-Marques
Titouan Parcollet
Abhinav Mehrotra
Nicholas D. Lane
22
13
0
06 Apr 2022
PADA: Pruning Assisted Domain Adaptation for Self-Supervised Speech Representations
L. D. Prasad
Sreyan Ghosh
S. Umesh
19
12
0
31 Mar 2022
Disentangleing Content and Fine-grained Prosody Information via Hybrid ASR Bottleneck Features for Voice Conversion
Xintao Zhao
Feng Liu
Changhe Song
Zhiyong Wu
Shiyin Kang
Deyi Tuo
H. Meng
11
20
0
24 Mar 2022
Enhancing Speech Recognition Decoding via Layer Aggregation
Tomer Wullach
Shlomo E. Chazan
24
1
0
21 Mar 2022
Language Adaptive Cross-lingual Speech Representation Learning with Sparse Sharing Sub-networks
Yizhou Lu
Mingkun Huang
Xinghua Qu
Pengfei Wei
Zejun Ma
21
19
0
09 Mar 2022
Audio Self-supervised Learning: A Survey
Shuo Liu
Adria Mallol-Ragolta
Emilia Parada-Cabeleiro
Kun Qian
Xingshuo Jing
Alexander Kathan
Bin Hu
Bjoern W. Schuller
SSL
35
106
0
02 Mar 2022
A Brief Overview of Unsupervised Neural Speech Representation Learning
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
Lars Maaløe
Christian Igel
BDL
AI4TS
SSL
19
11
0
01 Mar 2022
Assessing the State of Self-Supervised Human Activity Recognition using Wearables
H. Haresamudram
Irfan Essa
Thomas Plötz
SSL
34
85
0
22 Feb 2022
Improving Automatic Speech Recognition for Non-Native English with Transfer Learning and Language Model Decoding
Peter Sullivan
Toshiko Shibano
Muhammad Abdul-Mageed
34
11
0
10 Feb 2022
Speaker Normalization for Self-supervised Speech Emotion Recognition
Itai Gat
Hagai Aronowitz
Weizhong Zhu
E. Morais
R. Hoory
25
50
0
02 Feb 2022
SPIRAL: Self-supervised Perturbation-Invariant Representation Learning for Speech Pre-Training
Wenyong Huang
Zhenhe Zhang
Y. Yeung
Xin Jiang
Qun Liu
25
23
0
25 Jan 2022
Attribute Inference Attack of Speech Emotion Recognition in Federated Learning Settings
Tiantian Feng
H. Hashemi
Rajat Hebbar
M. Annavaram
Shrikanth S. Narayanan
13
25
0
26 Dec 2021
Self-Supervised Learning for speech recognition with Intermediate layer supervision
Chengyi Wang
Yu-Huan Wu
Sanyuan Chen
Shujie Liu
Jinyu Li
Yao Qian
Zhenglu Yang
SSL
16
28
0
16 Dec 2021
1
2
Next