2001.09239
Multi-task self-supervised learning for Robust Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
25 January 2020
Mirco Ravanelli
Jianyuan Zhong
Santiago Pascual
P. Swietojanski
João Monteiro
J. Trmal
Yoshua Bengio
SSL
Papers citing "Multi-task self-supervised learning for Robust Speech Recognition"
50 / 167 papers shown
Noisy Disentanglement with Tri-stage Training for Noise-Robust Speech Recognition
Shuangyuan Chen
Shuang Wei
Dongxing Xu
Yanhua Long
197
0
0
01 Sep 2025
Model Unmerging: Making Your Models Unmergeable for Secure Model Sharing
Zihao Wang
Enneng Yang
L. Yin
Shiwei Liu
Li Shen
FedML
MoMe
197
1
0
01 Sep 2025
Audio-Visual Representation Learning via Knowledge Distillation from Speech Foundation Models
Pattern Recognition (Pattern Recogn.), 2025
Jing-Xuan Zhang
Genshun Wan
Jianqing Gao
Zhen-Hua Ling
349
13
0
09 Feb 2025
LLM supervised Pre-training for Multimodal Emotion Recognition in Conversations
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Soumya Dutta
Sriram Ganapathy
372
21
0
20 Jan 2025
SurgeryV2: Bridging the Gap Between Model Merging and Multi-Task Learning with Deep Representation Surgery
Enneng Yang
Li Shen
Zhenyi Wang
G. Guo
Xingwei Wang
Xiaocun Cao
Jie Zhang
Dacheng Tao
MoMe
280
11
0
18 Oct 2024
Audio Explanation Synthesis with Generative Foundation Models
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Alican Akman
Qiyang Sun
Björn W. Schuller
299
2
0
10 Oct 2024
A Joint Spectro-Temporal Relational Thinking Based Acoustic Modeling Framework
Zheng Nan
T. Dang
V. Sethu
Beena Ahmed
180
0
0
17 Sep 2024
Temporal-Channel Modeling in Multi-head Self-Attention for Synthetic Speech Detection
Duc-Tuan Truong
Ruijie Tao
Tuan Nguyen
Hieu-Thi Luong
Kong Aik Lee
Eng Siong Chng
301
44
0
25 Jun 2024
mHuBERT-147: A Compact Multilingual HuBERT Model
Marcely Zanon Boito
Vivek Iyer
Nikolaos Lagos
Laurent Besacier
Ioan Calapodescu
VLM
574
70
0
10 Jun 2024
A Dataset and Baselines for Measuring and Predicting the Music Piece Memorability
Li-Yang Tseng
Tzu-Ling Lin
Hong-Han Shuai
Jen-Wei Huang
Wen-Whei Chang
191
1
0
21 May 2024
LLAniMAtion: LLAMA Driven Gesture Animation
John T. Windle
Iain Matthews
Sarah Taylor
290
1
0
13 May 2024
A Large-Scale Evaluation of Speech Foundation Models
Shu-Wen Yang
Heng-Jui Chang
Zili Huang
Andy T. Liu
Cheng-I Jeff Lai
...
Kushal Lakhotia
Shang-Wen Li
Abdelrahman Mohamed
Shinji Watanabe
Hung-yi Lee
319
62
0
15 Apr 2024
BRAVEn: Improving Self-Supervised Pre-training for Visual and Auditory Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
A. Haliassos
Andreas Zinonos
Rodrigo Mira
Stavros Petridis
Maja Pantic
VLM
SSL
AI4TS
339
26
0
02 Apr 2024
SKILL: Similarity-aware Knowledge distILLation for Speech Self-Supervised Learning
Luca Zampierin
G. B. Hacene
Bac Nguyen
Mirco Ravanelli
325
4
0
26 Feb 2024
AnnoTheia: A Semi-Automatic Annotation Toolkit for Audio-Visual Speech Technologies
José-M. Acosta-Triana
David Gimeno-Gómez
Carlos David Martínez Hinarejos
VLM
VGen
328
4
0
20 Feb 2024
On the Transferability of Large-Scale Self-Supervision to Few-Shot Audio Classification
Calum Heggan
S. Budgett
Timothy M. Hospedales
Mehrdad Yaghoobi
SSL
356
3
0
02 Feb 2024
Reading Between the Frames: Multi-Modal Depression Detection in Videos from Non-Verbal Cues
David Gimeno-Gómez
Ana-Maria Bucur
Adrian Cosma
Carlos David Martínez Hinarejos
Paolo Rosso
259
28
0
05 Jan 2024
FAT-HuBERT: Front-end Adaptive Training of Hidden-unit BERT for Distortion-Invariant Robust Speech Recognition
Automatic Speech Recognition & Understanding (ASRU), 2023
Dongning Yang
Wei Wang
Yanmin Qian
353
7
0
29 Nov 2023
A Quantitative Approach to Understand Self-Supervised Models as Cross-lingual Feature Extractors
International Conference on Natural Language and Speech Processing (ICNLSP), 2023
Shuyue Stella Li
Beining Xu
Xiangyu Zhang
Hexin Liu
Wen-Han Chao
Leibny Paola García
SSL
222
5
0
27 Nov 2023
Multi-objective Non-intrusive Hearing-aid Speech Assessment Model
Hsin-Tien Chiang
Szu-Wei Fu
Hsin-Min Wang
Yu Tsao
John H. L. Hansen
275
8
0
15 Nov 2023
Emphasized Non-Target Speaker Knowledge in Knowledge Distillation for Automatic Speaker Verification
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Duc-Tuan Truong
Ruijie Tao
J. Yip
Kong Aik Lee
Chng Eng Siong
267
13
0
26 Sep 2023
LeBenchmark 2.0: a Standardized, Replicable and Enhanced Framework for Self-supervised Representations of French Speech
Computer Speech and Language (CSL), 2023
Titouan Parcollet
H. Nguyen
Solène Evain
Marcely Zanon Boito
Adrien Pupier
...
François Portet
Solange Rossato
Fabien Ringeval
D. Schwab
Laurent Besacier
301
31
0
11 Sep 2023
The Quest of Finding the Antidote to Sparse Double Descent
Victor Quétu
Marta Milovanović
353
0
0
31 Aug 2023
Rep2wav: Noise Robust text-to-speech Using self-supervised representations
Qiu-shi Zhu
Yunting Gu
Rilin Chen
Chao Weng
Yuchen Hu
Lirong Dai
Jie Zhang
AI4TS
264
3
0
28 Aug 2023
Speech Self-Supervised Representations Benchmarking: a Case for Larger Probing Heads
Computer Speech and Language (CSL), 2023
Salah Zaiem
Youcef Kemiche
Titouan Parcollet
S. Essid
Mirco Ravanelli
SSL
277
20
0
28 Aug 2023
Lip2Vec: Efficient and Robust Visual Speech Recognition via Latent-to-Latent Visual to Audio Representation Mapping
IEEE International Conference on Computer Vision (ICCV), 2023
Y. A. D. Djilali
Sanath Narayan
Haithem Boussaid
Ebtesam Almazrouei
Merouane Debbah
245
16
0
11 Aug 2023
Representation Learning With Hidden Unit Clustering For Low Resource Speech Applications
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Varun Krishna
T. Sai
Sriram Ganapathy
SSL
189
3
0
14 Jul 2023
On the Effectiveness of Speech Self-supervised Learning for Music
International Society for Music Information Retrieval Conference (ISMIR), 2023
Yi Ma
Ruibin Yuan
Yi Zhou
Ge Zhang
Xingran Chen
...
Ruibo Liu
Gus Xia
Roger Dannenberg
Yi-Ting Guo
Jie Fu
194
14
0
11 Jul 2023
Factorised Speaker-environment Adaptive Training of Conformer Speech Recognition Systems
Interspeech (Interspeech), 2023
Jiajun Deng
Guinan Li
Xurong Xie
Zengrui Jin
Mingyu Cui
Tianzi Wang
Shujie Hu
Mengzhe Geng
Xunying Liu
BDL
253
2
0
26 Jun 2023
Feature Normalization for Fine-tuning Self-Supervised Models in Speech Enhancement
Interspeech (Interspeech), 2023
Hejung Yang
Hong-Goo Kang
SSL
221
1
0
14 Jun 2023
Automatic Data Augmentation for Domain Adapted Fine-Tuning of Self-Supervised Speech Representations
Interspeech (Interspeech), 2023
Salah Zaiem
Titouan Parcollet
S. Essid
225
2
0
01 Jun 2023
How to Estimate Model Transferability of Pre-Trained Speech Models?
Interspeech (Interspeech), 2023
Zih-Ching Chen
Chao-Han Huck Yang
Yue Liu
Yu Zhang
Nanxin Chen
Shoufeng Chang
Rohit Prabhavalkar
Hung-yi Lee
Tara N. Sainath
503
11
0
01 Jun 2023
MT-SLVR: Multi-Task Self-Supervised Learning for Transformation In(Variant) Representations
Interspeech (Interspeech), 2023
Calum Heggan
Timothy M. Hospedales
S. Budgett
Mehrdad Yaghoobi
SSL
383
7
0
29 May 2023
Weakly-Supervised Speech Pre-training: A Case Study on Target Speech Recognition
Interspeech (Interspeech), 2023
Wangyou Zhang
Y. Qian
286
12
0
25 May 2023
On the Efficacy and Noise-Robustness of Jointly Learned Speech Emotion and Automatic Speech Recognition
Interspeech (Interspeech), 2023
L. Bansal
S. P. Dubagunta
Malolan Chetlur
Pushpak Jagtap
A. Ganapathiraju
268
1
0
21 May 2023
Self-supervised Neural Factor Analysis for Disentangling Utterance-level Speech Representations
International Conference on Machine Learning (ICML), 2023
Wei-wei Lin
Chenhang He
Man-Wai Mak
Youzhi Tu
221
6
0
14 May 2023
Continual Learning of Hand Gestures for Human-Robot Interaction
Xavier Cucurull
A. Garrell
175
3
0
13 Apr 2023
Looking Similar, Sounding Different: Leveraging Counterfactual Cross-Modal Pairs for Audiovisual Representation Learning
Computer Vision and Pattern Recognition (CVPR), 2023
Nikhil Singh
Chih-Wei Wu
Iroro Orife
Mahdi M. Kalayeh
447
3
0
12 Apr 2023
Nonlinear Independent Component Analysis for Principled Disentanglement in Unsupervised Deep Learning
Patterns (Patterns), 2023
Aapo Hyvarinen
Ilyes Khemakhem
H. Morioka
CML
OOD
372
63
0
29 Mar 2023
Evaluating gesture generation in a large-scale open challenge: The GENEA Challenge 2022
ACM Transactions on Graphics (TOG), 2023
Taras Kucherenko
Pieter Wolfert
Youngwoo Yoon
Carla Viegas
Teodor Nikolov
Mihail Tsakov
G. Henter
219
36
0
15 Mar 2023
Fine-tuning Strategies for Faster Inference using Speech Self-Supervised Models: A Comparative Study
Salah Zaiem
Robin Algayres
Titouan Parcollet
S. Essid
Mirco Ravanelli
325
20
0
12 Mar 2023
Multi-Task Self-Supervised Time-Series Representation Learning
Information Sciences (Inf. Sci.), 2023
Heejeong Choi
Pilsung Kang
AI4TS
SSL
311
21
0
02 Mar 2023
Can we avoid Double Descent in Deep Neural Networks?
IEEE International Conference on Image Processing (ICIP), 2023
Victor Quétu
Enzo Tartaglione
AI4CE
340
2
0
26 Feb 2023
Jointly Learning Visual and Auditory Speech Representations from Raw Data
International Conference on Learning Representations (ICLR), 2022
A. Haliassos
Pingchuan Ma
Rodrigo Mira
Stavros Petridis
Maja Pantic
SSL
331
73
0
12 Dec 2022
An Overview of Indian Spoken Language Recognition from Machine Learning Perspective
Spandan Dey
Md. Sahidullah
G. Saha
230
33
0
30 Nov 2022
MT4SSL: Boosting Self-Supervised Speech Representation Learning by Integrating Multiple Targets
Interspeech (Interspeech), 2022
Ziyang Ma
Zhisheng Zheng
Changli Tang
Yujin Wang
Xie Chen
342
21
0
14 Nov 2022
Biased Self-supervised learning for ASR
Interspeech (Interspeech), 2022
Florian Kreyssig
Yangyang Shi
Jinxi Guo
Leda Sari
Abdel-rahman Mohamed
P. Woodland
SSL
216
4
0
04 Nov 2022
Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Speech Processing
Neural Information Processing Systems (NeurIPS), 2022
Yonggan Fu
Yang Zhang
Kaizhi Qian
Zhifan Ye
Zhongzhi Yu
Cheng-I Jeff Lai
Yingyan Lin
460
10
0
02 Nov 2022
Improved acoustic-to-articulatory inversion using representations from pretrained self-supervised learning models
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Sathvik Udupa
Siddarth C
P. Ghosh
231
11
0
30 Oct 2022
Robust Data2vec: Noise-robust Speech Representation Learning for ASR by Combining Regression and Improved Contrastive Learning
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Qiu-shi Zhu
Long Zhou
Jie Zhang
Shujie Liu
Yu-Chen Hu
Lirong Dai
VLM
SSL
214
44
0
27 Oct 2022