1904.03416
Cited By
Learning Problem-agnostic Speech Representations from Multiple Self-supervised Tasks
6 April 2019
Santiago Pascual
Mirco Ravanelli
Joan Serrà
Antonio Bonafonte
Yoshua Bengio
SSL
Papers citing
"Learning Problem-agnostic Speech Representations from Multiple Self-supervised Tasks"
50 / 147 papers shown
Title
COCOLA: Coherence-Oriented Contrastive Learning of Musical Audio Representations
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Ruben Ciranni
Emilian Postolache
Giorgio Mariani
Michele Mancusi
Giorgio Fabbro
Emanuele Rodolà
Luca Cosmo
634
16
0
10 Jan 2025
AfriHuBERT: A self-supervised speech representation model for African languages
Jesujoba Oluwadara Alabi
Xuechen Liu
Dietrich Klakow
Junichi Yamagishi
VLM
427
11
0
30 Sep 2024
Towards the Next Frontier in Speech Representation Learning Using Disentanglement
Varun Krishna
Sriram Ganapathy
SSL
258
2
0
02 Jul 2024
mHuBERT-147: A Compact Multilingual HuBERT Model
Marcely Zanon Boito
Vivek Iyer
Nikolaos Lagos
Laurent Besacier
Ioan Calapodescu
VLM
422
57
0
10 Jun 2024
A Dataset and Baselines for Measuring and Predicting the Music Piece Memorability
Li-Yang Tseng
Tzu-Ling Lin
Hong-Han Shuai
Jen-Wei Huang
Wen-Whei Chang
133
1
0
21 May 2024
Benchmarking Representations for Speech, Music, and Acoustic Events
Moreno La Quatra
Alkis Koudounas
Lorenzo Vaiani
Elena Baralis
Luca Cagliero
Paolo Garza
Sabato Marco Siniscalchi
163
28
0
02 May 2024
LongVQ: Long Sequence Modeling with Vector Quantization on Structured Memory
Zicheng Liu
Li Wang
Siyuan Li
Zedong Wang
Haitao Lin
Stan Z. Li
VLM
227
5
0
17 Apr 2024
Multi-Stage Multi-Modal Pre-Training for Automatic Speech Recognition
Yash Jain
David M. Chan
Pranav Dheram
Aparna Khare
Olabanji Shonibare
Venkatesh Ravichandran
Shalini Ghosh
222
2
0
28 Mar 2024
Exploring the Task-agnostic Trait of Self-supervised Learning in the Context of Detecting Mental Disorders
Rohan kumar Gupta
Rohit Sinha
237
0
0
22 Mar 2024
Reading Between the Frames: Multi-Modal Depression Detection in Videos from Non-Verbal Cues
David Gimeno-Gómez
Ana-Maria Bucur
Adrian Cosma
Carlos David Martínez Hinarejos
Paolo Rosso
207
24
0
05 Jan 2024
CORN: Co-Trained Full- And No-Reference Speech Quality Assessment
Pranay Manocha
Donald Williamson
Adam Finkelstein
251
3
0
13 Oct 2023
Self-Supervised Learning for Audio-Based Emotion Recognition
Peranut Nimitsurachat
Peter Washington
194
3
0
23 Jul 2023
Large-scale unsupervised audio pre-training for video-to-speech synthesis
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Triantafyllos Kefalas
Yannis Panagakis
Maja Pantic
VGen
215
5
0
27 Jun 2023
Self-supervised Audio Teacher-Student Transformer for Both Clip-level and Frame-level Tasks
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Xian Li
Nian Shao
Xiaofei Li
ViT
CLIP
348
44
0
07 Jun 2023
Improved Cross-Lingual Transfer Learning For Automatic Speech Translation
Sameer Khurana
Nauman Dawalatabad
Antoine Laurent
Luis Vicente
Pablo Gimeno
Victoria Mingote
James R. Glass
VLM
335
1
0
01 Jun 2023
MiniSUPERB: Lightweight Benchmark for Self-supervised Speech Models
Automatic Speech Recognition & Understanding (ASRU), 2023
Yu-Hsiang Wang
Huan Chen
Kai-Wei Chang
Winston H. Hsu
Hung-yi Lee
418
7
0
30 May 2023
Self-supervised Neural Factor Analysis for Disentangling Utterance-level Speech Representations
International Conference on Machine Learning (ICML), 2023
Wei-wei Lin
Chenhang He
Man-Wai Mak
Youzhi Tu
165
6
0
14 May 2023
A Complete Survey on Generative AI (AIGC): Is ChatGPT from GPT-4 to GPT-5 All You Need?
Chaoning Zhang
Chenshuang Zhang
Sheng Zheng
Yu Qiao
Chenghao Li
...
Lik-Hang Lee
Yang Yang
Heng Tao Shen
In So Kweon
Choong Seon Hong
303
199
0
21 Mar 2023
Progressive Multi-Scale Self-Supervised Learning for Speech Recognition
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2022
Genshun Wan
Tan Liu
Hang Chen
Jia Pan
Cong Liu
Z. Ye
SSL
137
0
0
07 Dec 2022
MT4SSL: Boosting Self-Supervised Speech Representation Learning by Integrating Multiple Targets
Interspeech (Interspeech), 2022
Ziyang Ma
Zhisheng Zheng
Changli Tang
Yujin Wang
Xie Chen
302
21
0
14 Nov 2022
Biased Self-supervised learning for ASR
Interspeech (Interspeech), 2022
Florian Kreyssig
Yangyang Shi
Jinxi Guo
Leda Sari
Abdel-rahman Mohamed
P. Woodland
SSL
147
4
0
04 Nov 2022
Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Speech Processing
Neural Information Processing Systems (NeurIPS), 2022
Yonggan Fu
Yang Zhang
Kaizhi Qian
Zhifan Ye
Zhongzhi Yu
Cheng-I Jeff Lai
Yingyan Lin
332
10
0
02 Nov 2022
Understanding Acoustic Patterns of Human Teachers Demonstrating Manipulation Tasks to Robots
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2022
Akanksha Saran
K. Desai
M. L. Chang
Rudolf Lioutikov
A. Thomaz
S. Niekum
138
4
0
01 Nov 2022
On the Role of Visual Context in Enriching Music Representations
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Kleanthis Avramidis
Shanti Stewart
Shrikanth Narayanan
155
4
0
28 Oct 2022
SUPERB @ SLT 2022: Challenge on Generalization and Efficiency of Self-Supervised Speech Representation Learning
Spoken Language Technology Workshop (SLT), 2022
Tzu-hsun Feng
Annie Dong
Ching-Feng Yeh
Shu-Wen Yang
Tzu-Quan Lin
...
Xuankai Chang
Shinji Watanabe
Abdel-rahman Mohamed
Shang-Wen Li
Hung-yi Lee
ELM
SSL
236
38
0
16 Oct 2022
Individualized Conditioning and Negative Distances for Speaker Separation
International Conference on Machine Learning and Applications (ICMLA), 2022
Tao Sun
Nidal Abuhajar
Shuyu Gong
Zhewei Wang
Charles D. Smith
Xianhui Wang
Li Xu
Jundong Liu
VLM
151
1
0
12 Oct 2022
The Efficacy of Self-Supervised Speech Models for Audio Representations
Tung-Yu Wu
Chen-An Li
Tzu-Han Lin
Tsung-Yuan Hsu
Hung-yi Lee
231
6
0
26 Sep 2022
Transfer Learning of wav2vec 2.0 for Automatic Lyric Transcription
International Society for Music Information Retrieval Conference (ISMIR), 2022
Longshen Ou
Xiangming Gu
Ye Wang
165
24
0
20 Jul 2022
Deep versus Wide: An Analysis of Student Architectures for Task-Agnostic Knowledge Distillation of Self-Supervised Speech Models
Interspeech (Interspeech), 2022
Takanori Ashihara
Takafumi Moriya
Kohei Matsuura
Tomohiro Tanaka
163
34
0
14 Jul 2022
Towards Proper Contrastive Self-supervised Learning Strategies For Music Audio Representation
IEEE International Conference on Multimedia and Expo (ICME), 2022
Jeong-Eun Choi
Seongwon Jang
Hyunsouk Cho
Sehee Chung
SSL
151
11
0
10 Jul 2022
Glow-WaveGAN 2: High-quality Zero-shot Text-to-speech Synthesis and Any-to-any Voice Conversion
Interspeech (Interspeech), 2022
Yinjiao Lei
Shan Yang
Jian Cong
Linfu Xie
Jane Polak Scowcroft
DiffM
178
14
0
05 Jul 2022
Wav2Vec-Aug: Improved self-supervised training with limited data
Interspeech (Interspeech), 2022
Anuroop Sriram
Michael Auli
Alexei Baevski
SSL
VLM
169
16
0
27 Jun 2022
Interpretable Acoustic Representation Learning on Breathing and Speech Signals for COVID-19 Detection
Interspeech (Interspeech), 2022
Debottam Dutta
Debarpan Bhattacharya
Sriram Ganapathy
A. H. Poorjam
Deepak Mittal
M. Singh
87
1
0
27 Jun 2022
Boosting Cross-Domain Speech Recognition with Self-Supervision
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Hanjing Zhu
Gaofeng Cheng
Yongfeng Zhang
Wenxin Hou
Pengyuan Zhang
Yonghong Yan
342
22
0
20 Jun 2022
Beyond Just Vision: A Review on Self-Supervised Representation Learning on Multimodal and Temporal Data
Shohreh Deldari
Hao Xue
Aaqib Saeed
Jiayuan He
Daniel V. Smith
Flora D. Salim
AI4TS
227
43
0
06 Jun 2022
A Multimodal Corpus for Emotion Recognition in Sarcasm
International Conference on Language Resources and Evaluation (LREC), 2022
Anupama Ray
Shubham Mishra
Apoorva Nunna
P. Bhattacharyya
143
60
0
05 Jun 2022
Self-Supervised Speech Representation Learning: A Review
IEEE Journal on Selected Topics in Signal Processing (IEEE JSTSP), 2022
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
634
441
0
21 May 2022
SAMU-XLSR: Semantically-Aligned Multimodal Utterance-level Cross-Lingual Speech Representation
IEEE Journal on Selected Topics in Signal Processing (IEEE JSTSP), 2022
Sameer Khurana
Antoine Laurent
James R. Glass
157
43
0
17 May 2022
Sound Localization by Self-Supervised Time Delay Estimation
European Conference on Computer Vision (ECCV), 2022
Ziyang Chen
David Fouhey
Andrew Owens
SSL
239
23
0
26 Apr 2022
Cross-stitched Multi-modal Encoders
Karan Singla
Daniel Pressel
Ryan Price
Bhargav Srinivas Chinnari
Yeon-Jun Kim
S. Bangalore
147
0
0
20 Apr 2022
ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers
International Conference on Machine Learning (ICML), 2022
Kaizhi Qian
Yang Zhang
Heting Gao
Junrui Ni
Cheng-I Jeff Lai
David D. Cox
M. Hasegawa-Johnson
Shiyu Chang
DRL
181
141
0
20 Apr 2022
Self Supervised Adversarial Domain Adaptation for Cross-Corpus and Cross-Language Speech Emotion Recognition
IEEE Transactions on Affective Computing (IEEE TAC), 2022
S. Latif
R. Rana
Sara Khalifa
Raja Jurdak
Björn Schuller
184
64
0
19 Apr 2022
On the pragmatism of using binary classifiers over data intensive neural network classifiers for detection of COVID-19 from voice
Ankit Parag Shah
Hira Dhamyal
Yang Gao
Daniel Arancibia
Mario Arancibia
Bhiksha Raj
Rita Singh
156
6
0
11 Apr 2022
Federated Self-supervised Speech Representations: Are We There Yet?
Interspeech (Interspeech), 2022
Yan Gao
Javier Fernandez-Marques
Titouan Parcollet
Abhinav Mehrotra
Nicholas D. Lane
151
14
0
06 Apr 2022
Federated Self-Supervised Learning for Acoustic Event Classification
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Meng Feng
Chieh-Chi Kao
Qingming Tang
Ming Sun
Viktor Rozgic
Spyros Matsoukas
Chao Wang
158
14
0
22 Mar 2022
HEAR: Holistic Evaluation of Audio Representations
Neural Information Processing Systems (NeurIPS), 2022
Joseph P. Turian
Jordie Shier
H. Khan
Bhiksha Raj
Björn W. Schuller
...
P. Esling
Pranay Manocha
Shinji Watanabe
Zeyu Jin
Yonatan Bisk
375
133
0
06 Mar 2022
Audio Self-supervised Learning: A Survey
Patterns (Patterns), 2022
Shuo Liu
Adria Mallol-Ragolta
Emilia Parada-Cabaleiro
Kun Qian
Xingshuo Jing
Alexander Kathan
Bin Hu
Bjoern W. Schuller
SSL
234
128
0
02 Mar 2022
A Brief Overview of Unsupervised Neural Speech Representation Learning
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
Lars Maaløe
Christian Igel
BDL
AI4TS
SSL
227
13
0
01 Mar 2022
Visual Speech Recognition for Multiple Languages in the Wild
Nature Machine Intelligence (Nat. Mach. Intell.), 2022
Pingchuan Ma
Stavros Petridis
Maja Pantic
VLM
389
191
0
26 Feb 2022
CALM: Contrastive Aligned Audio-Language Multirate and Multimodal Representations
Vin Sachidananda
Shao-Yen Tseng
Erik Marchi
S. Kajarekar
P. Georgiou
134
9
0
08 Feb 2022