Learning Problem-agnostic Speech Representations from Multiple Self-supervised Tasks

6 April 2019
Santiago Pascual
Mirco Ravanelli
Joan Serrà
Antonio Bonafonte
Yoshua Bengio
    SSL

Papers citing "Learning Problem-agnostic Speech Representations from Multiple Self-supervised Tasks"

47 / 147 papers shown
The effectiveness of unsupervised subword modeling with autoregressive and cross-lingual phone-aware networks
IEEE Open Journal of Signal Processing (JOSP), 2020
Siyuan Feng
O. Scharenborg
SSL
206
3
0
17 Dec 2020
Self-Supervised Time Series Representation Learning by Inter-Intra Relational Reasoning
Haoyi Fan
Fengbin Zhang
Yue Gao
AI4TS
172
20
0
27 Nov 2020
Towards Semi-Supervised Semantics Understanding from Speech
Cheng-I Jeff Lai
Jin Cao
S. Bodapati
Shang-Wen Li
SSL
184
7
0
11 Nov 2020
Non-Autoregressive Predictive Coding for Learning Speech Representations from Local Dependencies
Interspeech, 2020
Alexander H. Liu
Yu-An Chung
James R. Glass
SSL
207
93
0
01 Nov 2020
Interpretable Representation Learning for Speech and Audio Signals Based on Relevance Weighting
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2020
Purvi Agrawal
Sriram Ganapathy
147
21
0
29 Oct 2020
Robust Raw Waveform Speech Recognition Using Relevance Weighted Representations
Interspeech, 2020
Purvi Agrawal
Sriram Ganapathy
106
2
0
29 Oct 2020
Speech SIMCLR: Combining Contrastive and Reconstruction Objective for Self-supervised Speech Representation Learning
Interspeech, 2020
Dongwei Jiang
Wubo Li
Miao Cao
Wei Zou
Xiangang Li
SSL
296
73
0
27 Oct 2020
Probing Acoustic Representations for Phonetic Properties
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
Danni Ma
Neville Ryant
M. Liberman
337
52
0
25 Oct 2020
Similarity Analysis of Self-Supervised Speech Representations
Yu-An Chung
Yonatan Belinkov
James R. Glass
SSL
340
44
0
22 Oct 2020
Contrastive Learning of General-Purpose Audio Representations
Aaqib Saeed
David Grangier
Neil Zeghidour
VLM, SSL
253
311
0
21 Oct 2020
FastVC: Fast Voice Conversion with non-parallel data
Oriol Barbany
Milos Cernak
130
7
0
08 Oct 2020
Representation Learning for Sequence Data with Deep Autoencoding Predictive Components
Junwen Bai
Weiran Wang
Yingbo Zhou
Caiming Xiong
SSL, AI4TS
195
12
0
07 Oct 2020
SESQA: semi-supervised learning for speech quality assessment
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
Joan Serrà
Jordi Pons
Santiago Pascual
231
48
0
01 Oct 2020
Detecting Parkinson's Disease From an Online Speech-task
Wasifur Rahman
Sangwu Lee
Md. Saiful Islam
Victor Nikhil Antony
Harshil Ratnu
...
Ellen Wagner
Stella Jensen-Roberts
M. R. Ali
Ray Dorsey
E. Hoque
141
0
0
02 Sep 2020
Multi-Task Learning for Interpretable Weakly Labelled Sound Event Detection
Soham Deshmukh
Bhiksha Raj
Rita Singh
119
8
0
17 Aug 2020
Jointly Fine-Tuning "BERT-like" Self Supervised Models to Improve Multimodal Speech Emotion Recognition
Shamane Siriwardhana
Andrew Reis
Rivindu Weerasekera
Suranga Nanayakkara
201
116
0
15 Aug 2020
Adaptation Algorithms for Neural Network-Based Speech Recognition: An Overview
P. Bell
Joachim Fainberg
Ondřej Klejch
Jinyu Li
Steve Renals
P. Swietojanski
316
82
0
14 Aug 2020
Pretraining Techniques for Sequence-to-Sequence Voice Conversion
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2020
Wen-Chin Huang
Tomoki Hayashi
Yi-Chiao Wu
Hirokazu Kameoka
Tomoki Toda
320
46
0
07 Aug 2020
TERA: Self-Supervised Learning of Transformer Encoder Representation for Speech
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2020
Andy T. Liu
Shang-Wen Li
Hung-yi Lee
SSL
532
393
0
12 Jul 2020
Learning Speech Representations from Raw Audio by Joint Audiovisual Self-Supervision
Abhinav Shukla
Stavros Petridis
Maja Pantic
SSL
141
17
0
08 Jul 2020
Self-supervised Learning for Speech Enhancement
Yuchun Wang
Shrikant Venkataramani
Paris Smaragdis
SSL
160
34
0
18 Jun 2020
Input-independent Attention Weights Are Expressive Enough: A Study of Attention in Self-supervised Audio Transformers
Tsung-Han Wu
Chun-Chen Hsieh
Yen-Hao Chen
Po-Han Chi
Hung-yi Lee
221
1
0
09 Jun 2020
Self-Supervised Dynamic Networks for Covariate Shift Robustness
Tomer Cohen
Noy Shulman
Hai Morgenstern
Roey Mechrez
Erez Farhan
OOD
182
4
0
06 Jun 2020
CSTNet: Contrastive Speech Translation Network for Self-Supervised Speech Representation Learning
Sameer Khurana
Antoine Laurent
James R. Glass
SSL
180
12
0
04 Jun 2020
A Convolutional Deep Markov Model for Unsupervised Speech Representation Learning
Interspeech, 2020
Sameer Khurana
Antoine Laurent
Wei-Ning Hsu
J. Chorowski
A. Lancucki
R. Marxer
James R. Glass
SSL, BDL
175
29
0
03 Jun 2020
Exploring the Best Loss Function for DNN-Based Low-latency Speech Enhancement with Temporal Convolutional Networks
Yuichiro Koyama
Tyler Vuong
Stefan Uhlich
Bhiksha Raj
246
45
0
23 May 2020
A Further Study of Unsupervised Pre-training for Transformer Based Speech Recognition
Dongwei Jiang
Wubo Li
Ruixiong Zhang
Miao Cao
Ne Luo
Yang Han
Wei Zou
Xiangang Li
SSL
208
31
0
20 May 2020
Vector-Quantized Autoregressive Predictive Coding
Yu-An Chung
Hao Tang
James R. Glass
SSL
186
124
0
17 May 2020
Does Visual Self-Supervision Improve Learning of Speech Representations for Emotion Recognition?
IEEE Transactions on Affective Computing (IEEE TAC), 2020
Abhinav Shukla
Stavros Petridis
Maja Pantic
SSL
417
33
0
04 May 2020
An Early Study on Intelligent Analysis of Speech under COVID-19: Severity, Sleep Quality, Fatigue, and Anxiety
Interspeech, 2020
Jing Han
Kun Qian
Meishu Song
Zijiang Yang
Zhao Ren
...
Tomoya Koike
Xiao Li
Zixing Zhang
Yoshiharu Yamamoto
Björn W. Schuller
259
102
0
30 Apr 2020
From Inference to Generation: End-to-end Fully Self-supervised Generation of Human Face from Speech
International Conference on Learning Representations (ICLR), 2020
Hyeong-Seok Choi
Changdae Park
Kyogu Lee
CVBM
128
32
0
13 Apr 2020
Improved Speech Representations with Multi-Target Autoregressive Predictive Coding
Annual Meeting of the Association for Computational Linguistics (ACL), 2020
Yu-An Chung
James R. Glass
SSL
206
57
0
11 Apr 2020
A Comparison of Metric Learning Loss Functions for End-To-End Speaker Verification
International Conference on Statistical Language and Speech Processing (ICSLSP), 2020
Juan Manuel Coria
H. Bredin
Sahar Ghannay
S. Rosset
172
16
0
31 Mar 2020
Deep Neural Networks for Automatic Speech Processing: A Survey from Large Corpora to Limited Data
EURASIP Journal on Audio, Speech, and Music Processing (JEASMP), 2020
Vincent Roger
Jérôme Farinas
J. Pinquier
115
31
0
09 Mar 2020
Towards Learning a Universal Non-Semantic Representation of Speech
Interspeech, 2020
Joel Shor
A. Jansen
Ronnie Maor
Oran Lang
Omry Tuval
Félix de Chaumont Quitry
Marco Tagliasacchi
Ira Shavitt
Dotan Emanuel
Yinnon A. Haviv
SSL
556
166
0
25 Feb 2020
Limitations of weak labels for embedding and tagging
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
Nicolas Turpault
Romain Serizel
Emmanuel Vincent
317
10
0
05 Feb 2020
Unsupervised Pre-training of Bidirectional Speech Encoders via Masked Reconstruction
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
Weiran Wang
Qingming Tang
Karen Livescu
SSL
229
99
0
28 Jan 2020
Multi-task self-supervised learning for Robust Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
Mirco Ravanelli
Jianyuan Zhong
Santiago Pascual
P. Swietojanski
João Monteiro
J. Trmal
Yoshua Bengio
SSL
467
303
0
25 Jan 2020
Visually Guided Self Supervised Learning of Speech Representations
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
Abhinav Shukla
Konstantinos Vougioukas
Pingchuan Ma
Stavros Petridis
Maja Pantic
SSL
166
30
0
13 Jan 2020
Robust Estimation of Hypernasality in Dysarthria with Acoustic Model Likelihood Features
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2019
Michael Stephen Saxon
Ayush Tripathi
Yishan Jiao
J. Liss
Visar Berisha
213
16
0
26 Nov 2019
Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech
International Conference on Learning Representations (ICLR), 2019
David Harwath
Wei-Ning Hsu
James R. Glass
170
88
0
21 Nov 2019
Speaker-invariant Affective Representation Learning via Adversarial Training
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019
Haoqi Li
Ming Tu
Jing-ling Huang
Shrikanth Narayanan
P. Georgiou
346
60
0
04 Nov 2019
Learning audio representations via phase prediction
Félix de Chaumont Quitry
Marco Tagliasacchi
Dominik Roblek
SSL, AI4TS
105
10
0
25 Oct 2019
Generative Pre-Training for Speech with Autoregressive Predictive Coding
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019
Yu-An Chung
James R. Glass
SSL
330
182
0
23 Oct 2019
Improving Transformer-based Speech Recognition Using Unsupervised Pre-training
Dongwei Jiang
Xiaoning Lei
Wubo Li
Ne Luo
Yuxuan Hu
Wei Zou
Xiangang Li
263
105
0
22 Oct 2019
Problem-Agnostic Speech Embeddings for Multi-Speaker Text-to-Speech with SampleRNN
Speech Synthesis Workshop (SSW), 2019
David Álvarez
Santiago Pascual
Antonio Bonafonte
177
12
0
03 Jun 2019
Self-supervised audio representation learning for mobile devices
Marco Tagliasacchi
Beat Gfeller
Félix de Chaumont Quitry
Dominik Roblek
SSL, AI4TS
157
47
0
24 May 2019