ResearchTrend.AI
Audio Self-supervised Learning: A Survey


2 March 2022
Shuo Liu
Adria Mallol-Ragolta
Emilia Parada-Cabaleiro
Kun Qian
Xingshuo Jing
Alexander Kathan
Bin Hu
Bjoern W. Schuller
    SSL

Papers citing "Audio Self-supervised Learning: A Survey"

28 / 78 papers shown
Improved acoustic-to-articulatory inversion using representations from pretrained self-supervised learning models
Sathvik Udupa
Siddarth C
P. Ghosh
11
7
0
30 Oct 2022
Self-supervised Rewiring of Pre-trained Speech Encoders: Towards Faster Fine-tuning with Less Labels in Speech Processing
Haomiao Yang
Jinming Zhao
Gholamreza Haffari
Ehsan Shareghi
17
2
0
24 Oct 2022
Audio Barlow Twins: Self-Supervised Audio Representation Learning
Jonah Anton
H. Coppock
Pancham Shukla
Björn W. Schuller
BDL
SSL
19
8
0
28 Sep 2022
Self-Relation Attention and Temporal Awareness for Emotion Recognition via Vocal Burst
Dang-Linh Trinh
Minh-Cong Vo
Gueesang Lee
14
2
0
15 Sep 2022
Redundancy Reduction Twins Network: A Training framework for Multi-output Emotion Regression
Xin Jing
Meishu Song
Andreas Triantafyllopoulos
Zijiang Yang
Björn W. Schuller
11
8
0
18 Jun 2022
MTI-Net: A Multi-Target Speech Intelligibility Prediction Model
Ryandhimas E. Zezario
Szu-Wei Fu
Fei Chen
C. Fuh
Hsin-Min Wang
Yu Tsao
11
13
0
07 Apr 2022
Probing Speech Emotion Recognition Transformers for Linguistic Knowledge
Andreas Triantafyllopoulos
Johannes Wagner
H. Wierstorf
Maximilian Schmitt
U. Reichel
F. Eyben
Felix Burkhardt
Björn W. Schuller
11
25
0
01 Apr 2022
A Temporal-oriented Broadcast ResNet for COVID-19 Detection
Xin Jing
Shuo Liu
Emilia Parada-Cabaleiro
Andreas Triantafyllopoulos
Meishu Song
Zijiang Yang
Björn W. Schuller
33
2
0
31 Mar 2022
Learning Audio Representations with MLPs
Mashrur M. Morshed
Ahmad Omar Ahsan
H. Mahmud
Md. Kamrul Hasan
8
4
0
16 Mar 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
258
7,337
0
11 Nov 2021
InQSS: a speech intelligibility and quality assessment model using a multi-task learning network
Yu-Wen Chen
Yu Tsao
12
12
0
04 Nov 2021
SCaLa: Supervised Contrastive Learning for End-to-End Speech Recognition
Li Fu
Xiaoxiao Li
Runyu Wang
Lu Fan
Zhengchen Zhang
Meng Chen
Youzheng Wu
Xiaodong He
SSL
6
3
0
08 Oct 2021
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
Hassan Akbari
Liangzhe Yuan
Rui Qian
Wei-Hong Chuang
Shih-Fu Chang
Yin Cui
Boqing Gong
ViT
231
573
0
22 Apr 2021
Understanding self-supervised Learning Dynamics without Contrastive Pairs
Yuandong Tian
Xinlei Chen
Surya Ganguli
SSL
132
278
0
12 Feb 2021
ACAV100M: Automatic Curation of Large-Scale Datasets for Audio-Visual Video Representation Learning
Sangho Lee
Jiwan Chung
Youngjae Yu
Gunhee Kim
Thomas Breuel
Gal Chechik
Yale Song
71
45
0
26 Jan 2021
BYOL works even without batch statistics
Pierre Harvey Richemond
Jean-Bastien Grill
Florent Altché
Corentin Tallec
Florian Strub
...
Samuel L. Smith
Soham De
Razvan Pascanu
Bilal Piot
Michal Valko
SSL
242
114
0
20 Oct 2020
CLAR: Contrastive Learning of Auditory Representations
Haider Al-Tahan
Y. Mohsenzadeh
SSL
108
55
0
19 Oct 2020
For self-supervised learning, Rationality implies generalization, provably
Yamini Bansal
Gal Kaplun
Boaz Barak
OOD
SSL
50
22
0
16 Oct 2020
Contrastive Representation Learning: A Framework and Review
Phúc H. Lê Khắc
Graham Healy
A. Smeaton
SSL
AI4TS
146
670
0
10 Oct 2020
Self-supervised Neural Audio-Visual Sound Source Localization via Probabilistic Spatial Modeling
Yoshiki Masuyama
Yoshiaki Bando
Kohei Yatabe
Y. Sasaki
Masaki Onishi
Yasuhiro Oikawa
SSL
42
13
0
28 Jul 2020
Does Visual Self-Supervision Improve Learning of Speech Representations for Emotion Recognition?
Abhinav Shukla
Stavros Petridis
M. Pantic
SSL
27
28
0
04 May 2020
Pre-trained Models for Natural Language Processing: A Survey
Xipeng Qiu
Tianxiang Sun
Yige Xu
Yunfan Shao
Ning Dai
Xuanjing Huang
LM&MA
VLM
229
1,281
0
18 Mar 2020
Improved Baselines with Momentum Contrastive Learning
Xinlei Chen
Haoqi Fan
Ross B. Girshick
Kaiming He
SSL
235
3,029
0
09 Mar 2020
Self-supervised learning for audio-visual speaker diarization
Yifan Ding
Yong-mei Xu
Shi-Xiong Zhang
Yahuan Cong
Liqiang Wang
VLM
34
29
0
13 Feb 2020
Multi-task self-supervised learning for Robust Speech Recognition
Mirco Ravanelli
Jianyuan Zhong
Santiago Pascual
P. Swietojanski
João Monteiro
J. Trmal
Yoshua Bengio
SSL
171
288
0
25 Jan 2020
VoxCeleb2: Deep Speaker Recognition
Joon Son Chung
Arsha Nagrani
Andrew Zisserman
214
1,954
0
14 Jun 2018
Boosting Self-Supervised Learning via Knowledge Transfer
M. Noroozi
Ananth Vinjimoor
Paolo Favaro
Hamed Pirsiavash
SSL
207
282
0
01 May 2018
Efficient Estimation of Word Representations in Vector Space
Tomáš Mikolov
Kai Chen
G. Corrado
J. Dean
3DV
228
29,632
0
16 Jan 2013