2203.01205
Audio Self-supervised Learning: A Survey
2 March 2022
Shuo Liu
Adria Mallol-Ragolta
Emilia Parada-Cabaleiro
Kun Qian
Xingshuo Jing
Alexander Kathan
Bin Hu
Bjoern W. Schuller
SSL
Papers citing
"Audio Self-supervised Learning: A Survey"
50 / 78 papers shown
Title
Supervising 3D Talking Head Avatars with Analysis-by-Audio-Synthesis
Radek Daněček
Carolin Schmitt
Senya Polikovsky
Michael J. Black
22
0
0
18 Apr 2025
Multifaceted Evaluation of Audio-Visual Capability for MLLMs: Effectiveness, Efficiency, Generalizability and Robustness
Yusheng Zhao
Junyu Luo
Xiao Luo
Weizhi Zhang
Zhiping Xiao
Wei Ju
Philip S. Yu
Ming Zhang
AuLLM
32
0
0
03 Apr 2025
Heterogeneous bimodal attention fusion for speech emotion recognition
Jiachen Luo
Huy Phan
Lin Wang
Joshua Reiss
42
0
0
09 Mar 2025
Masked Latent Prediction and Classification for Self-Supervised Audio Representation Learning
Aurian Quélennec
Pierre Chouteau
Geoffroy Peeters
S. Essid
SSL
47
0
0
17 Feb 2025
Evaluation of Deep Audio Representations for Hearables
Fabian Gröger
Pascal Baumann
L. Amruthalingam
Laurent Simon
Ruksana Giurda
Simone Lionetti
72
0
0
10 Feb 2025
A Review on Self-Supervised Learning for Time Series Anomaly Detection: Recent Advances and Open Challenges
Aitor Sánchez-Ferrera
Borja Calvo
Jose A. Lozano
AI4TS
35
0
0
28 Jan 2025
KDC-MAE: Knowledge Distilled Contrastive Mask Auto-Encoder
Maheswar Bora
Saurabh Atreya
Aritra Mukherjee
Abhijit Das
68
0
0
19 Nov 2024
BSS-CFFMA: Cross-Domain Feature Fusion and Multi-Attention Speech Enhancement Network based on Self-Supervised Embedding
Alimjan Mattursun
Liejun Wang
Yinfeng Yu
20
2
0
13 Aug 2024
Computer Audition: From Task-Specific Machine Learning to Foundation Models
Andreas Triantafyllopoulos
Iosif Tsangko
Alexander Gebhard
A. Mesaros
Tuomas Virtanen
Björn Schuller
39
4
0
22 Jul 2024
SVSNet+: Enhancing Speaker Voice Similarity Assessment Models with Representations from Speech Foundation Models
Chun Yin
Tai-Shih Chi
Yu Tsao
Hsin-Min Wang
27
0
0
12 Jun 2024
The MuSe 2024 Multimodal Sentiment Analysis Challenge: Social Perception and Humor Recognition
Shahin Amiriparian
Lukas Christ
Alexander Kathan
Maurice Gerczuk
Niklas Muller
...
Lukas Stappen
Andreas Konig
Erik Cambria
Björn Schuller
Simone Eulitz
27
8
0
11 Jun 2024
A review on discriminative self-supervised learning methods
Nikolaos Giakoumoglou
Tania Stathaki
SSL
34
0
0
08 May 2024
Fine-grained Speech Sentiment Analysis in Chinese Psychological Support Hotlines Based on Large-scale Pre-trained Model
Zhonglong Chen
Changwei Song
Yining Chen
Jianqiang Li
Guanghui Fu
Yongsheng Tong
Qing Zhao
AI4MH
24
0
0
07 May 2024
Self-supervised visual learning in the low-data regime: a comparative evaluation
Sotirios Konstantakos
Despina Ioanna Chalkiadaki
Ioannis Mademlis
Yuki M. Asano
E. Gavves
Georgios Th. Papadopoulos
19
6
0
26 Apr 2024
Spatial Context-based Self-Supervised Learning for Handwritten Text Recognition
Carlos Peñarrubia
Carlos Garrido-Munoz
J. J. Valero-Mas
Jorge Calvo-Zaragoza
19
1
0
17 Apr 2024
Maximizing Data Efficiency for Cross-Lingual TTS Adaptation by Self-Supervised Representation Mixing and Embedding Initialization
Wei-Ping Huang
Sung-Feng Huang
Hung-yi Lee
16
0
0
23 Jan 2024
A novel dual-stream time-frequency contrastive pretext tasks framework for sleep stage classification
Sergio Kazatzidis
S. Mehrkanoon
AI4TS
6
1
0
15 Dec 2023
Self-Supervised Learning for Anomalous Sound Detection
Kevin Wilkinghoff
21
11
0
15 Dec 2023
R-Spin: Efficient Speaker and Noise-invariant Representation Learning with Acoustic Pieces
Heng-Jui Chang
James R. Glass
10
3
0
15 Nov 2023
Rethinking Samples Selection for Contrastive Learning: Mining of Potential Samples
Hengkui Dong
Xianzhong Long
Yun Li
22
2
0
01 Nov 2023
Pre-trained Speech Processing Models Contain Human-Like Biases that Propagate to Speech Emotion Recognition
Isaac Slaughter
Craig Greenberg
Reva Schwartz
Aylin Caliskan
14
4
0
29 Oct 2023
Multi-resolution HuBERT: Multi-resolution Speech Self-Supervised Learning with Masked Unit Prediction
Jiatong Shi
H. Inaguma
Xutai Ma
Ilia Kulikov
Anna Y. Sun
31
24
0
04 Oct 2023
LeBenchmark 2.0: a Standardized, Replicable and Enhanced Framework for Self-supervised Representations of French Speech
Titouan Parcollet
H. Nguyen
Solène Evain
Marcely Zanon Boito
Adrien Pupier
...
François Portet
Solange Rossato
F. Ringeval
D. Schwab
Laurent Besacier
32
14
0
11 Sep 2023
Understanding Self-Supervised Learning of Speech Representation via Invariance and Redundancy Reduction
Yusuf Brima
U. Krumnack
Simone Pika
Gunther Heidemann
SSL
11
0
0
07 Sep 2023
Sparks of Large Audio Models: A Survey and Outlook
S. Latif
Moazzam Shoukat
Fahad Shamshad
Muhammad Usama
Yi Ren
...
Wenwu Wang
Xulong Zhang
Roberto Togneri
Erik Cambria
Björn W. Schuller
LM&MA
AuLLM
24
36
0
24 Aug 2023
Stable and Causal Inference for Discriminative Self-supervised Deep Visual Representations
Yuewei Yang
Hai Helen Li
Yiran Chen
CML
OOD
17
1
0
16 Aug 2023
Active Bird2Vec: Towards End-to-End Bird Sound Monitoring with Transformers
Lukas Rauch
Raphael Schwinger
Moritz Wirth
Bernhard Sick
Sven Tomforde
Christoph Scholz
14
4
0
14 Aug 2023
Noisy Self-Training with Data Augmentations for Offensive and Hate Speech Detection Tasks
João A. Leite
Carolina Scarton
D. F. Silva
14
0
0
31 Jul 2023
Vesper: A Compact and Effective Pretrained Model for Speech Emotion Recognition
Weidong Chen
Xiaofen Xing
Peihao Chen
Xiangmin Xu
VLM
10
34
0
20 Jul 2023
ivrit.ai: A Comprehensive Dataset of Hebrew Speech for AI Research and Development
Yanir Marmor
Kinneret Misgav
Y. Lifshitz
VLM
4
3
0
17 Jul 2023
Self-Supervised Learning for Time Series Analysis: Taxonomy, Progress, and Prospects
Kexin Zhang
Qingsong Wen
Chaoli Zhang
Rongyao Cai
Ming Jin
...
James Y. Zhang
Y. Liang
Guansong Pang
Dongjin Song
Shirui Pan
AI4TS
109
97
0
16 Jun 2023
Self-supervised Audio Teacher-Student Transformer for Both Clip-level and Frame-level Tasks
Xian Li
Nian Shao
Xiaofei Li
ViT
CLIP
8
24
0
07 Jun 2023
N-Shot Benchmarking of Whisper on Diverse Arabic Speech Recognition
Bashar Talafha
Abdul Waheed
Muhammad Abdul-Mageed
11
7
0
05 Jun 2023
MT-SLVR: Multi-Task Self-Supervised Learning for Transformation In(Variant) Representations
Calum Heggan
Timothy M. Hospedales
S. Budgett
Mehrdad Yaghoobi
SSL
10
5
0
29 May 2023
Investigating Pre-trained Audio Encoders in the Low-Resource Condition
Haomiao Yang
Jinming Zhao
Gholamreza Haffari
Ehsan Shareghi
9
6
0
28 May 2023
Martian time-series unraveled: A multi-scale nested approach with factorial variational autoencoders
Ali Siahkoohi
Rudy Morel
Randall Balestriero
Erwan Allys
G. Sainton
Taichi Kawamura
Maarten V. de Hoop
11
2
0
25 May 2023
Masked Modeling Duo for Speech: Specializing General-Purpose Audio Representation to Speech using Denoising Distillation
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
Noboru Harada
K. Kashino
29
3
0
23 May 2023
Recycle-and-Distill: Universal Compression Strategy for Transformer-based Speech SSL Models with Attention Map Reusing and Masking Distillation
Kangwook Jang
Sungnyun Kim
Se-Young Yun
Hoi-Rim Kim
8
5
0
19 May 2023
DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning
Alexander H. Liu
Heng-Jui Chang
Michael Auli
Wei-Ning Hsu
James R. Glass
11
24
0
17 May 2023
ChatGPT-Like Large-Scale Foundation Models for Prognostics and Health Management: A Survey and Roadmaps
Yanfang Li
Huan Wang
Muxia Sun
LM&MA
AI4TS
AI4CE
11
44
0
10 May 2023
The MuSe 2023 Multimodal Sentiment Analysis Challenge: Mimicked Emotions, Cross-Cultural Humour, and Personalisation
Lukas Christ
Shahin Amiriparian
Alice Baird
Alexander Kathan
Niklas Muller
...
Eva-Maria Messner
Andreas Konig
Alan S. Cowen
Erik Cambria
Björn W. Schuller
6
30
0
05 May 2023
A vector quantized masked autoencoder for audiovisual speech emotion recognition
Samir Sadok
Simon Leglaive
Renaud Séguier
SSL
52
6
0
05 May 2023
A Cookbook of Self-Supervised Learning
Randall Balestriero
Mark Ibrahim
Vlad Sobal
Ari S. Morcos
Shashank Shekhar
...
Pierre Fernandez
Amir Bar
Hamed Pirsiavash
Yann LeCun
Micah Goldblum
SyDa
FedML
SSL
31
270
0
24 Apr 2023
A vector quantized masked autoencoder for speech emotion recognition
Samir Sadok
Simon Leglaive
Renaud Séguier
8
20
0
21 Apr 2023
A Complete Survey on Generative AI (AIGC): Is ChatGPT from GPT-4 to GPT-5 All You Need?
Chaoning Zhang
Chenshuang Zhang
Sheng Zheng
Yu Qiao
Chenghao Li
...
Lik-Hang Lee
Yang Yang
Heng Tao Shen
In So Kweon
Choong Seon Hong
72
152
0
21 Mar 2023
Improving Self-Supervised Learning for Audio Representations by Feature Diversity and Decorrelation
Bac Nguyen
Stefan Uhlich
Fabien Cardinaux
SSL
20
3
0
07 Mar 2023
Phone and speaker spatial organization in self-supervised speech representations
Pablo Riera
M. Cerdeiro
L. Pepino
Luciana Ferrer
SSL
8
1
0
24 Feb 2023
Unearthing InSights into Mars: Unsupervised Source Separation with Limited Data
Ali Siahkoohi
Rudy Morel
Maarten V. de Hoop
Erwan Allys
G. Sainton
Taichi Kawamura
8
4
0
27 Jan 2023
DDSupport: Language Learning Support System that Displays Differences and Distances from Model Speech
Kazuki Kawamura
Jun Rekimoto
12
0
0
08 Dec 2022
Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Speech Processing
Yonggan Fu
Yang Zhang
Kaizhi Qian
Zhifan Ye
Zhongzhi Yu
Cheng-I Jeff Lai
Yingyan Lin
6
8
0
02 Nov 2022