2109.15053
Fine-tuning wav2vec2 for speaker recognition
30 September 2021
Nik Vaessen
David A. van Leeuwen
Github (145★)
Papers citing
"Fine-tuning wav2vec2 for speaker recognition"
50 / 51 papers shown
Dialect Identification Using Resource-Efficient Fine-Tuning Approaches
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2025
Zirui Lin
Haris Gulzar
Monnika Roslianna Busto
Akiko Masaki
Takeharu Eda
K. Nakadai
68
0
0
30 Nov 2025
XLSR-Kanformer: A KAN-Intergrated model for Synthetic Speech Detection
Advanced Video and Signal Based Surveillance (AVSS), 2025
Phuong Tuan Dat
Tran Huy Dat
102
1
0
08 Oct 2025
Pushing the Performance of Synthetic Speech Detection with Kolmogorov-Arnold Networks and Self-Supervised Learning Models
Tuan Dat Phuong
Long-Vu Hoang
Huy-Dat Tran
138
5
0
17 Jun 2025
Speaker Fuzzy Fingerprints: Benchmarking Text-Based Identification in Multiparty Dialogues
IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), 2025
Rui Ribeiro
Luísa Coheur
Joao Paulo Carvalho
266
0
0
21 Apr 2025
Efficient Finetuning for Dimensional Speech Emotion Recognition in the Age of Transformers
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Aneesha Sampath
James Tavernor
E. Provost
307
4
0
17 Feb 2025
Memory-Efficient Training for Deep Speaker Embedding Learning in Speaker Verification
Bei Liu
Yanmin Qian
443
0
0
02 Dec 2024
Deep Insights into Cognitive Decline: A Survey of Leveraging Non-Intrusive Modalities with Deep Learning Techniques
Applied Soft Computing (Appl. Soft Comput.), 2024
David Ortiz-Perez
Manuel Benavent-Lledo
José García Rodríguez
David Tomás
M. Flores Vizcaya-Moreno
231
3
0
24 Oct 2024
Layer-aware TDNN: Speaker Recognition Using Multi-Layer Features from Pre-Trained Models
Jin Sob Kim
Hyun Joon Park
Wooseok Shin
Juan Yun
Sung Won Han
SLR
454
2
0
12 Sep 2024
ELP-Adapters: Parameter Efficient Adapter Tuning for Various Speech Processing Tasks
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2024
Nakamasa Inoue
Shinta Otake
Takumi Hirose
Masanari Ohi
Rei Kawakami
247
5
0
28 Jul 2024
SLIM: Style-Linguistics Mismatch Model for Generalized Audio Deepfake Detection
Yi Zhu
Surya Koppisetti
Trang Tran
Gaurav Bharaj
400
22
0
26 Jul 2024
Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning
Shuai Wang
Zheng-Shou Chen
Kong Aik Lee
Yan-min Qian
Haizhou Li
344
23
0
21 Jul 2024
Self-supervised ASR Models and Features For Dysarthric and Elderly Speech Recognition
Shujie Hu
Xurong Xie
Mengzhe Geng
Zengrui Jin
Jiajun Deng
...
Yi Wang
Mingyu Cui
Tianzi Wang
Helen Meng
Xunying Liu
248
30
0
03 Jul 2024
Target Speech Extraction with Pre-trained Self-supervised Learning Models
Junyi Peng
Marc Delcroix
Tsubasa Ochiai
Oldrich Plchot
Shoko Araki
J. Černocký
220
17
0
17 Feb 2024
Probing Self-supervised Learning Models with Target Speech Extraction
Junyi Peng
Marc Delcroix
Tsubasa Ochiai
Oldrich Plchot
Takanori Ashihara
Shoko Araki
J. Černocký
260
6
0
17 Feb 2024
Self-supervised Reflective Learning through Self-distillation and Online Clustering for Speaker Representation Learning
IEEE Transactions on Audio, Speech, and Language Processing (IEEE TASLP), 2024
Danwei Cai
Zexin Cai
Ze Li
Ming Li
290
2
0
03 Jan 2024
Enhancing Pre-trained ASR System Fine-tuning for Dysarthric Speech Recognition using Adversarial Data Augmentation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Huimeng Wang
Zengrui Jin
Mengzhe Geng
Shujie Hu
Guinan Li
Tianzi Wang
Haoning Xu
Xunying Liu
163
36
0
01 Jan 2024
Advancing Audio Emotion and Intent Recognition with Large Pre-Trained Models and Bayesian Inference
ACM Multimedia (ACM MM), 2023
Dejan Porjazovski
Yaroslav Getman
Tamás Grósz
M. Kurimo
182
3
0
16 Oct 2023
Wav2vec-based Detection and Severity Level Classification of Dysarthria from Speech
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Farhad Javanmardi
Saska Tirronen
Manila Kodali
Sudarsana Reddy Kadiri
P. Alku
199
45
0
25 Sep 2023
LeBenchmark 2.0: a Standardized, Replicable and Enhanced Framework for Self-supervised Representations of French Speech
Computer Speech and Language (CSL), 2023
Titouan Parcollet
H. Nguyen
Solène Evain
Marcely Zanon Boito
Adrien Pupier
...
François Portet
Solange Rossato
Fabien Ringeval
D. Schwab
Laurent Besacier
260
26
0
11 Sep 2023
Fairness and Privacy in Voice Biometrics: A Study of Gender Influences Using wav2vec 2.0
Biometrics and Electronic Signatures (BES), 2023
Oubaïda Chouchane
Michele Panariello
Chiara Galdi
Massimiliano Todisco
Nicholas W. D. Evans
156
4
0
27 Aug 2023
An Effective Transformer-based Contextual Model and Temporal Gate Pooling for Speaker Identification
Harunori Kawano
Sota Shimizu
146
1
0
22 Aug 2023
Speaker Recognition Using Isomorphic Graph Attention Network Based Pooling on Self-Supervised Representation
Applied Acoustics (Appl. Acoust.), 2023
Zirui Ge
Xinzhou Xu
Haiyan Guo
Tingting Wang
Zhen Yang
SSL
186
5
0
09 Aug 2023
Investigation of Self-supervised Pre-trained Models for Classification of Voice Quality from Speech and Neck Surface Accelerometer Signals
Computer Speech and Language (CSL), 2023
Sudarsana Reddy Kadiri
Farhad Javanmardi
P. Alku
90
9
0
06 Aug 2023
Toward Leveraging Pre-Trained Self-Supervised Frontends for Automatic Singing Voice Understanding Tasks: Three Case Studies
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2023
Yuya Yamamoto
153
3
0
22 Jun 2023
Unsupervised speech intelligibility assessment with utterance level alignment distance between teacher and learner Wav2Vec-2.0 representations
Nayan Anand
Meenakshi Sirigiraju
Chiranjeevi Yarra
117
1
0
15 Jun 2023
Leveraging Semantic Information for Efficient Self-Supervised Emotion Recognition with Audio-Textual Distilled Models
Interspeech (Interspeech), 2023
Danilo de Oliveira
N. Prabhu
Timo Gerkmann
120
10
0
30 May 2023
Spoofing Attacker Also Benefits from Self-Supervised Pretrained Model
Interspeech (Interspeech), 2023
Aoi Ito
Shota Horiguchi
SSL
134
4
0
24 May 2023
Lightweight Toxicity Detection in Spoken Language: A Transformer-based Approach for Edge Devices
Ahlam Husni Abu Nada
S. Latif
Junaid Qadir
132
6
0
22 Apr 2023
The Graph feature fusion technique for speaker recognition based on wav2vec2.0 framework
Zirui Ge
Haiyan Guo
Zhen Yang
163
1
0
19 Mar 2023
Audio-Visual Deception Detection: DOLOS Dataset and Parameter-Efficient Crossmodal Learning
IEEE International Conference on Computer Vision (ICCV), 2023
Xiaobao Guo
Nithish Muthuchamy Selvaraj
Zitong Yu
A. Kong
Bingquan Shen
Alex C. Kot
98
23
0
09 Mar 2023
Exploring Self-supervised Pre-trained ASR Models For Dysarthric and Elderly Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Shujie Hu
Xurong Xie
Zengrui Jin
Mengzhe Geng
Yi Wang
Mingyu Cui
Jiajun Deng
Xunying Liu
Helen M. Meng
142
46
0
28 Feb 2023
Towards multi-task learning of speech and speaker recognition
Interspeech (Interspeech), 2023
Nik Vaessen
David A. van Leeuwen
CVBM
160
0
0
24 Feb 2023
Speaker and Language Change Detection using Wav2vec2 and Whisper
Tijn Berns
Nik Vaessen
David A. van Leeuwen
161
6
0
18 Feb 2023
Residual Information in Deep Speaker Embedding Architectures
Adriana Stan
151
6
0
06 Feb 2023
Parameter Efficient Transfer Learning for Various Speech Processing Tasks
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Shinta Otake
Rei Kawakami
Nakamasa Inoue
172
21
0
06 Dec 2022
Multi-Label Training for Text-Independent Speaker Identification
Yuqi Xue
142
0
0
14 Nov 2022
Integrated Parameter-Efficient Tuning for General-Purpose Audio Models
Ju-ho Kim
Ju-Sung Heo
Hyun-Seo Shin
Chanmann Lim
Ha-Jin Yu
169
5
0
04 Nov 2022
Dynamic Kernels and Channel Attention for Low Resource Speaker Verification
A. Ollerenshaw
Md. Asif Jalal
Thomas Hain
118
0
0
03 Nov 2022
Application of Knowledge Distillation to Multi-task Speech Representation Learning
Interspeech (Interspeech), 2022
Mine Kerpicci
V. Nguyen
Shuhua Zhang
Erik M. Visser
179
0
0
29 Oct 2022
Universal speaker recognition encoders for different speech segments duration
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Sergey Novoselov
V. Volokhov
G. Lavrentyeva
88
2
0
28 Oct 2022
Fast Yet Effective Speech Emotion Recognition with Self-distillation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Zhao Ren
Thanh Tam Nguyen
Yi Chang
Björn W. Schuller
109
16
0
26 Oct 2022
Spectral Clustering-aware Learning of Embeddings for Speaker Diarisation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Evonne Lee
Guangzhi Sun
Chuxu Zhang
P. Woodland
167
1
0
24 Oct 2022
Large-scale learning of generalised representations for speaker recognition
Jee-weon Jung
Hee-Soo Heo
Bong-Jin Lee
Jaesong Lee
Hye-jin Shim
Youngki Kwon
Joon Son Chung
Shinji Watanabe
CVBM
195
6
0
20 Oct 2022
Extracting speaker and emotion information from self-supervised speech models via channel-wise correlations
Spoken Language Technology Workshop (SLT), 2022
Themos Stafylakis
Ladislav Mošner
Sofoklis Kakouros
Oldrich Plchot
L. Burget
J. Černocký
SSL
116
12
0
15 Oct 2022
Fine-tuning Wav2vec for Vocal-burst Emotion Recognition
Dang-Khanh Nguyen
Sudarshan Pant
Ngoc-Huynh Ho
Gueesang Lee
Soo-Huyng Kim
Hyung-Jeong Yang
101
3
0
01 Oct 2022
Speech Emotion: Investigating Model Representations, Multi-Task Learning and Knowledge Distillation
Interspeech (Interspeech), 2022
Vikramjit Mitra
H. Chien
Vasudha Kowtha
Joseph Y. Cheng
Erdrin Azemi
136
8
0
02 Jul 2022
Towards Understanding and Mitigating Audio Adversarial Examples for Speaker Recognition
IEEE Transactions on Dependable and Secure Computing (TDSC), 2022
Guangke Chen
Zhe Zhao
Fu Song
Sen Chen
Lingling Fan
Feng Wang
Jiashui Wang
AAML
234
47
0
07 Jun 2022
Robust Speaker Recognition with Transformers Using wav2vec 2.0
Sergey Novoselov
G. Lavrentyeva
Anastasia Avdeeva
V. Volokhov
Aleksei Gusev
ViT
97
21
0
28 Mar 2022
Training speaker recognition systems with limited data
Interspeech (Interspeech), 2022
Nik Vaessen
David A. van Leeuwen
196
6
0
28 Mar 2022
Automatic speaker verification spoofing and deepfake detection using wav2vec 2.0 and data augmentation
The Speaker and Language Recognition Workshop (Odyssey), 2022
Hemlata Tak
Massimiliano Todisco
Xin Wang
Jee-weon Jung
Junichi Yamagishi
Nicholas W. D. Evans
338
247
0
24 Feb 2022