Exploring wav2vec 2.0 on speaker verification and language identification

Interspeech (Interspeech), 2020
11 December 2020
Zhiyun Fan
Meng Li
Shiyu Zhou
Bo Xu

Papers citing "Exploring wav2vec 2.0 on speaker verification and language identification"

50 / 108 papers shown
Transcribe, Translate, or Transliterate: An Investigation of Intermediate Representations in Spoken Language Models
Tolúlọpé Ògúnrèmí
Christopher D. Manning
Dan Jurafsky
Karen Livescu
AuLLM
245
1
0
02 Oct 2025
XMUspeech Systems for the ASVspoof 5 Challenge
W. Li
Xingjia Xie
Yishuang Li
Wenhao Guan
Kaidi Wang
Pengyu Ren
Lin Li
Q. Hong
169
0
0
05 Sep 2025
Evaluating the Effectiveness of Transformer Layers in Wav2Vec 2.0, XLS-R, and Whisper for Speaker Identification Tasks
Linus Stuhlmann
Michael Alexander Saxer
85
0
0
29 Aug 2025
Layer-Wise Analysis of Self-Supervised Representations for Age and Gender Classification in Children's Speech
Workshop on Child, Computer and Interaction (CCI), 2025
Abhijit Sinha
Harishankar Kumar
Mohit Joshi
H. Kathania
Shrikanth Narayanan
Sudarsana Reddy Kadiri
77
0
0
14 Aug 2025
SpeechVerifier: Robust Acoustic Fingerprint against Tampering Attacks via Watermarking
Lingfeng Yao
Chenpei Huang
Shengyao Wang
Junpei Xue
Hanqing Guo
Jiang Liu
Hang Zhang
Miao Pan
291
2
0
28 May 2025
Speaker Diarization for Low-Resource Languages Through Wav2vec Fine-Tuning
Abdulhady Abas Abdullah
S. H. Karim
Sara Azad Ahmed
Kanar R. Tariq
Tarik Ahmed Rashid
982
3
0
23 Apr 2025
Respiratory Inhaler Sound Event Classification Using Self-Supervised Learning
Davoud Shariat Panah
Alessandro N Franciosi
Cormac McCarthy
Andrew Hines
155
0
0
15 Apr 2025
Exploring Modality Disruption in Multimodal Fake News Detection
Moyang Liu
Kaiying Yan
Yukun Liu
Ruibo Fu
Zhengqi Wen
Chenxing Li
419
2
0
12 Apr 2025
From Faces to Voices: Learning Hierarchical Representations for High-quality Video-to-Speech
Computer Vision and Pattern Recognition (CVPR), 2025
Ji-Hoon Kim
Jeongsoo Choi
Jaehun Kim
Chaeyoung Jung
Joon Son Chung
CVBM
396
6
0
21 Mar 2025
A Dual-Stage Time-Context Network for Speech-Based Alzheimer's Disease Detection
Yifan Gao
Long Guo
Hong Liu
283
0
0
18 Feb 2025
CrossSpeech++: Cross-lingual Speech Synthesis with Decoupled Language and Speaker Generation
IEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2024
Ji-Hoon Kim
Hong-Sun Yang
Yoon-Cheol Ju
Il-Hwan Kim
Byeong-Yeol Kim
Joon Son Chung
BDL
376
1
0
31 Dec 2024
Memory-Efficient Training for Deep Speaker Embedding Learning in Speaker Verification
Bei Liu
Yanmin Qian
471
0
0
02 Dec 2024
LLM-Ref: Enhancing Reference Handling in Technical Writing with Large Language Models
Kazi Ahmed Asif Fuad
Lizhong Chen
387
2
0
01 Nov 2024
Do Discrete Self-Supervised Representations of Speech Capture Tone Distinctions?
Opeyemi Osakuade
Simon King
257
2
0
25 Oct 2024
Layer-aware TDNN: Speaker Recognition Using Multi-Layer Features from Pre-Trained Models
Jin Sob Kim
Hyun Joon Park
Wooseok Shin
Juan Yun
Sung Won Han
SLR
500
2
0
12 Sep 2024
ELP-Adapters: Parameter Efficient Adapter Tuning for Various Speech Processing Tasks
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2024
Nakamasa Inoue
Shinta Otake
Takumi Hirose
Masanari Ohi
Rei Kawakami
280
9
0
28 Jul 2024
SLIM: Style-Linguistics Mismatch Model for Generalized Audio Deepfake Detection
Yi Zhu
Surya Koppisetti
Trang Tran
Gaurav Bharaj
460
28
0
26 Jul 2024
Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning
Shuai Wang
Zheng-Shou Chen
Kong Aik Lee
Yan-min Qian
Haizhou Li
374
28
0
21 Jul 2024
Universal Sound Separation with Self-Supervised Audio Masked Autoencoder
Junqi Zhao
Xubo Liu
Jinzheng Zhao
Yiitan Yuan
Qiuqiang Kong
Mark D. Plumbley
Wenwu Wang
284
6
0
16 Jul 2024
A Layer-Anchoring Strategy for Enhancing Cross-Lingual Speech Emotion Recognition
Shreya G. Upadhyay
John H. L. Hansen
Chi-Chun Lee
298
7
0
06 Jul 2024
Speech Representation Analysis based on Inter- and Intra-Model Similarities
Yassine El Kheir
Ahmed M. Ali
Shammur A. Chowdhury
SSL
342
6
0
23 Jun 2024
Articulatory Encodec: Coding Speech through Vocal Tract Kinematics
IEEE Journal on Selected Topics in Signal Processing (JSTSP), 2024
Cheol Jun Cho
Peter Wu
Tejas S. Prabhune
Dhruv Agarwal
Gopala K. Anumanchipalli
353
9
0
18 Jun 2024
Attentive Merging of Hidden Embeddings from Pre-trained Speech Model for Anti-spoofing Detection
Zihan Pan
Tianchi Liu
Hardik B. Sailor
Qiongqiong Wang
316
28
0
12 Jun 2024
Towards Supervised Performance on Speaker Verification with Self-Supervised Learning by Leveraging Large-Scale ASR Models
Victor Miara
Theo Lepage
Reda Dehak
264
7
0
04 Jun 2024
A Large-Scale Evaluation of Speech Foundation Models
Shu-Wen Yang
Heng-Jui Chang
Zili Huang
Andy T. Liu
Cheng-I Jeff Lai
...
Kushal Lakhotia
Shang-Wen Li
Abdelrahman Mohamed
Shinji Watanabe
Hung-yi Lee
309
60
0
15 Apr 2024
SKILL: Similarity-aware Knowledge distILLation for Speech Self-Supervised Learning
Luca Zampierin
G. B. Hacene
Bac Nguyen
Mirco Ravanelli
322
4
0
26 Feb 2024
Can you Remove the Downstream Model for Speaker Recognition with Self-Supervised Speech Features?
Zakaria Aldeneh
Takuya Higuchi
Jee-weon Jung
Skyler Seto
Tatiana Likhomanenko
Stephen Shum
Ahmed Hussen Abdelaziz
Shinji Watanabe
B. Theobald
SSL
215
5
0
01 Feb 2024
Singer Identity Representation Learning using Self-Supervised Techniques
International Society for Music Information Retrieval Conference (ISMIR), 2024
Bernardo Torres
Stefan Lattner
Gaël Richard
SSL
346
14
0
10 Jan 2024
Self-supervised Reflective Learning through Self-distillation and Online Clustering for Speaker Representation Learning
IEEE Transactions on Audio, Speech, and Language Processing (IEEE TASLP), 2024
Danwei Cai
Zexin Cai
Ze Li
Ming Li
369
4
0
03 Jan 2024
Generative linguistic representation for spoken language identification
Peng Shen
Xuguang Lu
Hisashi Kawai
179
1
0
18 Dec 2023
On the Behavior of Audio-Visual Fusion Architectures in Identity Verification Tasks
Daniel Claborne
Eric Slyman
Karl Pazdernik
187
0
0
09 Nov 2023
Automatic Pronunciation Assessment -- A Review
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Yassine El Kheir
Ahmed M. Ali
Shammur A. Chowdhury
258
19
0
21 Oct 2023
Improving Speech Inversion Through Self-Supervised Embeddings and Enhanced Tract Variables
European Signal Processing Conference (EUSIPCO), 2023
Ahmed Adel Attia
Yashish M. Siriwardena
Carol Espy-Wilson
SSL
251
15
0
17 Sep 2023
Let There Be Sound: Reconstructing High Quality Speech from Silent Videos
AAAI Conference on Artificial Intelligence (AAAI), 2023
Ji-Hoon Kim
Jaehun Kim
Joon Son Chung
329
11
0
29 Aug 2023
Fairness and Privacy in Voice Biometrics: A Study of Gender Influences Using wav2vec 2.0
Biometrics and Electronic Signatures (BES), 2023
Oubaïda Chouchane
Michele Panariello
Chiara Galdi
Massimiliano Todisco
Nicholas W. D. Evans
207
5
0
27 Aug 2023
Implicit Self-supervised Language Representation for Spoken Language Diarization
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Jagabandhu Mishra
S. R. Mahadeva Prasanna
195
1
0
21 Aug 2023
Unsupervised Out-of-Distribution Dialect Detection with Mahalanobis Distance
Interspeech (Interspeech), 2023
Sourya Dipta Das
Yash Vadi
Abhishek Unnam
Kuldeep Yadav
180
3
0
09 Aug 2023
Speaker Recognition Using Isomorphic Graph Attention Network Based Pooling on Self-Supervised Representation
Applied Acoustics (Appl. Acoust.), 2023
Zirui Ge
Xinzhou Xu
Haiyan Guo
Tingting Wang
Zhen Yang
SSL
222
5
0
09 Aug 2023
Comparative Analysis of the wav2vec 2.0 Feature Extractor
Peter Vieting
Ralf Schlüter
Hermann Ney
269
5
0
08 Aug 2023
Investigation of Self-supervised Pre-trained Models for Classification of Voice Quality from Speech and Neck Surface Accelerometer Signals
Computer Speech and Language (CSL), 2023
Sudarsana Reddy Kadiri
Farhad Javanmardi
P. Alku
120
11
0
06 Aug 2023
Towards spoken dialect identification of Irish
Liam Lonergan
Mengjie Qian
Neasa Ní Chiaráin
Christer Gobl
A. N. Chasaide
98
6
0
14 Jul 2023
Pretraining Conformer with ASR or ASV for Anti-Spoofing Countermeasure
Yikang Wang
Hiromitsu Nishizaki
Ming Li
221
1
0
04 Jul 2023
What Do Self-Supervised Speech Models Know About Words?
Transactions of the Association for Computational Linguistics (TACL), 2023
Ankita Pasad
C. Chien
Shane Settle
Karen Livescu
SSL
576
62
0
30 Jun 2023
Cross-Lingual Cross-Age Group Adaptation for Low-Resource Elderly Speech Emotion Recognition
Samuel Cahyawijaya
Holy Lovenia
Willy Chung
Rita Frieske
Zihan Liu
Pascale Fung
240
2
0
26 Jun 2023
Toward Leveraging Pre-Trained Self-Supervised Frontends for Automatic Singing Voice Understanding Tasks: Three Case Studies
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2023
Yuya Yamamoto
221
3
0
22 Jun 2023
Unsupervised speech intelligibility assessment with utterance level alignment distance between teacher and learner Wav2Vec-2.0 representations
Nayan Anand
Meenakshi Sirigiraju
Chiranjeevi Yarra
144
1
0
15 Jun 2023
What Can an Accent Identifier Learn? Probing Phonetic and Prosodic Information in a Wav2vec2-based Accent Identification Model
Interspeech (Interspeech), 2023
Mu Yang
R. Shekar
Okim Kang
John H. L. Hansen
345
28
0
10 Jun 2023
Label Aware Speech Representation Learning For Language Identification
Interspeech (Interspeech), 2023
Shikhar Vashishth
Shikhar Bharadwaj
Sriram Ganapathy
Ankur Bapna
Min Ma
Wei Han
Vera Axelrod
Partha P. Talukdar
SSL
175
4
0
07 Jun 2023
Investigating model performance in language identification: beyond simple error statistics
Interspeech (Interspeech), 2023
S. Styles
Victoria Y. H. Chua
Fei Ting Woon
Hexin Liu
Leibny Paola García Perera
Sanjeev Khudanpur
Andy W. H. Khong
Justin Dauwels
158
4
0
30 May 2023
From 'Snippet-lects' to Doculects and Dialects: Leveraging Neural Representations of Speech for Placing Audio Signals in a Language Landscape
Severine Guillaume
Guillaume Wisniewski
Alexis Michaud
181
4
0
29 May 2023