Exploring wav2vec 2.0 on speaker verification and language identification

Interspeech (Interspeech), 2020

11 December 2020

Bo Xu

Papers citing "Exploring wav2vec 2.0 on speaker verification and language identification"

50 / 108 papers shown

Transcribe, Translate, or Transliterate: An Investigation of Intermediate Representations in Spoken Language Models

Tolúlọpé Ògúnrèmí

Christopher D. Manning

Dan Jurafsky

Karen Livescu

AuLLM

245

02 Oct 2025

XMUspeech Systems for the ASVspoof 5 Challenge

169

05 Sep 2025

Evaluating the Effectiveness of Transformer Layers in Wav2Vec 2.0, XLS-R, and Whisper for Speaker Identification Tasks

Linus Stuhlmann

Michael Alexander Saxer

29 Aug 2025

Layer-Wise Analysis of Self-Supervised Representations for Age and Gender Classification in Children's Speech

Workshop on Child, Computer and Interaction (CCI), 2025

Sudarsana Reddy Kadiri

14 Aug 2025

SpeechVerifier: Robust Acoustic Fingerprint against Tampering Attacks via Watermarking

291

28 May 2025

Speaker Diarization for Low-Resource Languages Through Wav2vec Fine-Tuning

Abdulhady Abas Abdullah

982

23 Apr 2025

Respiratory Inhaler Sound Event Classification Using Self-Supervised Learning

Davoud Shariat Panah

Alessandro N Franciosi

Cormac McCarthy

Andrew Hines

155

15 Apr 2025

Exploring Modality Disruption in Multimodal Fake News Detection

419

12 Apr 2025

From Faces to Voices: Learning Hierarchical Representations for High-quality Video-to-Speech

Computer Vision and Pattern Recognition (CVPR), 2025

396

21 Mar 2025

A Dual-Stage Time-Context Network for Speech-Based Alzheimer's Disease Detection

Yifan Gao

Long Guo

Hong Liu

283

18 Feb 2025

CrossSpeech++: Cross-lingual Speech Synthesis with Decoupled Language and Speaker Generation

IEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2024

376

31 Dec 2024

Memory-Efficient Training for Deep Speaker Embedding Learning in Speaker Verification

Bei Liu

Yanmin Qian

471

02 Dec 2024

LLM-Ref: Enhancing Reference Handling in Technical Writing with Large Language Models

Kazi Ahmed Asif Fuad

Lizhong Chen

387

01 Nov 2024

Do Discrete Self-Supervised Representations of Speech Capture Tone Distinctions?

Opeyemi Osakuade

Simon King

257

25 Oct 2024

Layer-aware TDNN: Speaker Recognition Using Multi-Layer Features from Pre-Trained Models

Jin Sob Kim

Hyun Joon Park

Wooseok Shin

Juan Yun

Sung Won Han

SLR

500

12 Sep 2024

ELP-Adapters: Parameter Efficient Adapter Tuning for Various Speech Processing Tasks

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2024

280

28 Jul 2024

SLIM: Style-Linguistics Mismatch Model for Generalized Audio Deepfake Detection

460

26 Jul 2024

Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning

Shuai Wang

Zheng-Shou Chen

Kong Aik Lee

Yanmin Qian

Haizhou Li

374

21 Jul 2024

Universal Sound Separation with Self-Supervised Audio Masked Autoencoder

284

16 Jul 2024

A Layer-Anchoring Strategy for Enhancing Cross-Lingual Speech Emotion Recognition

Shreya G. Upadhyay

John H. L. Hansen

Chi-Chun Lee

298

06 Jul 2024

Speech Representation Analysis based on Inter- and Intra-Model Similarities

342

23 Jun 2024

Articulatory Encodec: Coding Speech through Vocal Tract Kinematics

IEEE Journal on Selected Topics in Signal Processing (JSTSP), 2024

Cheol Jun Cho

Peter Wu

Tejas S. Prabhune

Dhruv Agarwal

Gopala K. Anumanchipalli

353

18 Jun 2024

Attentive Merging of Hidden Embeddings from Pre-trained Speech Model for Anti-spoofing Detection

316

12 Jun 2024

Towards Supervised Performance on Speaker Verification with Self-Supervised Learning by Leveraging Large-Scale ASR Models

Victor Miara

Theo Lepage

Reda Dehak

264

04 Jun 2024

A Large-Scale Evaluation of Speech Foundation Models

...

Shinji Watanabe

Hung-yi Lee

309

15 Apr 2024

SKILL: Similarity-aware Knowledge distILLation for Speech Self-Supervised Learning

Luca Zampierin

G. B. Hacene

Bac Nguyen

Mirco Ravanelli

322

26 Feb 2024

Can you Remove the Downstream Model for Speaker Recognition with Self-Supervised Speech Features?

Stephen Shum

Ahmed Hussen Abdelaziz

Shinji Watanabe

B. Theobald

SSL

215

01 Feb 2024

Singer Identity Representation Learning using Self-Supervised Techniques

International Society for Music Information Retrieval Conference (ISMIR), 2024

346

10 Jan 2024

Self-supervised Reflective Learning through Self-distillation and Online Clustering for Speaker Representation Learning

IEEE Transactions on Audio, Speech, and Language Processing (IEEE TASLP), 2024

369

03 Jan 2024

Generative linguistic representation for spoken language identification

Peng Shen

Xuguang Lu

Hisashi Kawai

179

18 Dec 2023

On the Behavior of Audio-Visual Fusion Architectures in Identity Verification Tasks

Daniel Claborne

Eric Slyman

Karl Pazdernik

187

09 Nov 2023

Automatic Pronunciation Assessment -- A Review

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Yassine El Kheir

Ahmed M. Ali

Shammur A. Chowdhury

258

21 Oct 2023

Improving Speech Inversion Through Self-Supervised Embeddings and Enhanced Tract Variables

European Signal Processing Conference (EUSIPCO), 2023

Ahmed Adel Attia

Yashish M. Siriwardena

Carol Espy-Wilson

SSL

251

17 Sep 2023

Let There Be Sound: Reconstructing High Quality Speech from Silent Videos

AAAI Conference on Artificial Intelligence (AAAI), 2023

Ji-Hoon Kim

Jaehun Kim

Joon Son Chung

329

29 Aug 2023

Fairness and Privacy in Voice Biometrics: A Study of Gender Influences Using wav2vec 2.0

Biometrics and Electronic Signatures (BES), 2023

207

27 Aug 2023

Implicit Self-supervised Language Representation for Spoken Language Diarization

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023

Jagabandhu Mishra

S. R. Mahadeva Prasanna

195

21 Aug 2023

Unsupervised Out-of-Distribution Dialect Detection with Mahalanobis Distance

Interspeech (Interspeech), 2023

180

09 Aug 2023

Speaker Recognition Using Isomorphic Graph Attention Network Based Pooling on Self-Supervised Representation

Applied Acoustics (Appl. Acoust.), 2023

222

09 Aug 2023

Comparative Analysis of the wav2vec 2.0 Feature Extractor

Peter Vieting

Ralf Schlüter

Hermann Ney

269

08 Aug 2023

Investigation of Self-supervised Pre-trained Models for Classification of Voice Quality from Speech and Neck Surface Accelerometer Signals

Computer Speech and Language (CSL), 2023

Sudarsana Reddy Kadiri

Farhad Javanmardi

P. Alku

120

06 Aug 2023

Towards spoken dialect identification of Irish

14 Jul 2023

Pretraining Conformer with ASR or ASV for Anti-Spoofing Countermeasure

Yikang Wang

Hiromitsu Nishizaki

Ming Li

221

04 Jul 2023

What Do Self-Supervised Speech Models Know About Words?

Transactions of the Association for Computational Linguistics (TACL), 2023

576

30 Jun 2023

Cross-Lingual Cross-Age Group Adaptation for Low-Resource Elderly Speech Emotion Recognition

240

26 Jun 2023

Toward Leveraging Pre-Trained Self-Supervised Frontends for Automatic Singing Voice Understanding Tasks: Three Case Studies

Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2023

Yuya Yamamoto

221

22 Jun 2023

Unsupervised speech intelligibility assessment with utterance level alignment distance between teacher and learner Wav2Vec-2.0 representations

Nayan Anand

Meenakshi Sirigiraju

Chiranjeevi Yarra

144

15 Jun 2023

What Can an Accent Identifier Learn? Probing Phonetic and Prosodic Information in a Wav2vec2-based Accent Identification Model

Interspeech (Interspeech), 2023

345

10 Jun 2023

Label Aware Speech Representation Learning For Language Identification

Interspeech (Interspeech), 2023

Shikhar Vashishth

Shikhar Bharadwaj

Sriram Ganapathy

175

07 Jun 2023

Investigating model performance in language identification: beyond simple error statistics

Interspeech (Interspeech), 2023

Leibny Paola García Perera

Sanjeev Khudanpur

Andy W. H. Khong

Justin Dauwels

158

30 May 2023

From 'Snippet-lects' to Doculects and Dialects: Leveraging Neural Representations of Speech for Placing Audio Signals in a Language Landscape

Séverine Guillaume

Guillaume Wisniewski

Alexis Michaud

181

29 May 2023