ResearchTrend.AI
Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning

21 July 2024
Shuai Wang
Zheng-Shou Chen
Kong Aik Lee
Yan-min Qian
Haizhou Li

Papers citing "Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning"

26 / 26 papers shown
Nes2Net: A Lightweight Nested Architecture for Foundation Model Driven Speech Anti-spoofing
Tianchi Liu
Duc-Tuan Truong
Rohan Kumar Das
K. Lee
Haizhou Li
26
0
0
08 Apr 2025
Causal Self-supervised Pretrained Frontend with Predictive Code for Speech Separation
Wupeng Wang
Zexu Pan
X. Li
Shuai Wang
Haizhou Li
AI4TS
29
0
0
03 Apr 2025
USED: Universal Speaker Extraction and Diarization
Junyi Ao
Mehmet Sinan Yildirim
Ruijie Tao
Mengyao Ge
Shuai Wang
Yan-min Qian
Haizhou Li
33
5
0
17 Jan 2025
M-Vec: Matryoshka Speaker Embeddings with Flexible Dimensions
Shuai Wang
Pengcheng Zhu
Haizhou Li
16
0
0
24 Sep 2024
An Attribute Interpolation Method in Speech Synthesis by Model Merging
Masato Murata
Koichi Miyazaki
Tomoki Koriyama
MoMe
27
2
0
30 Jun 2024
CryCeleb: A Speaker Verification Dataset Based on Infant Cry Sounds
David Budaghyan
Charles C. Onu
Arsenii Gorin
Cem Subakan
Doina Precup
32
2
0
01 May 2023
X-SepFormer: End-to-end Speaker Extraction Network with Explicit Optimization on Speaker Confusion
Kai Liu
Z.C. Du
Xucheng Wan
Huan Zhou
34
18
0
09 Mar 2023
Distilling Multi-Level X-vector Knowledge for Small-footprint Speaker Verification
Xuechen Liu
Md. Sahidullah
Tomi Kinnunen
24
4
0
02 Mar 2023
CAM++: A Fast and Efficient Network for Speaker Verification Using Context-Aware Masking
Haibo Wang
Siqi Zheng
Yafeng Chen
Luyao Cheng
Qian Chen
33
68
0
01 Mar 2023
Target-Speaker Voice Activity Detection via Sequence-to-Sequence Prediction
Ming Cheng
Weiqing Wang
Yucong Zhang
Xiaoyi Qin
Ming Li
VLM
31
29
0
28 Oct 2022
Parameter-efficient transfer learning of pre-trained Transformer models for speaker verification using adapters
Junyi Peng
Themos Stafylakis
Rongzhi Gu
Oldřich Plchot
Ladislav Mošner
Lukáš Burget
Jan "Honza" Černocký
29
22
0
28 Oct 2022
A comprehensive study on self-supervised distillation for speaker representation learning
Zhengyang Chen
Yao Qian
Bing Han
Y. Qian
Michael Zeng
SSL
28
13
0
28 Oct 2022
Speaker recognition with two-step multi-modal deep cleansing
Ruijie Tao
Kong Aik Lee
Zhan Shi
Haizhou Li
NoLa
30
13
0
28 Oct 2022
Self-Supervised Training of Speaker Encoder with Multi-Modal Diverse Positive Pairs
Ruijie Tao
Kong Aik Lee
Rohan Kumar Das
Ville Hautamaki
Haizhou Li
SSL
22
8
0
27 Oct 2022
Fine-tuning wav2vec2 for speaker recognition
Nik Vaessen
David A. van Leeuwen
32
107
0
30 Sep 2021
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
283
5,723
0
29 Apr 2021
A Review of Speaker Diarization: Recent Advances with Deep Learning
Tae Jin Park
Naoyuki Kanda
Dimitrios Dimitriadis
Kyu Jeong Han
Shinji Watanabe
Shrikanth Narayanan
VLM
260
323
0
24 Jan 2021
Exploring wav2vec 2.0 on speaker verification and language identification
Zhiyun Fan
Meng Li
Shiyu Zhou
Bo Xu
101
169
0
11 Dec 2020
Augmentation adversarial training for self-supervised speaker recognition
Jaesung Huh
Hee-Soo Heo
Jingu Kang
Shinji Watanabe
Joon Son Chung
SSL
42
74
0
23 Jul 2020
The FFSVC 2020 Evaluation Plan
Xiaoyi Qin
Ming Li
Hui Bu
Rohan Kumar Das
Wei Rao
Shrikanth Narayanan
Haizhou Li
25
21
0
02 Feb 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
220
3,054
0
23 Jan 2020
Probing the Information Encoded in X-vectors
Desh Raj
David Snyder
Daniel Povey
Sanjeev Khudanpur
32
84
0
13 Sep 2019
End-to-End Neural Speaker Diarization with Permutation-Free Objectives
Yusuke Fujita
Naoyuki Kanda
Shota Horiguchi
Kenji Nagamatsu
Shinji Watanabe
145
242
0
12 Sep 2019
VoxCeleb2: Deep Speaker Recognition
Joon Son Chung
Arsha Nagrani
Andrew Zisserman
206
1,954
0
14 Jun 2018
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis
Ye Jia
Yu Zhang
Ron J. Weiss
Quan Wang
Jonathan Shen
...
Z. Chen
Patrick Nguyen
Ruoming Pang
Ignacio López Moreno
Yonghui Wu
196
817
0
12 Jun 2018
Voices Obscured in Complex Environmental Settings (VOICES) corpus
Colleen Richey
Maria Artigas
Zeb Armstrong
C. Bartels
H. Franco
...
Julien van Hout
Paul Gamble
Jeff Hetherly
Cory Stephenson
Karl S. Ni
26
124
0
13 Apr 2018