Target-Speaker Voice Activity Detection: a Novel Approach for Multi-Speaker Diarization in a Dinner Party Scenario

14 May 2020
Ivan Medennikov
M. Korenevsky
Tatiana Prisyach
Yuri Y. Khokhlov
Mariya Korenevskaya
Ivan Sorokin
Tatiana Timofeeva
Anton Mitrofanov
A. Andrusenko
Ivan Podluzhny
A. Laptev
A. Romanenko

Papers citing "Target-Speaker Voice Activity Detection: a Novel Approach for Multi-Speaker Diarization in a Dinner Party Scenario"

50 / 124 papers shown
Title
Spatio-spectral diarization of meetings by combining TDOA-based segmentation and speaker embedding-based clustering
Tobias Cord-Landwehr
Tobias Gburrek
Marc Deegen
Reinhold Haeb-Umbach
7
0
0
19 Jun 2025
Exploring Speaker Diarization with Mixture of Experts
Gaobin Yang
Maokui He
Shutong Niu
Ruoyu Wang
Hang Chen
Jun Du
MoE
22
0
0
17 Jun 2025
SC-SOT: Conditioning the Decoder on Diarized Speaker Information for End-to-End Overlapped Speech Recognition
Yuta Hirano
Sakriani Sakti
14
0
0
15 Jun 2025
Mitigating Non-Target Speaker Bias in Guided Speaker Embedding
Shota Horiguchi
Takanori Ashihara
Marc Delcroix
Atsushi Ando
Naohiro Tawara
15
0
0
14 Jun 2025
Towards Robust Overlapping Speech Detection: A Speaker-Aware Progressive Approach Using WavLM
Zhaokai Sun
Li Zhang
Qing Wang
Pan Zhou
Lei Xie
VLM
27
0
0
29 May 2025
Summary of the NOTSOFAR-1 Challenge: Highlights and Learnings
Igor Abramovski
Alon Vinnikov
Shalev Shaer
Naoyuki Kanda
Xiaofei Wang
Amir Ivry
Eyal Krupka
140
1
0
28 Jan 2025
USED: Universal Speaker Extraction and Diarization
Junyi Ao
Mehmet Sinan Yildirim
Ruijie Tao
Mengyao Ge
Shuai Wang
Yan-min Qian
Haizhou Li
101
6
0
17 Jan 2025
Microphone Array Signal Processing and Deep Learning for Speech Enhancement
Reinhold Haeb-Umbach
Tomohiro Nakatani
Marc Delcroix
Christoph Boeddeker
Tsubasa Ochiai
107
1
0
13 Jan 2025
Noise-Robust Target-Speaker Voice Activity Detection Through Self-Supervised Pretraining
H. S. Bovbjerg
Jan Østergaard
Jesper Jensen
Zheng-Hua Tan
111
0
0
06 Jan 2025
Guided Speaker Embedding
Shota Horiguchi
Takafumi Moriya
Atsushi Ando
Takanori Ashihara
Hiroshi Sato
Naohiro Tawara
Marc Delcroix
124
1
0
03 Jan 2025
Joint Training of Speaker Embedding Extractor, Speech and Overlap Detection for Diarization
Petr Pálka
Federico Landini
Dominik Klement
Mireia Díez
Anna Silnova
Marc Delcroix
L. Burget
VLM
68
0
0
04 Nov 2024
Quality-Aware End-to-End Audio-Visual Neural Speaker Diarization
Mao-Kui He
Jun Du
Shu-Tong Niu
Qing-Feng Liu
Chin-Hui Lee
62
1
0
15 Oct 2024
Investigation of Speaker Representation for Target-Speaker Speech Processing
Takanori Ashihara
Takafumi Moriya
Shota Horiguchi
Junyi Peng
Tsubasa Ochiai
Marc Delcroix
Kohei Matsuura
Hiroshi Sato
55
1
0
15 Oct 2024
Target word activity detector: An approach to obtain ASR word boundaries without lexicon
S. Sivasankaran
Eric Sun
Jinyu Li
Yan-ping Huang
Jing Pan
48
0
0
20 Sep 2024
Unified Audio Event Detection
Yidi Jiang
Ruijie Tao
Wen Huang
Qian Chen
Wen Wang
86
1
0
13 Sep 2024
Sortformer: Seamless Integration of Speaker Diarization and ASR by Bridging Timestamps and Tokens
Taejin Park
Ivan Medennikov
Kunal Dhawan
Weiqing Wang
He Huang
Nithin Rao Koluguri
Krishna Puvvada
Jagadeesh Balam
Boris Ginsburg
93
5
0
10 Sep 2024
Flow-TSVAD: Target-Speaker Voice Activity Detection via Latent Flow Matching
Zhengyang Chen
Bing Han
Shuai Wang
Yidi Jiang
Yanmin Qian
91
0
0
07 Sep 2024
Focus Agent: LLM-Powered Virtual Focus Group
Taiyu Zhang
Xuesong Zhang
Robbe Cools
Adalberto L. Simeone
LLMAG
65
2
0
03 Sep 2024
LibriheavyMix: A 20,000-Hour Dataset for Single-Channel Reverberant Multi-Talker Speech Separation, ASR and Speaker Diarization
Zengrui Jin
Yifan Yang
Mohan Shi
Wei Kang
Xiaoyu Yang
...
Lingwei Meng
Long Lin
Yong Xu
Shi-Xiong Zhang
Daniel Povey
76
3
0
01 Sep 2024
The VoxCeleb Speaker Recognition Challenge: A Retrospective
Jaesung Huh
Joon Son Chung
Arsha Nagrani
A. Brown
Jee-weon Jung
Daniel Garcia-Romero
Andrew Zisserman
80
5
0
27 Aug 2024
Generating Data with Text-to-Speech and Large-Language Models for Conversational Speech Recognition
Samuele Cornell
Jordan Darefsky
Zhiyao Duan
Shinji Watanabe
SyDa
91
5
0
17 Aug 2024
The CHiME-8 DASR Challenge for Generalizable and Array Agnostic Distant Automatic Speech Recognition and Diarization
Samuele Cornell
Taejin Park
Steve Huang
Christoph Boeddeker
Xuankai Chang
Matthew Maciejewski
Sanjeev Khudanpur
Paola García
Shinji Watanabe
84
13
0
23 Jul 2024
Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning
Shuai Wang
Zheng-Shou Chen
Kong Aik Lee
Yan-min Qian
Haizhou Li
112
6
0
21 Jul 2024
TokenVerse: Unifying Speech and NLP Tasks via Transducer-based ASR
Shashi Kumar
S. Madikeri
Juan Zuluaga-Gomez
Iuliia Nigmatulina
Esaú Villatoro-Tello
Sergio Burdisso
P. Motlíček
Karthik Pandia
A. Ganapathiraju
92
0
0
05 Jul 2024
Leveraging Speaker Embeddings in End-to-End Neural Diarization for Two-Speaker Scenarios
Juan Ignacio Alvarez-Trejos
Beltrán Labrador
Alicia Lozano-Diez
93
2
0
01 Jul 2024
Song Data Cleansing for End-to-End Neural Singer Diarization Using Neural Analysis and Synthesis Framework
Hokuto Munakata
Ryo Terashima
Yusuke Fujita
76
0
0
24 Jun 2024
Comparative Analysis of Personalized Voice Activity Detection Systems: Assessing Real-World Effectiveness
Satyam Kumar
Sai Srujana Buddi
U. Sarawgi
Vineet Garg
Shivesh Ranjan
Ognjen Rudovic
Ahmed Hussen Abdelaziz
Saurabh N. Adya
86
2
0
12 Jun 2024
The RoyalFlush Automatic Speech Diarization and Recognition System for In-Car Multi-Channel Automatic Speech Recognition Challenge
Jingguang Tian
Shuaishuai Ye
Shunfei Chen
Yang Xiang
Zhaohui Yin
Xinhui Hu
Xinkang Xu
57
0
0
09 May 2024
Improving Speaker Assignment in Speaker-Attributed ASR for Real Meeting Applications
Can Cui
Imran Ahmad Sheikh
Mostafa Sadeghi
Emmanuel Vincent
91
3
0
11 Mar 2024
Online speaker diarization of meetings guided by speech separation
Elio Gruttadauria
Mathieu Fontaine
S. Essid
30
5
0
30 Jan 2024
Continuous Target Speech Extraction: Enhancing Personalized Diarization and Extraction on Complex Recordings
He Zhao
Hangting Chen
Jianwei Yu
Yuehai Wang
78
1
0
29 Jan 2024
EEND-M2F: Masked-attention mask transformers for speaker diarization
Marc Härkönen
Samuel J. Broughton
Lahiru Samarakoon
106
9
0
23 Jan 2024
DiarizationLM: Speaker Diarization Post-Processing with Large Language Models
Quan Wang
Yiling Huang
Guanlong Zhao
Evan Clark
Wei Xia
Hank Liao
AuLLM
162
8
0
07 Jan 2024
ICMC-ASR: The ICASSP 2024 In-Car Multi-Channel Automatic Speech Recognition Challenge
He Wang
Pengcheng Guo
Yue Li
Aoting Zhang
Jiayao Sun
...
Zhuo Chen
Jian Wu
Longbiao Wang
Chng Eng Siong
Sun Li
77
6
0
07 Jan 2024
Self-supervised Pretraining for Robust Personalized Voice Activity Detection in Adverse Conditions
H. S. Bovbjerg
Jesper Jensen
Jan Østergaard
Zheng-Hua Tan
VLM
64
3
0
27 Dec 2023
The NUS-HLT System for ICASSP2024 ICMC-ASR Grand Challenge
Meng Ge
Yizhou Peng
Yidi Jiang
Jingru Lin
Junyi Ao
Mehmet Sinan Yildirim
Shuai Wang
Haizhou Li
Mengling Feng
49
0
0
26 Dec 2023
EEND-DEMUX: End-to-End Neural Speaker Diarization via Demultiplexed Speaker Embeddings
Sung Hwan Mun
Mingrui Han
Canyeong Moon
Nam Soo Kim
74
1
0
11 Dec 2023
DiaPer: End-to-End Neural Diarization with Perceiver-Based Attractors
Federico Landini
Mireia Díez
Themos Stafylakis
Lukáš Burget
76
14
0
07 Dec 2023
Spatial Diarization for Meeting Transcription with Ad-Hoc Acoustic Sensor Networks
Tobias Gburrek
Joerg Schmalenstroeer
Reinhold Haeb-Umbach
62
2
0
27 Nov 2023
Multi-channel Conversational Speaker Separation via Neural Diarization
H. Taherian
DeLiang Wang
BDL
74
17
0
15 Nov 2023
End-to-end Online Speaker Diarization with Target Speaker Tracking
Weiqing Wang
Ming Li
69
5
0
12 Oct 2023
Meeting Recognition with Continuous Speech Separation and Transcription-Supported Diarization
Thilo von Neumann
Christoph Boeddeker
Tobias Cord-Landwehr
Marc Delcroix
Reinhold Haeb-Umbach
95
8
0
28 Sep 2023
PP-MeT: a Real-world Personalized Prompt based Meeting Transcription System
Xiang Lyu
Yuhang Cao
Qing Wang
Jingjing Yin
Yuguang Yang
Pengpeng Zou
G. Zachmann
Heng Lu
VLM
55
3
0
28 Sep 2023
The second multi-channel multi-party meeting transcription challenge (M2MeT 2.0): A benchmark for speaker-attributed ASR
Yuhao Liang
Mohan Shi
Fan Yu
Yangze Li
Shiliang Zhang
...
Jian Wu
Zhuo Chen
Kong Aik Lee
Zhijie Yan
Hui Bu
67
5
0
24 Sep 2023
NTT speaker diarization system for CHiME-7: multi-domain, multi-microphone End-to-end and vector clustering diarization
Naohiro Tawara
Marc Delcroix
Atsushi Ando
A. Ogawa
79
11
0
22 Sep 2023
Profile-Error-Tolerant Target-Speaker Voice Activity Detection
Dongmei Wang
Xiong Xiao
Naoyuki Kanda
Midia Yousefi
Takuya Yoshioka
Jian Wu
64
4
0
21 Sep 2023
Neural Speaker Diarization Using Memory-Aware Multi-Speaker Embedding with Sequence-to-Sequence Architecture
Gaobin Yang
Maokui He
Shutong Niu
Ruoyu Wang
Yanyan Yue
Shuangqing Qian
Shilong Wu
Jun Du
Chin-Hui Lee
91
12
0
17 Sep 2023
Towards Word-Level End-to-End Neural Speaker Diarization with Auxiliary Network
Yiling Huang
Weiran Wang
Guanlong Zhao
Hank Liao
Wei Xia
Quan Wang
62
4
0
15 Sep 2023
Attention-based Encoder-Decoder End-to-End Neural Diarization with Embedding Enhancer
Zhengyang Chen
Bing Han
Shuai Wang
Yan-min Qian
75
18
0
13 Sep 2023
Enhancing Speaker Diarization with Large Language Models: A Contextual Beam Search Approach
T. Park
Kunal Dhawan
Nithin Rao Koluguri
Jagadeesh Balam
93
17
0
11 Sep 2023