End-to-End Speaker Diarization for an Unknown Number of Speakers with Encoder-Decoder Based Attractors

20 May 2020

Papers citing "End-to-End Speaker Diarization for an Unknown Number of Speakers with Encoder-Decoder Based Attractors"

50 / 133 papers shown

Probabilistic Fusion and Calibration of Neural Speaker Diarization Models

Juan Ignacio Alvarez-Trejos

247

27 Nov 2025

From Independence to Interaction: Speaker-Aware Simulation of Multi-Speaker Conversational Timing

Máté Gedeon

Péter Mihajlik

113

19 Sep 2025

Pushing the Limits of End-to-End Diarization

Samuel J. Broughton

Lahiru Samarakoon

154

18 Sep 2025

Character-Centric Understanding of Animated Movies

174

15 Sep 2025

Streaming Sortformer: Speaker Cache-Based Online Speaker Diarization with Arrival-Time Ordering

271

24 Jul 2025

From Long Videos to Engaging Clips: A Human-Inspired Video Editing Framework with Multimodal Narrative Understanding

...

168

03 Jul 2025

Exploring Speaker Diarization with Mixture of Experts

197

17 Jun 2025

SC-SOT: Conditioning the Decoder on Diarized Speaker Information for End-to-End Overlapped Speech Recognition

Yuta Hirano

Sakriani Sakti

149

15 Jun 2025

Attractor-Based Speech Separation of Multiple Utterances by Unknown Number of Speakers

196

22 May 2025

Speaker Diarization for Low-Resource Languages Through Wav2vec Fine-Tuning

Abdulhady Abas Abdullah

972

23 Apr 2025

Demographic Attributes Prediction from Speech Using WavLM Embeddings

Annual Conference on Information Sciences and Systems (CISS), 2025

Yuchen Yang

Thomas Thebaud

Najim Dehak

268

17 Feb 2025

SCDiar: a streaming diarization system based on speaker change detection and speech recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025

184

28 Jan 2025

USED: Universal Speaker Extraction and Diarization

IEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2023

Junyi Ao

Mehmet Sinan Yildirim

440

17 Jan 2025

Multiple Choice Learning for Efficient Speech Separation with Many Speakers

371

27 Nov 2024

Joint Training of Speaker Embedding Extractor, Speech and Overlap Detection for Diarization

215

04 Nov 2024

LS-EEND: Long-Form Streaming End-to-End Neural Diarization with Online Attractor Extraction

IEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2024

Di Liang

Xiaofei Li

383

09 Oct 2024

Incorporating Spatial Cues in Modular Speaker Diarization for Multi-channel Multi-party Meetings

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

Jia Pan

319

25 Sep 2024

FruitsMusic: A Real-World Corpus of Japanese Idol-Group Songs

International Society for Music Information Retrieval Conference (ISMIR), 2024

Satoru Fukayama

244

19 Sep 2024

Leveraging Self-Supervised Learning for Speaker Diarization

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

Lukas Burget

362

14 Sep 2024

Unified Audio Event Detection

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

Yidi Jiang

Ruijie Tao

Wen Huang

Qian Chen

Wen Wang

248

13 Sep 2024

Sortformer: A Novel Approach for Permutation-Resolved Speaker Supervision in Speech-to-Text Systems

333

10 Sep 2024

Focus Agent: LLM-Powered Virtual Focus Group

International Conference on Intelligent Virtual Agents (IVA), 2024

232

03 Sep 2024

Resource-Efficient Adaptation of Speech Foundation Models for Multi-Speaker ASR

Spoken Language Technology Workshop (SLT), 2024

Weiqing Wang

Kunal Dhawan

Taejin Park

Jagadeesh Balam

Boris Ginsburg

254

02 Sep 2024

The VoxCeleb Speaker Recognition Challenge: A Retrospective

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2024

286

27 Aug 2024

Integrating Audio, Visual, and Semantic Information for Enhanced Multimodal Speaker Diarization

Luyao Cheng

Hui Wang

Siqi Zheng

Yafeng Chen

Rongjie Huang

Qinglin Zhang

Qian Chen

Xihao Li

244

22 Aug 2024

Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning

Shuai Wang

Zheng-Shou Chen

Kong Aik Lee

Yan-min Qian

Haizhou Li

369

21 Jul 2024

Leveraging Speaker Embeddings in End-to-End Neural Diarization for Two-Speaker Scenarios

Juan Ignacio Alvarez-Trejos

Beltrán Labrador

Alicia Lozano-Diez

366

01 Jul 2024

Speakers Unembedded: Embedding-free Approach to Long-form Neural Diarization

189

26 Jun 2024

Investigating Confidence Estimation Measures for Speaker Diarization

238

24 Jun 2024

Song Data Cleansing for End-to-End Neural Singer Diarization Using Neural Analysis and Synthesis Framework

Hokuto Munakata

Ryo Terashima

Yusuke Fujita

229

24 Jun 2024

Speech-Aware Neural Diarization with Encoder-Decoder Attractor Guided by Attention Constraints

PeiYing Lee

HauYun Guo

Berlin Chen

211

21 Mar 2024

Continuous Target Speech Extraction: Enhancing Personalized Diarization and Extraction on Complex Recordings

IEEE International Joint Conference on Neural Networks (IJCNN), 2024

219

29 Jan 2024

EEND-M2F: Masked-attention mask transformers for speaker diarization

Interspeech, 2024

Marc Härkönen

Samuel J. Broughton

Lahiru Samarakoon

360

23 Jan 2024

Boosting Unknown-number Speaker Separation with Transformer Decoder-based Attractor

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

Shinji Watanabe

185

23 Jan 2024

Look, Listen and Recognise: Character-Aware Audio-Visual Subtitling

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

Bruno Korbar

Jaesung Huh

Andrew Zisserman

257

22 Jan 2024

DiarizationLM: Speaker Diarization Post-Processing with Large Language Models

710

07 Jan 2024

EEND-DEMUX: End-to-End Neural Speaker Diarization via Demultiplexed Speaker Embeddings

268

11 Dec 2023

DiaPer: End-to-End Neural Diarization with Perceiver-Based Attractors

443

07 Dec 2023

Powerset multi-class cross entropy loss for neural speaker diarization

Interspeech, 2023

Alexis Plaquet

H. Bredin

372

191

19 Oct 2023

Property-Aware Multi-Speaker Data Simulation: A Probabilistic Modelling Technique for Synthetic Data Generation

T. Park

He Huang

Coleman Hooper

Nithin Rao Koluguri

Kunal Dhawan

Ante Jukić

Jagadeesh Balam

Boris Ginsburg

206

18 Oct 2023

End-to-end Online Speaker Diarization with Target Speaker Tracking

Weiqing Wang

Ming Li

356

12 Oct 2023

SA-Paraformer: Non-autoregressive End-to-End Speaker-Attributed ASR

Automatic Speech Recognition & Understanding (ASRU), 2023

Lei Xie

219

07 Oct 2023

Discriminative Training of VBx Diarization

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Dominik Klement

442

04 Oct 2023

Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based attractors

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Di Liang

Nian Shao

Xiaofei Li

209

25 Sep 2023

NTT speaker diarization system for CHiME-7: multi-domain, multi-microphone End-to-end and vector clustering diarization

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Naohiro Tawara

Marc Delcroix

Atsushi Ando

A. Ogawa

220

22 Sep 2023

Profile-Error-Tolerant Target-Speaker Voice Activity Detection

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

270

21 Sep 2023

Towards Word-Level End-to-End Neural Speaker Diarization with Auxiliary Network

331

15 Sep 2023

DiaCorrect: Error Correction Back-end For Speaker Diarization

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Jiangyu Han

Heng Lu

221

15 Sep 2023

Attention-based Encoder-Decoder End-to-End Neural Diarization with Embedding Enhancer

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023

Zhengyang Chen

Bing Han

Shuai Wang

Yan-min Qian

255

13 Sep 2023

Speaker Diarization of Scripted Audiovisual Content

194

04 Aug 2023