End-to-End Speaker Diarization for an Unknown Number of Speakers with Encoder-Decoder Based Attractors

20 May 2020
Shota Horiguchi
Yusuke Fujita
Shinji Watanabe
Yawen Xue
Kenji Nagamatsu

Papers citing "End-to-End Speaker Diarization for an Unknown Number of Speakers with Encoder-Decoder Based Attractors"

50 / 133 papers shown
Probabilistic Fusion and Calibration of Neural Speaker Diarization Models
Juan Ignacio Alvarez-Trejos
Sérgio A. Balanya
D. Ramos
Alicia Lozano-Diez
UQCV
247
0
0
27 Nov 2025
From Independence to Interaction: Speaker-Aware Simulation of Multi-Speaker Conversational Timing
Máté Gedeon
Péter Mihajlik
113
2
0
19 Sep 2025
Pushing the Limits of End-to-End Diarization
Samuel J. Broughton
Lahiru Samarakoon
154
2
0
18 Sep 2025
Character-Centric Understanding of Animated Movies
Zhongrui Gui
Junyu Xie
Tengda Han
Weidi Xie
Andrew Zisserman
174
1
0
15 Sep 2025
Streaming Sortformer: Speaker Cache-Based Online Speaker Diarization with Arrival-Time Ordering
Ivan Medennikov
Taejin Park
Weiqing Wang
He Huang
Kunal Dhawan
Jinhan Wang
Jagadeesh Balam
Boris Ginsburg
271
5
0
24 Jul 2025
From Long Videos to Engaging Clips: A Human-Inspired Video Editing Framework with Multimodal Narrative Understanding
Xiangfeng Wang
Xiao Li
Yadong Wei
Xueyu Song
Yang Song
...
Fangrui Zeng
Zaiyi Chen
Liu Liu
Gu Xu
Tong Xu
VGen
168
0
0
03 Jul 2025
Exploring Speaker Diarization with Mixture of Experts
Gaobin Yang
Maokui He
Shutong Niu
Ruoyu Wang
Hang Chen
Jun Du
MoE
197
0
0
17 Jun 2025
SC-SOT: Conditioning the Decoder on Diarized Speaker Information for End-to-End Overlapped Speech Recognition
Yuta Hirano
Sakriani Sakti
149
0
0
15 Jun 2025
Attractor-Based Speech Separation of Multiple Utterances by Unknown Number of Speakers
Yuzhu Wang
Archontis Politis
Konstantinos Drossos
Maria Sandsten
196
1
0
22 May 2025
Speaker Diarization for Low-Resource Languages Through Wav2vec Fine-Tuning
Abdulhady Abas Abdullah
S. H. Karim
Sara Azad Ahmed
Kanar R. Tariq
Tarik Ahmed Rashid
972
3
0
23 Apr 2025
Demographic Attributes Prediction from Speech Using WavLM Embeddings
Annual Conference on Information Sciences and Systems (CISS), 2025
Yuchen Yang
Thomas Thebaud
Najim Dehak
268
4
0
17 Feb 2025
SCDiar: a streaming diarization system based on speaker change detection and speech recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Naijun Zheng
Xucheng Wan
Kai Liu
Zhou Huan
184
0
0
28 Jan 2025
USED: Universal Speaker Extraction and Diarization
IEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2023
Junyi Ao
Mehmet Sinan Yildirim
Ruijie Tao
Mengyao Ge
Shuai Wang
Yan-min Qian
Haizhou Li
440
14
0
17 Jan 2025
Multiple Choice Learning for Efficient Speech Separation with Many Speakers
David Perera
François Derrida
Théo Mariotte
Gaël Richard
S. Essid
371
3
0
27 Nov 2024
Joint Training of Speaker Embedding Extractor, Speech and Overlap Detection for Diarization
Petr Pálka
Federico Landini
Dominik Klement
Mireia Díez
Anna Silnova
Marc Delcroix
L. Burget
VLM
215
1
0
04 Nov 2024
LS-EEND: Long-Form Streaming End-to-End Neural Diarization with Online Attractor Extraction
IEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2024
Di Liang
Xiaofei Li
383
2
0
09 Oct 2024
Incorporating Spatial Cues in Modular Speaker Diarization for Multi-channel Multi-party Meetings
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Ruoyu Wang
Shutong Niu
Gaobin Yang
Jun Du
Shuangqing Qian
Tian Gao
Jia Pan
319
5
0
25 Sep 2024
FruitsMusic: A Real-World Corpus of Japanese Idol-Group Songs
International Society for Music Information Retrieval Conference (ISMIR), 2024
Hitoshi Suda
Shunsuke Yoshida
Tomohiko Nakamura
Satoru Fukayama
Jun Ogata
244
3
0
19 Sep 2024
Leveraging Self-Supervised Learning for Speaker Diarization
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Jiangyu Han
Federico Landini
Johan Rohdin
Anna Silnova
Mireia Díez
Lukas Burget
362
38
0
14 Sep 2024
Unified Audio Event Detection
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Yidi Jiang
Ruijie Tao
Wen Huang
Qian Chen
Wen Wang
248
5
0
13 Sep 2024
Sortformer: A Novel Approach for Permutation-Resolved Speaker Supervision in Speech-to-Text Systems
Taejin Park
Ivan Medennikov
Kunal Dhawan
Weiqing Wang
He Huang
Nithin Rao Koluguri
Krishna Puvvada
Jagadeesh Balam
Boris Ginsburg
333
6
0
10 Sep 2024
Focus Agent: LLM-Powered Virtual Focus Group
International Conference on Intelligent Virtual Agents (IVA), 2024
Taiyu Zhang
Xuesong Zhang
Robbe Cools
Adalberto L. Simeone
LLMAG
232
7
0
03 Sep 2024
Resource-Efficient Adaptation of Speech Foundation Models for Multi-Speaker ASR
Spoken Language Technology Workshop (SLT), 2024
Weiqing Wang
Kunal Dhawan
Taejin Park
Krishna Puvvada
Ivan Medennikov
Somshubra Majumdar
He Huang
Jagadeesh Balam
Boris Ginsburg
254
5
0
02 Sep 2024
The VoxCeleb Speaker Recognition Challenge: A Retrospective
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2024
Jaesung Huh
Joon Son Chung
Arsha Nagrani
A. Brown
Jee-weon Jung
Daniel Garcia-Romero
Andrew Zisserman
286
18
0
27 Aug 2024
Integrating Audio, Visual, and Semantic Information for Enhanced Multimodal Speaker Diarization
Luyao Cheng
Hui Wang
Siqi Zheng
Yafeng Chen
Rongjie Huang
Qinglin Zhang
Qian Chen
Xihao Li
244
5
0
22 Aug 2024
Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning
Shuai Wang
Zheng-Shou Chen
Kong Aik Lee
Yan-min Qian
Haizhou Li
369
28
0
21 Jul 2024
Leveraging Speaker Embeddings in End-to-End Neural Diarization for Two-Speaker Scenarios
Juan Ignacio Alvarez-Trejos
Beltrán Labrador
Alicia Lozano-Diez
366
2
0
01 Jul 2024
Speakers Unembedded: Embedding-free Approach to Long-form Neural Diarization
Xiang Li
Vivek Govindan
Rohit Paturi
S. Srinivasan
189
1
0
26 Jun 2024
Investigating Confidence Estimation Measures for Speaker Diarization
Anurag Chowdhury
Abhinav Misra
Mark C. Fuhs
Monika Woszczyna
238
0
0
24 Jun 2024
Song Data Cleansing for End-to-End Neural Singer Diarization Using Neural Analysis and Synthesis Framework
Hokuto Munakata
Ryo Terashima
Yusuke Fujita
229
0
0
24 Jun 2024
Speech-Aware Neural Diarization with Encoder-Decoder Attractor Guided by Attention Constraints
PeiYing Lee
HauYun Guo
Berlin Chen
211
0
0
21 Mar 2024
Continuous Target Speech Extraction: Enhancing Personalized Diarization and Extraction on Complex Recordings
IEEE International Joint Conference on Neural Network (IJCNN), 2024
He Zhao
Hangting Chen
Jianwei Yu
Yuehai Wang
219
1
0
29 Jan 2024
EEND-M2F: Masked-attention mask transformers for speaker diarization
Interspeech, 2024
Marc Härkönen
Samuel J. Broughton
Lahiru Samarakoon
360
21
0
23 Jan 2024
Boosting Unknown-number Speaker Separation with Transformer Decoder-based Attractor
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Younglo Lee
Shukjae Choi
Byeonghak Kim
Zhong-Qiu Wang
Shinji Watanabe
MoE
185
20
0
23 Jan 2024
Look, Listen and Recognise: Character-Aware Audio-Visual Subtitling
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Bruno Korbar
Jaesung Huh
Andrew Zisserman
257
8
0
22 Jan 2024
DiarizationLM: Speaker Diarization Post-Processing with Large Language Models
Quan Wang
Yiling Huang
Guanlong Zhao
Evan Clark
Wei Xia
Hank Liao
AuLLM
710
23
0
07 Jan 2024
EEND-DEMUX: End-to-End Neural Speaker Diarization via Demultiplexed Speaker Embeddings
Sung Hwan Mun
Mingrui Han
Canyeong Moon
Nam Soo Kim
268
1
0
11 Dec 2023
DiaPer: End-to-End Neural Diarization with Perceiver-Based Attractors
Federico Landini
Mireia Díez
Themos Stafylakis
Lukáš Burget
443
24
0
07 Dec 2023
Powerset multi-class cross entropy loss for neural speaker diarization
Interspeech, 2023
Alexis Plaquet
H. Bredin
372
191
0
19 Oct 2023
Property-Aware Multi-Speaker Data Simulation: A Probabilistic Modelling Technique for Synthetic Data Generation
T. Park
He Huang
Coleman Hooper
Nithin Rao Koluguri
Kunal Dhawan
Ante Jukić
Jagadeesh Balam
Boris Ginsburg
206
11
0
18 Oct 2023
End-to-end Online Speaker Diarization with Target Speaker Tracking
Weiqing Wang
Ming Li
356
8
0
12 Oct 2023
SA-Paraformer: Non-autoregressive End-to-End Speaker-Attributed ASR
Automatic Speech Recognition & Understanding (ASRU), 2023
Yangze Li
Fan Yu
Yuhao Liang
Pengcheng Guo
Mohan Shi
Zhihao Du
Shiliang Zhang
Lei Xie
219
5
0
07 Oct 2023
Discriminative Training of VBx Diarization
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Dominik Klement
Mireia Díez
Federico Landini
Lukáš Burget
Anna Silnova
Marc Delcroix
Naohiro Tawara
442
5
0
04 Oct 2023
Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based attractors
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Di Liang
Nian Shao
Xiaofei Li
209
7
0
25 Sep 2023
NTT speaker diarization system for CHiME-7: multi-domain, multi-microphone End-to-end and vector clustering diarization
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Naohiro Tawara
Marc Delcroix
Atsushi Ando
A. Ogawa
220
14
0
22 Sep 2023
Profile-Error-Tolerant Target-Speaker Voice Activity Detection
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Dongmei Wang
Xiong Xiao
Naoyuki Kanda
Midia Yousefi
Takuya Yoshioka
Jian Wu
270
7
0
21 Sep 2023
Towards Word-Level End-to-End Neural Speaker Diarization with Auxiliary Network
Yiling Huang
Weiran Wang
Guanlong Zhao
Hank Liao
Wei Xia
Quan Wang
331
7
0
15 Sep 2023
DiaCorrect: Error Correction Back-end For Speaker Diarization
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Jiangyu Han
Federico Landini
Johan Rohdin
Mireia Díez
Lukáš Burget
Yuhang Cao
Heng Lu
J. Černocký
221
5
0
15 Sep 2023
Attention-based Encoder-Decoder End-to-End Neural Diarization with Embedding Enhancer
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Zhengyang Chen
Bing Han
Shuai Wang
Yan-min Qian
255
30
0
13 Sep 2023
Speaker Diarization of Scripted Audiovisual Content
Yogesh Virkar
Brian Thompson
Rohit Paturi
S. Srinivasan
Marcello Federico
194
2
0
04 Aug 2023