The Second DIHARD Diarization Challenge: Dataset, task, and baselines

Interspeech (Interspeech), 2019

18 June 2019

Sriram Ganapathy

Papers citing "The Second DIHARD Diarization Challenge: Dataset, task, and baselines"

50 / 93 papers shown

Automated Analysis of Naturalistic Recordings in Early Childhood: Applications, Challenges, and Opportunities

Mark Hasegawa-Johnson

129

22 Sep 2025

Can We Really Repurpose Multi-Speaker ASR Corpus for Speaker Diarization?

210

12 Jul 2025

StoryTeller: Improving Long Video Description through Global Audio-Visual Character Identification

874

11 Nov 2024

Quality-Aware End-to-End Audio-Visual Neural Speaker Diarization

232

15 Oct 2024

LS-EEND: Long-Form Streaming End-to-End Neural Diarization with Online Attractor Extraction

IEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2024

Di Liang

Xiaofei Li

400

09 Oct 2024

Unified Audio Event Detection

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

Yidi Jiang

Ruijie Tao

Wen Huang

Qian Chen

Wen Wang

271

13 Sep 2024

The VoxCeleb Speaker Recognition Challenge: A Retrospective

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2024

305

27 Aug 2024

Integrating Audio, Visual, and Semantic Information for Enhanced Multimodal Speaker Diarization

Luyao Cheng

Hui Wang

Siqi Zheng

Yafeng Chen

Rongjie Huang

Qinglin Zhang

Qian Chen

Xihao Li

263

22 Aug 2024

A Review of Common Online Speaker Diarization Methods

Roman Aperdannier

Sigurd Schacht

Alexander Piazza

270

20 Jun 2024

Spoof Diarization: "What Spoofed When" in Partially Spoofed Audio

Lin Zhang

Xin Wang

234

12 Jun 2024

InaGVAD : a Challenging French TV and Radio Corpus Annotated for Speech Activity Detection and Speaker Gender Segmentation

International Conference on Language Resources and Evaluation (LREC), 2024

256

06 Jun 2024

A Semi-Automatic Approach to Create Large Gender- and Age-Balanced Speaker Corpora: Usefulness of Speaker Diarization & Identification

Anissa-Claire Adgharouamane

Marie Tahon

Antoine Laurent

249

26 Apr 2024

Spatial Diarization for Meeting Transcription with Ad-Hoc Acoustic Sensor Networks

Asilomar Conference on Signals, Systems and Computers (ACSSC), 2023

Tobias Gburrek

Joerg Schmalenstroeer

Reinhold Haeb-Umbach

207

27 Nov 2023

Summary of the DISPLACE Challenge 2023 - DIarization of SPeaker and LAnguage in Conversational Environments

Pratik Roy Chowdhuri

Sriram Ganapathy

257

21 Nov 2023

Detecting agreement in multi-party dialogue: evaluating speaker diarisation versus a procedural baseline to enhance user engagement

Daniel Hernández García

...

186

06 Nov 2023

Discriminative Training of VBx Diarization

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Dominik Klement

480

04 Oct 2023

Attention-based Encoder-Decoder End-to-End Neural Diarization with Embedding Enhancer

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023

Zhengyang Chen

Bing Han

Shuai Wang

Yan-min Qian

272

13 Sep 2023

Large-Scale Learning on Overlapped Speech Detection: New Benchmark and New General System

280

11 Aug 2023

Joint speech and overlap detection: a benchmark over multiple audio setup and speech domains

187

24 Jul 2023

Multi-Stream Extension of Variational Bayesian HMM Clustering (MS-VBx) for Combined End-to-End and Vector Clustering-based Diarization

Interspeech (Interspeech), 2023

206

23 May 2023

Lessons Learned in ATCO2: 5000 hours of Air Traffic Control Communications for Robust Automatic Speech Recognition and Understanding

...

332

02 May 2023

Neural Diarization with Non-autoregressive Intermediate Attractors

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

271

13 Mar 2023

TS-SEP: Joint Diarization and Separation Conditioned on Estimated Speaker Embeddings

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023

Christoph Boeddeker

Aswin Shanmugam Subramanian

Gordon Wichern

Reinhold Haeb-Umbach

Jonathan Le Roux

353

07 Mar 2023

DISPLACE Challenge: DIarization of SPeaker and LAnguage in Conversational Environments

Interspeech (Interspeech), 2023

...

Pratik Roy Chowdhuri

Kaustubh Kulkarni

Swapnil Padhi

Deepu Vijayasenan

Sriram Ganapathy

353

01 Mar 2023

VoxSRC 2022: The Fourth VoxCeleb Speaker Recognition Challenge

284

20 Feb 2023

Probabilistic Back-ends for Online Speaker Recognition and Clustering

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

A. Sholokhov

Nikita Kuzmin

Kong Aik Lee

Chng Eng Siong

148

19 Feb 2023

Speaker Overlap-aware Neural Diarization for Multi-party Meeting Analysis

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

Zhihao Du

Shiliang Zhang

Siqi Zheng

Zhijie Yan

137

18 Nov 2022

Absolute decision corrupts absolutely: conservative online speaker diarisation

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

181

09 Nov 2022

BER: Balanced Error Rate For Speaker Diarization

Tao Liu

K. Yu

149

08 Nov 2022

No-audio speaking status detection in crowded settings via visual pose-based filtering and wearable acceleration

Jose Vargas-Quiros

Laura Cabrera-Quiros

Hayley Hung

331

01 Nov 2022

Multitask Detection of Speaker Changes, Overlapping Speech and Voice Activity Using wav2vec 2.0

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Marie Kunesova

Zbynek Zajíc

158

26 Oct 2022

The Conversational Short-phrase Speaker Diarization (CSSD) Task: Dataset, Evaluation Metric and Baselines

International Symposium on Chinese Spoken Language Processing (ISCSLP), 2022

...

Kong Aik Lee

185

17 Aug 2022

Robust Acoustic Domain Identification with its Application to Speaker Diarization

International Journal of Speech Technology (IJST), 2022

236

05 Aug 2022

Online Neural Diarization of Unlimited Numbers of Speakers Using Global and Local Attractors

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022

Shota Horiguchi

Shinji Watanabe

Leibny Paola García-Perera

Yuki Takashima

Yohei Kawaguchi

312

06 Jun 2022

Baselines and Protocols for Household Speaker Recognition

The Speaker and Language Recognition Workshop (Odyssey), 2022

A. Sholokhov

Xuechen Liu

Md. Sahidullah

Tomi Kinnunen

271

30 Apr 2022

Generation of Speaker Representations Using Heterogeneous Training Batch Assembly

Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2021

Yu-Huai Peng

Hung-Shin Lee

Pin-Tuan Huang

Hsin-Min Wang

124

30 Mar 2022

Multi-target Extractor and Detector for Unknown-number Speaker Diarization

IEEE Signal Processing Letters (SPL), 2022

Chin-Yi Cheng

Hung-Shin Lee

Yu Tsao

Hsin-Min Wang

275

30 Mar 2022

Speaker Embedding-aware Neural Diarization: an Efficient Framework for Overlapping Speech Diarization in Meeting Scenarios

Zhihao Du

Shiliang Zhang

Siqi Zheng

Zhijie Yan

190

18 Mar 2022

Magnitude-aware Probabilistic Speaker Embeddings

The Speaker and Language Recognition Workshop (Odyssey), 2022

Nikita Kuzmin

Igor Fedorov

A. Sholokhov

306

28 Feb 2022

Summary On The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Grand Challenge

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

...

Kong Aik Lee

Zhijie Yan

B. Ma

Xin Xu

Hui Bu

234

08 Feb 2022

VoxSRC 2021: The Third VoxCeleb Speaker Recognition Challenge

Joon Son Chung

237

12 Jan 2022

Shennong: a Python toolbox for audio speech features extraction

249

10 Dec 2021

X-Vector based voice activity detection for multi-genre broadcast speech-to-text

Misa Ogura

Matt Haynes

235

09 Dec 2021

AVA-AVD: Audio-Visual Speaker Diarization in the Wild

ACM Multimedia (MM), 2021

545

29 Nov 2021

Speaker Embedding-aware Neural Diarization for Flexible Number of Speakers with Textual Information

Siqi Zheng

260

28 Nov 2021

Auxiliary Loss of Transformer with Residual Connection for End-to-End Speaker Diarization

Yechan Yu

Dongkeon Park

Hyeongju Kim

240

14 Oct 2021

Ego4D: Around the World in 3,000 Hours of Egocentric Video

...

Antonio Torralba

Mingfei Yan

1.2K

1,646

13 Oct 2021

BERTraffic: BERT-based Joint Speaker Role and Speaker Change Detection for Air Traffic Control Communications

Spoken Language Technology Workshop (SLT), 2021

406

12 Oct 2021

Advancing the dimensionality reduction of speaker embeddings for speaker diarisation: disentangling noise and informing speech activity

293

07 Oct 2021

Multi-scale speaker embedding-based graph attention networks for speaker diarisation

280

07 Oct 2021