1906.07839
Cited By
The Second DIHARD Diarization Challenge: Dataset, task, and baselines
Interspeech (Interspeech), 2019
18 June 2019
Neville Ryant
Kenneth Church
C. Cieri
Alejandrina Cristià
Jun Du
Sriram Ganapathy
M. Liberman
Papers citing
"The Second DIHARD Diarization Challenge: Dataset, task, and baselines"
50 / 93 papers shown
Automated Analysis of Naturalistic Recordings in Early Childhood: Applications, Challenges, and Opportunities
Jialu Li
Marvin Lavechin
Xulin Fan
Nancy L. McElwain
Alejandrina Cristià
Paola Garcia-Perera
Mark Hasegawa-Johnson
129
2
0
22 Sep 2025
Can We Really Repurpose Multi-Speaker ASR Corpus for Speaker Diarization?
Shota Horiguchi
Naohiro Tawara
Takanori Ashihara
Atsushi Ando
Marc Delcroix
210
1
0
12 Jul 2025
StoryTeller: Improving Long Video Description through Global Audio-Visual Character Identification
Yichen He
Yuan Lin
Jianchao Wu
Hanchong Zhang
Yuchen Zhang
Ruicheng Le
VGen
VLM
874
6
0
11 Nov 2024
Quality-Aware End-to-End Audio-Visual Neural Speaker Diarization
Mao-Kui He
Jun Du
Shu-Tong Niu
Qing-Feng Liu
Chin-Hui Lee
232
2
0
15 Oct 2024
LS-EEND: Long-Form Streaming End-to-End Neural Diarization with Online Attractor Extraction
IEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2024
Di Liang
Xiaofei Li
400
2
0
09 Oct 2024
Unified Audio Event Detection
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Yidi Jiang
Ruijie Tao
Wen Huang
Qian Chen
Wen Wang
271
5
0
13 Sep 2024
The VoxCeleb Speaker Recognition Challenge: A Retrospective
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2024
Jaesung Huh
Joon Son Chung
Arsha Nagrani
A. Brown
Jee-weon Jung
Daniel Garcia-Romero
Andrew Zisserman
305
18
0
27 Aug 2024
Integrating Audio, Visual, and Semantic Information for Enhanced Multimodal Speaker Diarization
Luyao Cheng
Hui Wang
Siqi Zheng
Yafeng Chen
Rongjie Huang
Qinglin Zhang
Qian Chen
Xihao Li
263
6
0
22 Aug 2024
A Review of Common Online Speaker Diarization Methods
Roman Aperdannier
Sigurd Schacht
Alexander Piazza
270
0
0
20 Jun 2024
Spoof Diarization: "What Spoofed When" in Partially Spoofed Audio
Lin Zhang
Xin Wang
Erica Cooper
Mireia Díez
Federico Landini
Nicholas W. D. Evans
Junichi Yamagishi
234
3
0
12 Jun 2024
InaGVAD : a Challenging French TV and Radio Corpus Annotated for Speech Activity Detection and Speaker Gender Segmentation
International Conference on Language Resources and Evaluation (LREC), 2024
D. Doukhan
Christine Maertens
William Le Personnic
Ludovic Speroni
Reda Dehak
256
2
0
06 Jun 2024
A Semi-Automatic Approach to Create Large Gender- and Age-Balanced Speaker Corpora: Usefulness of Speaker Diarization & Identification
Rémi Uro
D. Doukhan
Albert Rilliard
Laëtitia Larcher
Anissa-Claire Adgharouamane
Marie Tahon
Antoine Laurent
249
5
0
26 Apr 2024
Spatial Diarization for Meeting Transcription with Ad-Hoc Acoustic Sensor Networks
Asilomar Conference on Signals, Systems and Computers (ACSSC), 2023
Tobias Gburrek
Joerg Schmalenstroeer
Reinhold Haeb-Umbach
207
3
0
27 Nov 2023
Summary of the DISPLACE Challenge 2023 - DIarization of SPeaker and LAnguage in Conversational Environments
Shikha Baghel
Shreyas Ramoji
Somil Jain
Pratik Roy Chowdhuri
Prachi Singh
Deepu Vijayasenan
Sriram Ganapathy
257
11
0
21 Nov 2023
Detecting agreement in multi-party dialogue: evaluating speaker diarisation versus a procedural baseline to enhance user engagement
Angus Addlesee
Daniel Denley
Andy Edmondson
Nancie Gunson
Daniel Hernández García
...
James Ndubuisi
Neil O'Reilly
Lia Perochaud
Raphael Valeri
M. Worika
186
4
0
06 Nov 2023
Discriminative Training of VBx Diarization
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Dominik Klement
Mireia Díez
Federico Landini
Lukáš Burget
Anna Silnova
Marc Delcroix
Naohiro Tawara
480
5
0
04 Oct 2023
Attention-based Encoder-Decoder End-to-End Neural Diarization with Embedding Enhancer
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Zhengyang Chen
Bing Han
Shuai Wang
Yan-min Qian
272
30
0
13 Sep 2023
Large-Scale Learning on Overlapped Speech Detection: New Benchmark and New General System
Zhao-Yu Yin
Jingguang Tian
Xinhui Hu
Xinkang Xu
Yang Xiang
280
2
0
11 Aug 2023
Joint speech and overlap detection: a benchmark over multiple audio setup and speech domains
Martin Lebourdais
Théo Mariotte
Marie Tahon
Anthony Larcher
Antoine Laurent
Silvio Montrésor
S. Meignier
Jean-Hugh Thomas
VLM
187
6
0
24 Jul 2023
Multi-Stream Extension of Variational Bayesian HMM Clustering (MS-VBx) for Combined End-to-End and Vector Clustering-based Diarization
Interspeech (Interspeech), 2023
Marc Delcroix
Naohiro Tawara
Mireia Díez
Federico Landini
Anna Silnova
A. Ogawa
Tomohiro Nakatani
L. Burget
S. Araki
206
7
0
23 May 2023
Lessons Learned in ATCO2: 5000 hours of Air Traffic Control Communications for Robust Automatic Speech Recognition and Understanding
Juan Pablo Zuluaga
Iuliia Nigmatulina
Amrutha Prasad
P. Motlíček
Driss Khalil
...
Allan Tart
Igor Szöke
Vincent Lenders
M. Rigault
K. Choukri
332
4
0
02 May 2023
Neural Diarization with Non-autoregressive Intermediate Attractors
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Yusuke Fujita
Tatsuya Komatsu
Robin Scheibler
Yusuke Kida
Tetsuji Ogawa
271
14
0
13 Mar 2023
TS-SEP: Joint Diarization and Separation Conditioned on Estimated Speaker Embeddings
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Christoph Boeddeker
Aswin Shanmugam Subramanian
Gordon Wichern
Reinhold Haeb-Umbach
Jonathan Le Roux
353
34
0
07 Mar 2023
DISPLACE Challenge: DIarization of SPeaker and LAnguage in Conversational Environments
Interspeech (Interspeech), 2023
Shikha Baghel
Shreyas Ramoji
Sidharth Sidharth
Ranjana H
Prachi Singh
...
Pratik Roy Chowdhuri
Kaustubh Kulkarni
Swapnil Padhi
Deepu Vijayasenan
Sriram Ganapathy
353
9
0
01 Mar 2023
VoxSRC 2022: The Fourth VoxCeleb Speaker Recognition Challenge
Jaesung Huh
A. Brown
Jee-weon Jung
Joon Son Chung
Arsha Nagrani
D. Garcia-Romero
Andrew Zisserman
284
31
0
20 Feb 2023
Probabilistic Back-ends for Online Speaker Recognition and Clustering
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
A. Sholokhov
Nikita Kuzmin
Kong Aik Lee
Chng Eng Siong
148
2
0
19 Feb 2023
Speaker Overlap-aware Neural Diarization for Multi-party Meeting Analysis
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Zhihao Du
Shiliang Zhang
Siqi Zheng
Zhijie Yan
137
20
0
18 Nov 2022
Absolute decision corrupts absolutely: conservative online speaker diarisation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Youngki Kwon
Hee-Soo Heo
Bong-Jin Lee
You Jin Kim
Jee-weon Jung
181
6
0
09 Nov 2022
BER: Balanced Error Rate For Speaker Diarization
Tao Liu
K. Yu
149
4
0
08 Nov 2022
No-audio speaking status detection in crowded settings via visual pose-based filtering and wearable acceleration
Jose Vargas-Quiros
Laura Cabrera-Quiros
Hayley Hung
331
3
0
01 Nov 2022
Multitask Detection of Speaker Changes, Overlapping Speech and Voice Activity Using wav2vec 2.0
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Marie Kunesova
Zbynek Zajíc
SSL
VLM
158
21
0
26 Oct 2022
The Conversational Short-phrase Speaker Diarization (CSSD) Task: Dataset, Evaluation Metric and Baselines
International Symposium on Chinese Spoken Language Processing (ISCSLP), 2022
Gaofeng Cheng
Yifan Chen
Runyan Yang
Qingxu Li
Zehui Yang
...
Qingqing Zhang
Linfu Xie
Y. Qian
Kong Aik Lee
Yonghong Yan
185
9
0
17 Aug 2022
Robust Acoustic Domain Identification with its Application to Speaker Diarization
International Journal of Speech Technology (IJST), 2022
Kishore Kumar A
Shefali Waldekar
Md. Sahidullah
G. Saha
236
0
0
05 Aug 2022
Online Neural Diarization of Unlimited Numbers of Speakers Using Global and Local Attractors
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Shota Horiguchi
Shinji Watanabe
Leibny Paola García-Perera
Yuki Takashima
Yohei Kawaguchi
312
30
0
06 Jun 2022
Baselines and Protocols for Household Speaker Recognition
The Speaker and Language Recognition Workshop (Odyssey), 2022
A. Sholokhov
Xuechen Liu
Md. Sahidullah
Tomi Kinnunen
271
4
0
30 Apr 2022
Generation of Speaker Representations Using Heterogeneous Training Batch Assembly
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2021
Yu-Huai Peng
Hung-Shin Lee
Pin-Tuan Huang
Hsin-Min Wang
124
0
0
30 Mar 2022
Multi-target Extractor and Detector for Unknown-number Speaker Diarization
IEEE Signal Processing Letters (SPL), 2022
Chin-Yi Cheng
Hung-Shin Lee
Yu Tsao
Hsin-Min Wang
275
12
0
30 Mar 2022
Speaker Embedding-aware Neural Diarization: an Efficient Framework for Overlapping Speech Diarization in Meeting Scenarios
Zhihao Du
Shiliang Zhang
Siqi Zheng
Zhijie Yan
190
2
0
18 Mar 2022
Magnitude-aware Probabilistic Speaker Embeddings
The Speaker and Language Recognition Workshop (Odyssey), 2022
Nikita Kuzmin
Igor Fedorov
A. Sholokhov
306
7
0
28 Feb 2022
Summary On The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Grand Challenge
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Fan Yu
Shiliang Zhang
Pengcheng Guo
Yihui Fu
Zhihao Du
...
Kong Aik Lee
Zhijie Yan
B. Ma
Xin Xu
Hui Bu
234
28
0
08 Feb 2022
VoxSRC 2021: The Third VoxCeleb Speaker Recognition Challenge
A. Brown
Jaesung Huh
Joon Son Chung
Arsha Nagrani
Daniel Garcia-Romero
Andrew Zisserman
237
46
0
12 Jan 2022
Shennong: a Python toolbox for audio speech features extraction
Mathieu Bernard
Maxime Poli
Julien Karadayi
Emmanuel Dupoux
249
9
0
10 Dec 2021
X-Vector based voice activity detection for multi-genre broadcast speech-to-text
Misa Ogura
Matt Haynes
235
1
0
09 Dec 2021
AVA-AVD: Audio-Visual Speaker Diarization in the Wild
ACM Multimedia (MM), 2021
Eric Z. Xu
Zeyang Song
Satoshi Tsutsui
C. Feng
Mang Ye
Mike Zheng Shou
VGen
545
59
0
29 Nov 2021
Speaker Embedding-aware Neural Diarization for Flexible Number of Speakers with Textual Information
Zhihao Du
Shiliang Zhang
Siqi Zheng
Weilong Huang
Ming Lei
BDL
260
2
0
28 Nov 2021
Auxiliary Loss of Transformer with Residual Connection for End-to-End Speaker Diarization
Yechan Yu
Dongkeon Park
Hyeongju Kim
240
24
0
14 Oct 2021
Ego4D: Around the World in 3,000 Hours of Egocentric Video
Kristen Grauman
Andrew Westbury
Eugene Byrne
Zachary Chavis
Antonino Furnari
...
Mike Zheng Shou
Antonio Torralba
Lorenzo Torresani
Mingfei Yan
Jitendra Malik
EgoV
1.2K
1,646
0
13 Oct 2021
BERTraffic: BERT-based Joint Speaker Role and Speaker Change Detection for Air Traffic Control Communications
Spoken Language Technology Workshop (SLT), 2021
Juan Pablo Zuluaga
Seyyed Saeed Sarfjoo
Amrutha Prasad
Iuliia Nigmatulina
P. Motlíček
Karel Ondrej
Oliver Ohneiser
H. Helmke
406
25
0
12 Oct 2021
Advancing the dimensionality reduction of speaker embeddings for speaker diarisation: disentangling noise and informing speech activity
You Jin Kim
Hee-Soo Heo
Jee-weon Jung
Youngki Kwon
Bong-Jin Lee
Joon Son Chung
293
3
0
07 Oct 2021
Multi-scale speaker embedding-based graph attention networks for speaker diarisation
Youngki Kwon
Hee-Soo Heo
Jee-weon Jung
You Jin Kim
Bong-Jin Lee
Joon Son Chung
280
20
0
07 Oct 2021