2001.11482
Continuous speech separation: dataset and analysis
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
30 January 2020
Zhuo Chen
Takuya Yoshioka
Liang Lu
Tianyan Zhou
Zhong Meng
Yi Luo
Jian Wu
Xiong Xiao
Jinyu Li
Papers citing
"Continuous speech separation: dataset and analysis"
50 / 137 papers shown
LOTUSDIS: A Thai far-field meeting corpus for robust conversational ASR
Pattara Tipaksorn
Sumonmas Thatphithakkul
Vataya Chunwijitra
Kwanchiva Thangthai
96
0
0
23 Sep 2025
TF-CorrNet: Leveraging Spatial Correlation for Continuous Speech Separation
IEEE Signal Processing Letters (IEEE SPL), 2025
Ui-Hyeop Shin
Bon Hyeok Ku
Hyung-Min Park
148
4
0
20 Sep 2025
From Independence to Interaction: Speaker-Aware Simulation of Multi-Speaker Conversational Timing
Máté Gedeon
Péter Mihajlik
152
2
0
19 Sep 2025
Error Analysis in a Modular Meeting Transcription System
Peter Vieting
Simon Berger
Thilo von Neumann
Christoph Boeddeker
Ralf Schluter
Reinhold Haeb-Umbach
222
0
0
12 Sep 2025
Unifying Diarization, Separation, and ASR with Multi-Speaker Encoder
Muhammad Shakeel
Yui Sudo
Yifan Peng
Chyi-Jiunn Lin
Shinji Watanabe
156
6
0
28 Aug 2025
Towards Low-Latency Tracking of Multiple Speakers With Short-Context Speaker Embeddings
Taous Iatariene
Alexandre Guérin
Romain Serizel
178
0
0
18 Aug 2025
Advances in Speech Separation: Techniques, Challenges, and Future Trends
Kai Li
Guo Chen
Wendi Sang
Yi Luo
Zhuo Chen
...
Shulin He
Zhong-Qiu Wang
Andong Li
Z. Wu
Xiaolin Hu
AI4TS
222
7
0
14 Aug 2025
Spatio-spectral diarization of meetings by combining TDOA-based segmentation and speaker embedding-based clustering
Tobias Cord-Landwehr
Tobias Gburrek
Marc Deegen
Reinhold Haeb-Umbach
269
3
0
19 Jun 2025
Attractor-Based Speech Separation of Multiple Utterances by Unknown Number of Speakers
Yuzhu Wang
Archontis Politis
Konstantinos Drossos
Maria Sandsten
212
1
0
22 May 2025
The Multimodal Information Based Speech Processing (MISP) 2025 Challenge: Audio-Visual Diarization and Recognition
Ming Gao
Shilong Wu
Hang Chen
Jun Du
Chin-Hui Lee
Shinji Watanabe
Jingdong Chen
Siniscalchi Sabato Marco
O. Scharenborg
366
7
0
20 May 2025
Survey of End-to-End Multi-Speaker Automatic Speech Recognition for Monaural Audio
Xinlu He
Jacob Whitehill
354
4
0
16 May 2025
TS-SUPERB: A Target Speech Processing Benchmark for Speech Self-Supervised Learning Models
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Junyi Peng
Takanori Ashihara
Marc Delcroix
Tsubasa Ochiai
Oldrich Plchot
Shoko Araki
J. Černocký
ELM
327
3
0
10 May 2025
Elevating Robust Multi-Talker ASR by Decoupling Speaker Separation and Speech Recognition
Yufeng Yang
H. Taherian
Vahid Ahmadi Kalkhorani
DeLiang Wang
232
0
0
23 Mar 2025
Summary of the NOTSOFAR-1 Challenge: Highlights and Learnings
Computer Speech and Language (CSL), 2025
Igor Abramovski
Alon Vinnikov
Shalev Shaer
Naoyuki Kanda
Xiaofei Wang
Amir Ivry
Eyal Krupka
373
5
0
28 Jan 2025
USED: Universal Speaker Extraction and Diarization
IEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2023
Junyi Ao
Mehmet Sinan Yildirim
Ruijie Tao
Mengyao Ge
Shuai Wang
Yan-min Qian
Haizhou Li
471
16
0
17 Jan 2025
Microphone Array Signal Processing and Deep Learning for Speech Enhancement
IEEE Signal Processing Magazine (IEEE Signal Process. Mag.), 2024
Reinhold Haeb-Umbach
Tomohiro Nakatani
Marc Delcroix
Christoph Boeddeker
Tsubasa Ochiai
285
6
0
13 Jan 2025
DiCoW: Diarization-Conditioned Whisper for Target Speaker Automatic Speech Recognition
Alexander Polok
Dominik Klement
M. Kocour
Jiangyu Han
Federico Landini
Bolaji Yusuf
Sanjeev Khudanpur
J. Černocký
L. Burget
311
0
0
03 Jan 2025
MSA-ASR: Efficient Multilingual Speaker Attribution with frozen ASR Models
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Thai-Binh Nguyen
Alexander Waibel
299
4
0
27 Nov 2024
StoryTeller: Improving Long Video Description through Global Audio-Visual Character Identification
Yichen He
Yuan Lin
Jianchao Wu
Hanchong Zhang
Yuchen Zhang
Ruicheng Le
VGen
VLM
874
6
0
11 Nov 2024
SonicSim: A customizable simulation platform for speech processing in moving sound source scenarios
International Conference on Learning Representations (ICLR), 2024
Kai Li
Wendi Sang
Chang Zeng
Runxuan Yang
Guo Chen
Xiaolin Hu
360
9
0
02 Oct 2024
Improving curriculum learning for target speaker extraction with synthetic speakers
Spoken Language Technology Workshop (SLT), 2024
Yun Liu
Xuechen Liu
Junichi Yamagishi
206
1
0
01 Oct 2024
A Framework for Synthetic Audio Conversations Generation using Large Language Models
Kaung Myat Kyaw
Jonathan Hoyin Chan
SyDa
350
3
0
02 Sep 2024
RAVSS: Robust Audio-Visual Speech Separation in Multi-Speaker Scenarios with Missing Visual Cues
ACM Multimedia (MM), 2024
Tianrui Pan
Jie Liu
Bohan Wang
Jie Tang
Gangshan Wu
327
6
0
27 Jul 2024
RealMAN: A Real-Recorded and Annotated Microphone Array Dataset for Dynamic Speech Enhancement and Localization
Bing Yang
Changsheng Quan
Yabo Wang
Pengyu Wang
Yujie Yang
Ying Fang
Nian Shao
Hui Bu
Xin Xu
Xiaofei Li
257
22
0
28 Jun 2024
A Review of Common Online Speaker Diarization Methods
Roman Aperdannier
Sigurd Schacht
Alexander Piazza
267
0
0
20 Jun 2024
Can Large Language Models Understand Spatial Audio?
Changli Tang
Wenyi Yu
Guangzhi Sun
Xianzhao Chen
Tian Tan
...
Jun Zhang
Lu Lu
Zejun Ma
Yuxuan Wang
Chao Zhang
426
20
0
12 Jun 2024
Cross-Talk Reduction
Zhong-Qiu Wang
Anurag Kumar
Shinji Watanabe
214
5
0
30 May 2024
Jointly Recognizing Speech and Singing Voices Based on Multi-Task Audio Source Separation
Ye Bai
Chenxing Li
Hao Li
Yuanyuan Zhao
Xiaorui Wang
293
2
0
17 Apr 2024
Speech Robust Bench: A Robustness Benchmark For Speech Recognition
International Conference on Learning Representations (ICLR), 2024
Muhammad A. Shah
David Solans Noguero
Mikko A. Heikkilä
Nicolas Kourtellis
325
17
0
08 Mar 2024
A cross-talk robust multichannel VAD model for multiparty agent interactions trained using synthetic re-recordings
Hyewon Han
Naveen Kumar
173
2
0
15 Feb 2024
Online speaker diarization of meetings guided by speech separation
Elio Gruttadauria
Mathieu Fontaine
S. Essid
254
8
0
30 Jan 2024
NOTSOFAR-1 Challenge: New Datasets, Baseline, and Tasks for Distant Meeting Transcription
Interspeech (Interspeech), 2024
Alon Vinnikov
Amir Ivry
Aviv Hurvitz
Igor Abramovski
S. Koubi
...
S. Sivasankaran
Yifan Gong
Min Tang
Huaming Wang
Eyal Krupka
270
51
0
16 Jan 2024
Self-Supervised Learning of Spatial Acoustic Representation with Cross-Channel Signal Reconstruction and Multi-Channel Conformer
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Bing Yang
Xiaofei Li
SSL
382
5
0
01 Dec 2023
Summary of the DISPLACE Challenge 2023 - DIarization of SPeaker and LAnguage in Conversational Environments
Shikha Baghel
Shreyas Ramoji
Somil Jain
Pratik Roy Chowdhuri
Prachi Singh
Deepu Vijayasenan
Sriram Ganapathy
257
11
0
21 Nov 2023
Multi-channel Conversational Speaker Separation via Neural Diarization
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
H. Taherian
DeLiang Wang
BDL
258
25
0
15 Nov 2023
Real-time Speech Enhancement and Separation with a Unified Deep Neural Network for Single/Dual Talker Scenarios
Kashyap Patel
A. Kovalyov
Issa Panahi
262
1
0
16 Oct 2023
Typing to Listen at the Cocktail Party: Text-Guided Target Speaker Extraction
IEEE Transactions on Cognitive and Developmental Systems (IEEE TCDS), 2023
Xiang Hao
Jibin Wu
Jianwei Yu
Chenglin Xu
Kay Chen Tan
421
19
0
11 Oct 2023
SA-Paraformer: Non-autoregressive End-to-End Speaker-Attributed ASR
Automatic Speech Recognition & Understanding (ASRU), 2023
Yangze Li
Fan Yu
Yuhao Liang
Pengcheng Guo
Mohan Shi
Zhihao Du
Shiliang Zhang
Lei Xie
229
5
0
07 Oct 2023
One model to rule them all? Towards End-to-End Joint Speaker Diarization and Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Samuele Cornell
Jee-weon Jung
Shinji Watanabe
S. Squartini
VLM
290
34
0
02 Oct 2023
Toward Universal Speech Enhancement for Diverse Input Conditions
Automatic Speech Recognition & Understanding (ASRU), 2023
Wangyou Zhang
Kohei Saijo
Zhong-Qiu Wang
Shinji Watanabe
Yanmin Qian
VLM
249
44
0
29 Sep 2023
Meeting Recognition with Continuous Speech Separation and Transcription-Supported Diarization
Thilo von Neumann
Christoph Boeddeker
Tobias Cord-Landwehr
Marc Delcroix
Reinhold Haeb-Umbach
328
14
0
28 Sep 2023
The second multi-channel multi-party meeting transcription challenge (M2MeT 2.0): A benchmark for speaker-attributed ASR
Automatic Speech Recognition & Understanding (ASRU), 2023
Yuhao Liang
Mohan Shi
Fan Yu
Yangze Li
Shiliang Zhang
...
Jian Wu
Zhuo Chen
Kong Aik Lee
Zhijie Yan
Hui Bu
305
9
0
24 Sep 2023
t-SOT FNT: Streaming Multi-talker ASR with Text-only Domain Adaptation Capability
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Jian Wu
Naoyuki Kanda
Takuya Yoshioka
Rui Zhao
Zhuo Chen
Jinyu Li
209
6
0
15 Sep 2023
Combining TF-GridNet and Mixture Encoder for Continuous Speech Separation for Meeting Transcription
Spoken Language Technology Workshop (SLT), 2023
Peter Vieting
Simon Berger
Thilo von Neumann
Christoph Boeddeker
Ralf Schluter
Reinhold Haeb-Umbach
311
0
0
15 Sep 2023
Convoifilter: A case study of doing cocktail party speech recognition
Thai-Binh Nguyen
A. Waibel
299
3
0
22 Aug 2023
LibriWASN: A Data Set for Meeting Separation, Diarization, and Recognition with Asynchronous Recording Devices
Joerg Schmalenstroeer
Tobias Gburrek
Reinhold Haeb-Umbach
183
4
0
21 Aug 2023
SpatialNet: Extensively Learning Spatial Information for Multichannel Joint Speech Separation, Denoising and Dereverberation
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Changsheng Quan
Xiaofei Li
222
94
0
31 Jul 2023
MeetEval: A Toolkit for Computation of Word Error Rates for Meeting Transcription Systems
Thilo von Neumann
Christoph Boeddeker
Marc Delcroix
Reinhold Haeb-Umbach
409
44
0
21 Jul 2023
Cascaded encoders for fine-tuning ASR models on overlapped speech
Interspeech (Interspeech), 2023
R. Rose
Oscar Chang
Olivier Siohan
173
2
0
28 Jun 2023
The CHiME-7 DASR Challenge: Distant Meeting Transcription with Multiple Devices in Diverse Scenarios
Samuele Cornell
Sanjeev Khudanpur
Shinji Watanabe
Desh Raj
Xuankai Chang
...
Matthew Maciejewski
Yoshiki Masuyama
Zhong-Qiu Wang
S. Squartini
Sanjeev Khudanpur
282
83
0
23 Jun 2023