2001.11482
Cited By
Continuous speech separation: dataset and analysis
30 January 2020
Zhuo Chen
Takuya Yoshioka
Liang Lu
Tianyan Zhou
Zhong Meng
Yi Luo
Jian Wu
Xiong Xiao
Jinyu Li
Papers citing
"Continuous speech separation: dataset and analysis"
50 / 128 papers shown
Title
The Multimodal Information Based Speech Processing (MISP) 2025 Challenge: Audio-Visual Diarization and Recognition
Ming Gao
Shilong Wu
Hang Chen
Jun Du
Chin-Hui Lee
Shinji Watanabe
Jingdong Chen
Sabato Marco Siniscalchi
O. Scharenborg
7
0
0
20 May 2025
Survey of End-to-End Multi-Speaker Automatic Speech Recognition for Monaural Audio
Xinlu He
Jacob Whitehill
19
0
0
16 May 2025
TS-SUPERB: A Target Speech Processing Benchmark for Speech Self-Supervised Learning Models
Junyi Peng
Takanori Ashihara
Marc Delcroix
Tsubasa Ochiai
Oldrich Plchot
Shoko Araki
J. Černocký
ELM
31
0
0
10 May 2025
Elevating Robust Multi-Talker ASR by Decoupling Speaker Separation and Speech Recognition
Yufeng Yang
H. Taherian
Vahid Ahmadi Kalkhorani
DeLiang Wang
44
0
0
23 Mar 2025
Summary of the NOTSOFAR-1 Challenge: Highlights and Learnings
Igor Abramovski
Alon Vinnikov
Shalev Shaer
Naoyuki Kanda
Xiaofei Wang
Amir Ivry
Eyal Krupka
41
0
0
28 Jan 2025
USED: Universal Speaker Extraction and Diarization
Junyi Ao
Mehmet Sinan Yildirim
Ruijie Tao
Mengyao Ge
Shuai Wang
Yan-min Qian
Haizhou Li
43
6
0
17 Jan 2025
Microphone Array Signal Processing and Deep Learning for Speech Enhancement
Reinhold Haeb-Umbach
Tomohiro Nakatani
Marc Delcroix
Christoph Boeddeker
Tsubasa Ochiai
43
0
0
13 Jan 2025
MSA-ASR: Efficient Multilingual Speaker Attribution with frozen ASR Models
Thai-Binh Nguyen
Alexander Waibel
82
1
0
27 Nov 2024
StoryTeller: Improving Long Video Description through Global Audio-Visual Character Identification
Yichen He
Yuan Lin
Jianchao Wu
Hanchong Zhang
Yuchen Zhang
Ruicheng Le
VGen
VLM
198
2
0
11 Nov 2024
SonicSim: A customizable simulation platform for speech processing in moving sound source scenarios
Kai Li
Wendi Sang
Chang Zeng
Runxuan Yang
Guo Chen
Xiaolin Hu
39
2
0
02 Oct 2024
Improving curriculum learning for target speaker extraction with synthetic speakers
Yun Liu
Xuechen Liu
Junichi Yamagishi
23
0
0
01 Oct 2024
A Framework for Synthetic Audio Conversations Generation using Large Language Models
Kaung Myat Kyaw
Jonathan Hoyin Chan
SyDa
42
2
0
02 Sep 2024
RAVSS: Robust Audio-Visual Speech Separation in Multi-Speaker Scenarios with Missing Visual Cues
Tianrui Pan
Jie Liu
Bohan Wang
Jie Tang
Gangshan Wu
40
2
0
27 Jul 2024
RealMAN: A Real-Recorded and Annotated Microphone Array Dataset for Dynamic Speech Enhancement and Localization
Bing Yang
Changsheng Quan
Yabo Wang
Pengyu Wang
Yujie Yang
Ying Fang
Nian Shao
Hui Bu
Xin Xu
Xiaofei Li
43
5
0
28 Jun 2024
A Review of Common Online Speaker Diarization Methods
Roman Aperdannier
Sigurd Schacht
Alexander Piazza
35
0
0
20 Jun 2024
Can Large Language Models Understand Spatial Audio?
Changli Tang
Wenyi Yu
Guangzhi Sun
Xianzhao Chen
Tian Tan
...
Jun Zhang
Lu Lu
Zejun Ma
Yuxuan Wang
Chao Zhang
49
4
0
12 Jun 2024
Cross-Talk Reduction
Zhong-Qiu Wang
Anurag Kumar
Shinji Watanabe
34
2
0
30 May 2024
Jointly Recognizing Speech and Singing Voices Based on Multi-Task Audio Source Separation
Ye Bai
Chenxing Li
Hao Li
Yuanyuan Zhao
Xiaorui Wang
24
0
0
17 Apr 2024
Speech Robust Bench: A Robustness Benchmark For Speech Recognition
Muhammad A. Shah
David Solans Noguero
Mikko A. Heikkilä
Nicolas Kourtellis
32
5
0
08 Mar 2024
A cross-talk robust multichannel VAD model for multiparty agent interactions trained using synthetic re-recordings
Hyewon Han
Naveen Kumar
18
1
0
15 Feb 2024
Online speaker diarization of meetings guided by speech separation
Elio Gruttadauria
Mathieu Fontaine
S. Essid
17
4
0
30 Jan 2024
NOTSOFAR-1 Challenge: New Datasets, Baseline, and Tasks for Distant Meeting Transcription
Alon Vinnikov
Amir Ivry
Aviv Hurvitz
Igor Abramovski
S. Koubi
...
S. Sivasankaran
Yifan Gong
Min Tang
Huaming Wang
Eyal Krupka
41
20
0
16 Jan 2024
Self-Supervised Learning of Spatial Acoustic Representation with Cross-Channel Signal Reconstruction and Multi-Channel Conformer
Bing Yang
Xiaofei Li
SSL
28
3
0
01 Dec 2023
Summary of the DISPLACE Challenge 2023 - DIarization of SPeaker and LAnguage in Conversational Environments
Shikha Baghel
Shreyas Ramoji
Somil Jain
Pratik Roy Chowdhuri
Prachi Singh
Deepu Vijayasenan
Sriram Ganapathy
30
6
0
21 Nov 2023
Multi-channel Conversational Speaker Separation via Neural Diarization
H. Taherian
DeLiang Wang
BDL
42
16
0
15 Nov 2023
Real-time Speech Enhancement and Separation with a Unified Deep Neural Network for Single/Dual Talker Scenarios
Kashyap Patel
A. Kovalyov
Issa Panahi
15
0
0
16 Oct 2023
Typing to Listen at the Cocktail Party: Text-Guided Target Speaker Extraction
Xiang Hao
Jibin Wu
Jianwei Yu
Chenglin Xu
Kay Chen Tan
32
10
0
11 Oct 2023
SA-Paraformer: Non-autoregressive End-to-End Speaker-Attributed ASR
Yangze Li
Fan Yu
Yuhao Liang
Pengcheng Guo
Mohan Shi
Zhihao Du
Shiliang Zhang
Lei Xie
24
3
0
07 Oct 2023
One model to rule them all ? Towards End-to-End Joint Speaker Diarization and Speech Recognition
Samuele Cornell
Jee-weon Jung
Shinji Watanabe
S. Squartini
VLM
32
16
0
02 Oct 2023
Toward Universal Speech Enhancement for Diverse Input Conditions
Wangyou Zhang
Kohei Saijo
Zhong-Qiu Wang
Shinji Watanabe
Yanmin Qian
VLM
32
19
0
29 Sep 2023
Meeting Recognition with Continuous Speech Separation and Transcription-Supported Diarization
Thilo von Neumann
Christoph Boeddeker
Tobias Cord-Landwehr
Marc Delcroix
Reinhold Haeb-Umbach
25
7
0
28 Sep 2023
The second multi-channel multi-party meeting transcription challenge (M2MeT 2.0): A benchmark for speaker-attributed ASR
Yuhao Liang
Mohan Shi
Fan Yu
Yangze Li
Shiliang Zhang
...
Jian Wu
Zhuo Chen
Kong Aik Lee
Zhijie Yan
Hui Bu
33
5
0
24 Sep 2023
t-SOT FNT: Streaming Multi-talker ASR with Text-only Domain Adaptation Capability
Jian Wu
Naoyuki Kanda
Takuya Yoshioka
Rui Zhao
Zhuo Chen
Jinyu Li
21
5
0
15 Sep 2023
Combining TF-GridNet and Mixture Encoder for Continuous Speech Separation for Meeting Transcription
Peter Vieting
Simon Berger
Thilo von Neumann
Christoph Boeddeker
Ralf Schlüter
Reinhold Haeb-Umbach
26
0
0
15 Sep 2023
Convoifilter: A case study of doing cocktail party speech recognition
Thai-Binh Nguyen
A. Waibel
17
2
0
22 Aug 2023
LibriWASN: A Data Set for Meeting Separation, Diarization, and Recognition with Asynchronous Recording Devices
Joerg Schmalenstroeer
Tobias Gburrek
Reinhold Haeb-Umbach
24
3
0
21 Aug 2023
SpatialNet: Extensively Learning Spatial Information for Multichannel Joint Speech Separation, Denoising and Dereverberation
Changsheng Quan
Xiaofei Li
18
36
0
31 Jul 2023
MeetEval: A Toolkit for Computation of Word Error Rates for Meeting Transcription Systems
Thilo von Neumann
Christoph Boeddeker
Marc Delcroix
Reinhold Haeb-Umbach
29
16
0
21 Jul 2023
Cascaded encoders for fine-tuning ASR models on overlapped speech
R. Rose
Oscar Chang
Olivier Siohan
29
1
0
28 Jun 2023
The CHiME-7 DASR Challenge: Distant Meeting Transcription with Multiple Devices in Diverse Scenarios
Samuele Cornell
Matthew Wiesner
Shinji Watanabe
Desh Raj
Xuankai Chang
...
Matthew Maciejewski
Yoshiki Masuyama
Zhong-Qiu Wang
S. Squartini
Sanjeev Khudanpur
32
51
0
23 Jun 2023
SURT 2.0: Advances in Transducer-based Multi-talker Speech Recognition
Desh Raj
Daniel Povey
Sanjeev Khudanpur
VLM
34
9
0
18 Jun 2023
Statistical Beamformer Exploiting Non-stationarity and Sparsity with Spatially Constrained ICA for Robust Speech Recognition
U.H Shin
Hyung-Min Park
15
2
0
13 Jun 2023
UNSSOR: Unsupervised Neural Speech Separation by Leveraging Over-determined Training Mixtures
Zhong-Qiu Wang
Shinji Watanabe
38
10
0
31 May 2023
An Experimental Review of Speaker Diarization methods with application to Two-Speaker Conversational Telephone Speech recordings
L. Serafini
Samuele Cornell
Giovanni Morrone
Enrico Zovato
Alessio Brutti
S. Squartini
49
9
0
29 May 2023
BA-SOT: Boundary-Aware Serialized Output Training for Multi-Talker ASR
Yuhao Liang
Fan Yu
Yangze Li
Pengcheng Guo
Shiliang Zhang
Qian Chen
Linfu Xie
33
8
0
23 May 2023
Fast Random Approximation of Multi-channel Room Impulse Response
Yi Luo
Rongzhi Gu
20
4
0
17 Apr 2023
AfroDigits: A Community-Driven Spoken Digit Dataset for African Languages
Chris C. Emezue
Sanchit Gandhi
Lewis Tunstall
Abubakar Abid
Josh Meyer
...
Douwe Kiela
Yacine Jernite
Julien Chaumond
Merve Noyan
Omar Sanseviero
33
2
0
22 Mar 2023
End-to-End Integration of Speech Separation and Voice Activity Detection for Low-Latency Diarization of Telephone Conversations
Giovanni Morrone
Samuele Cornell
L. Serafini
Enrico Zovato
Alessio Brutti
S. Squartini
23
4
0
21 Mar 2023
Online Binaural Speech Separation of Moving Speakers With a Wavesplit Network
Cong Han
N. Mesgarani
39
4
0
13 Mar 2023
TS-SEP: Joint Diarization and Separation Conditioned on Estimated Speaker Embeddings
Christoph Boeddeker
Aswin Shanmugam Subramanian
Gordon Wichern
Reinhold Haeb-Umbach
Jonathan Le Roux
49
23
0
07 Mar 2023