Wavesplit: End-to-End Speech Separation by Speaker Clustering

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2020
20 February 2020
Neil Zeghidour
David Grangier
    VLM

Papers citing "Wavesplit: End-to-End Speech Separation by Speaker Clustering"

50 / 149 papers shown
MARS-Sep: Multimodal-Aligned Reinforced Sound Separation
Zihan Zhang
Xize Cheng
Zhennan Jiang
Dongjie Fu
Jingyuan Chen
Zhou Zhao
Tao Jin
149
0
0
12 Oct 2025
Neural Speech Separation with Parallel Amplitude and Phase Spectrum Estimation
Fei Liu
Yang Ai
Zhen-Hua Ling
226
0
0
17 Sep 2025
A Study of the Scale Invariant Signal to Distortion Ratio in Speech Separation with Noisy References
Simon Dahl Jepsen
M. G. Christensen
Jesper Rindom Jensen
187
3
0
20 Aug 2025
Advances in Speech Separation: Techniques, Challenges, and Future Trends
Kai Li
Guo Chen
Wendi Sang
Yi Luo
Zhuo Chen
...
Shulin He
Zhong-Qiu Wang
Andong Li
Z. Wu
Xiaolin Hu
AI4TS
230
7
0
14 Aug 2025
SpectroStream: A Versatile Neural Codec for General Audio
Yunpeng Li
Kehang Han
Brian McWilliams
Zalan Borsos
Marco Tagliasacchi
115
5
0
07 Aug 2025
Whilter: A Whisper-based Data Filter for "In-the-Wild" Speech Corpora Using Utterance-level Multi-Task Classification
William Ravenscroft
George Close
Kit Bower-Morris
Jamie Stacey
Dmitry Sityaev
Kris Y. Hong
282
1
0
29 Jul 2025
Plug-and-Play Co-Occurring Face Attention for Robust Audio-Visual Speaker Extraction
Zexu Pan
Shengkui Zhao
Tingting Wang
Kun Zhou
Yukun Ma
Chong Zhang
B. Ma
279
0
0
27 May 2025
Attractor-Based Speech Separation of Multiple Utterances by Unknown Number of Speakers
Yuzhu Wang
Archontis Politis
Konstantinos Drossos
Maria Sandsten
212
1
0
22 May 2025
Listen to Extract: Onset-Prompted Target Speaker Extraction
Pengjie Shen
Kangrui Chen
Shulin He
Pengru Chen
Shuqi Yuan
He Kong
Xueliang Zhang
Zehao Wang
394
3
0
08 May 2025
SepALM: Audio Language Models Are Error Correctors for Robust Speech Separation
International Joint Conference on Artificial Intelligence (IJCAI), 2025
Zhaoxi Mu
Xinyu Yang
Gang Wang
AuLLMKELMVLM
531
2
0
06 May 2025
Contextual Speech Extraction: Leveraging Textual History as an Implicit Cue for Target Speech Extraction
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Minsu Kim
Rodrigo Mira
Honglie Chen
Stavros Petridis
Maja Pantic
347
3
0
13 Mar 2025
EDSep: An Effective Diffusion-Based Method for Speech Source Separation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Jinwei Dong
Xinsheng Wang
Qirong Mao
347
5
0
28 Jan 2025
Beyond Speaker Identity: Text Guided Target Speech Extraction
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Mingyue Huo
Abhinav Jain
Cong Phuoc Huynh
Fanjie Kong
Pichao Wang
Zhu Liu
Vimal Bhat
282
8
0
17 Jan 2025
Task-Aware Unified Source Separation
Kohei Saijo
Janek Ebbers
François Germain
Gordon Wichern
Jonathan Le Roux
316
8
0
31 Oct 2024
OmniSep: Unified Omni-Modality Sound Separation with Query-Mixup
Xize Cheng
Siqi Zheng
Zehan Wang
Minghui Fang
Ziang Zhang
...
Tianhao Shen
Shengpeng Ji
Jialong Zuo
Tao Jin
Zhou Zhao
261
14
0
28 Oct 2024
SepMamba: State-space models for speaker separation using Mamba
Thor Højhus Avenstrup
Boldizsár Elek
István László Mádi
András Bence Schin
Morten Mørup
Bjørn Sand Jensen
Kenny Falkær Olsen
Mamba
267
7
0
28 Oct 2024
WeSep: A Scalable and Flexible Toolkit Towards Generalizable Target Speaker Extraction
Interspeech (Interspeech), 2024
Shuai Wang
Ke Zhang
Shaoxiong Lin
Junjie Li
Xuefei Wang
Meng Ge
Jianwei Yu
Yanmin Qian
Haizhou Li
234
21
0
24 Sep 2024
Compositional Audio Representation Learning
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Sripathi Sridhar
Mark Cartwright
AI4TS
540
1
0
15 Sep 2024
USEF-TSE: Universal Speaker Embedding Free Target Speaker Extraction
IEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2024
Bang Zeng
Ming Li
489
20
0
04 Sep 2024
Improving Generalization of Speech Separation in Real-World Scenarios: Strategies in Simulation, Optimization, and Evaluation
Interspeech (Interspeech), 2024
Kai Chen
Jiaqi Su
Taylor Berg-Kirkpatrick
Shlomo Dubnov
Zeyu Jin
183
4
0
28 Aug 2024
TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement
International Workshop on Acoustic Signal Enhancement (IWAENC), 2024
Kohei Saijo
Gordon Wichern
François G. Germain
Zexu Pan
Jonathan Le Roux
190
42
0
06 Aug 2024
Towards a Universal Method for Meaningful Signal Detection
Louis Mahon
219
3
0
28 Jul 2024
Papez: Resource-Efficient Speech Separation with Auditory Working Memory
Hyunseok Oh
Juheon Yi
Youngki Lee
308
4
0
01 Jul 2024
Song Data Cleansing for End-to-End Neural Singer Diarization Using Neural Analysis and Synthesis Framework
Hokuto Munakata
Ryo Terashima
Yusuke Fujita
235
0
0
24 Jun 2024
Transcription-Free Fine-Tuning of Speech Separation Models for Noisy and Reverberant Multi-Speaker Automatic Speech Recognition
William Ravenscroft
George Close
Stefan Goetze
Thomas Hain
Mohammad Soleymanpour
Anurag Chowdhury
Mark C. Fuhs
307
2
0
13 Jun 2024
MambaMixer: Efficient Selective State Space Models with Dual Token and Channel Selection
Ali Behrouz
Michele Santacatterina
Ramin Zabih
525
49
0
29 Mar 2024
Dual-path Mamba: Short and Long-term Bidirectional Selective Structured State Space Models for Speech Separation
Xilin Jiang
Cong Han
N. Mesgarani
Mamba
317
81
0
27 Mar 2024
CrossNet: Leveraging Global, Cross-Band, Narrow-Band, and Positional Encoding for Single- and Multi-Channel Speaker Separation
Vahid Ahmadi Kalkhorani
DeLiang Wang
266
7
0
06 Mar 2024
ConSep: a Noise- and Reverberation-Robust Speech Separation Framework by Magnitude Conditioning
Kuan-Hsun Ho
J. Hung
Berlin Chen
219
0
0
04 Mar 2024
TDFNet: An Efficient Audio-Visual Speech Separation Model with Top-down Fusion
International Conference on Information and Software Technologies (ICIST), 2023
Samuel Pegg
Kai Li
Xiaolin Hu
291
2
0
25 Jan 2024
Boosting Unknown-number Speaker Separation with Transformer Decoder-based Attractor
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Younglo Lee
Shukjae Choi
Byeonghak Kim
Zhong-Qiu Wang
Shinji Watanabe
MoE
211
21
0
23 Jan 2024
Single-Microphone Speaker Separation and Voice Activity Detection in Noisy and Reverberant Environments
Renana Opochinsky
Mordehay Moradi
Sharon Gannot
248
10
0
07 Jan 2024
MossFormer2: Combining Transformer and RNN-Free Recurrent Network for Enhanced Time-Domain Monaural Speech Separation
Shengkui Zhao
Yukun Ma
Chongjia Ni
Chong Zhang
Hao Wang
Trung Hieu Nguyen
Kun Zhou
J. Yip
Dianwen Ng
Bin Ma
324
74
0
19 Dec 2023
Improving Label Assignments Learning by Dynamic Sample Dropout Combined with Layer-wise Optimization in Speech Separation
Interspeech (Interspeech), 2023
Chenyu Gao
Yue Gu
I. Marsic
330
0
0
20 Nov 2023
Multi-channel Conversational Speaker Separation via Neural Diarization
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
H. Taherian
DeLiang Wang
BDL
262
25
0
15 Nov 2023
On Time Domain Conformer Models for Monaural Speech Separation in Noisy Reverberant Acoustic Environments
Automatic Speech Recognition & Understanding (ASRU), 2023
William Ravenscroft
Stefan Goetze
Thomas Hain
264
7
0
09 Oct 2023
SPGM: Prioritizing Local Features for enhanced speech separation performance
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
J. Yip
Shengkui Zhao
Yukun Ma
Chongjia Ni
Chong Zhang
...
Trung Hieu Nguyen
Kun Zhou
Dianwen Ng
Eng Siong Chng
B. Ma
MoEVLM
300
6
0
22 Sep 2023
Sampling-Frequency-Independent Universal Sound Separation
Tomohiko Nakamura
Kohei Yatabe
201
0
0
22 Sep 2023
Combining TF-GridNet and Mixture Encoder for Continuous Speech Separation for Meeting Transcription
Spoken Language Technology Workshop (SLT), 2023
Peter Vieting
Simon Berger
Thilo von Neumann
Christoph Boeddeker
Ralf Schluter
Reinhold Haeb-Umbach
315
0
0
15 Sep 2023
Analysis of Speech Separation Performance Degradation on Emotional Speech Mixtures
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2023
J. Yip
Dianwen Ng
Bin Ma
Chng Eng Siong
255
1
0
14 Sep 2023
IIANet: An Intra- and Inter-Modality Attention Network for Audio-Visual Speech Separation
International Conference on Machine Learning (ICML), 2023
Kai Li
Run Yang
Fuchun Sun
Xiaolin Hu
342
26
0
16 Aug 2023
Complete and separate: Conditional separation with missing target source attribute completion
IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2023
Dimitrios Bralios
Efthymios Tzinis
Paris Smaragdis
235
0
0
27 Jul 2023
Mixture Encoder for Joint Speech Separation and Recognition
Interspeech (Interspeech), 2023
Simon Berger
Peter Vieting
Christoph Boeddeker
Ralf Schluter
Reinhold Häb-Umbach
238
8
0
21 Jun 2023
Algorithms of Sampling-Frequency-Independent Layers for Non-integer Strides
European Signal Processing Conference (EUSIPCO), 2023
Kanami Imamura
Tomohiko Nakamura
Norihiro Takamune
Kohei Yatabe
Hiroshi Saruwatari
177
5
0
19 Jun 2023
A Teacher-Student approach for extracting informative speaker embeddings from speech mixtures
Interspeech (Interspeech), 2023
Tobias Cord-Landwehr
Christoph Boeddeker
Catalin Zorila
R. Doddipatla
Reinhold Haeb-Umbach
374
5
0
01 Jun 2023
UNSSOR: Unsupervised Neural Speech Separation by Leveraging Over-determined Training Mixtures
Neural Information Processing Systems (NeurIPS), 2023
Zhong-Qiu Wang
Shinji Watanabe
298
19
0
31 May 2023
An Experimental Review of Speaker Diarization methods with application to Two-Speaker Conversational Telephone Speech recordings
Computer Speech and Language (CSL), 2023
L. Serafini
Samuele Cornell
Giovanni Morrone
Enrico Zovato
Alessio Brutti
S. Squartini
274
17
0
29 May 2023
A Neural State-Space Model Approach to Efficient Speech Separation
Chen Chen
Chao-Han Huck Yang
Kai Li
Yuchen Hu
Pin-Jui Ku
Chng Eng Siong
170
12
0
26 May 2023
Towards Solving Cocktail-Party: The First Method to Build a Realistic Dataset with Ground Truths for Speech Separation
Rawad Melhem
Assef Jafar
Oumayma Al Dakkak
234
1
0
25 May 2023
Noise-Aware Speech Separation with Contrastive Learning
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Zizheng Zhang
Cheng Chen
Hsin-Hung Chen
Xiang Liu
Yuchen Hu
Eng Siong Chng
277
8
0
18 May 2023