Single-Channel Multi-Speaker Separation using Deep Clustering

7 July 2016

Papers citing "Single-Channel Multi-Speaker Separation using Deep Clustering"

50 / 77 papers shown

USEF-TSE: Universal Speaker Embedding Free Target Speaker Extraction Bang Zeng Ming Li 37 2 0 04 Sep 2024
ConSep: a Noise- and Reverberation-Robust Speech Separation Framework by Magnitude Conditioning Kuan-Hsun Ho J. Hung Berlin Chen 42 0 0 04 Mar 2024
Speaker Mask Transformer for Multi-talker Overlapped Speech Recognition Peng Shen Xugang Lu Hisashi Kawai 35 1 0 18 Dec 2023
On Time Domain Conformer Models for Monaural Speech Separation in Noisy Reverberant Acoustic Environments William Ravenscroft Stefan Goetze Thomas Hain 28 7 0 09 Oct 2023
Complete and separate: Conditional separation with missing target source attribute completion Dimitrios Bralios Efthymios Tzinis Paris Smaragdis 35 0 0 27 Jul 2023
Mixture Encoder for Joint Speech Separation and Recognition Simon Berger Peter Vieting Christoph Boeddeker Ralf Schluter Reinhold Häb-Umbach 21 6 0 21 Jun 2023
SURT 2.0: Advances in Transducer-based Multi-talker Speech Recognition Desh Raj Daniel Povey Sanjeev Khudanpur VLM 34 9 0 18 Jun 2023
Multi-Channel Masking with Learnable Filterbank for Sound Source Separation Wang Dai A. Politis Tuomas Virtanen 28 0 0 14 Mar 2023
Multi-Scale Feature Fusion Transformer Network for End-to-End Single Channel Speech Separation Yinhao Xu Jian Zhou L. Tao H. Kwan 30 0 0 14 Dec 2022
Deep neural network techniques for monaural speech enhancement: state of the art analysis P. Ochieng 30 21 0 01 Dec 2022
TF-GridNet: Integrating Full- and Sub-Band Modeling for Speech Separation Zhongqiu Wang Samuele Cornell Shukjae Choi Younglo Lee Byeonghak Kim Shinji Watanabe 38 119 0 22 Nov 2022
Latent Iterative Refinement for Modular Source Separation Dimitrios Bralios Efthymios Tzinis Gordon Wichern Paris Smaragdis Jonathan Le Roux BDL 33 5 0 22 Nov 2022
Deformable Temporal Convolutional Networks for Monaural Noisy Reverberant Speech Separation William Ravenscroft Stefan Goetze Thomas Hain 33 11 0 27 Oct 2022
Heterogeneous Target Speech Separation Hyunjae Cho Wonbin Jung Junhyeok Lee Paris Smaragdis Sanghyun Woo 46 26 0 07 Apr 2022
SkiM: Skipping Memory LSTM for Low-Latency Real-Time Continuous Speech Separation Chenda Li Lei Yang Weiqin Wang Y. Qian 32 25 0 26 Jan 2022
Multi-turn RNN-T for streaming recognition of multi-party speech Ilya Sklyar A. Piunova Xianrui Zheng Yulan Liu 24 22 0 19 Dec 2021
Reduction of Subjective Listening Effort for TV Broadcast Signals with Recurrent Neural Networks Nils L. Westhausen R. Huber Hannah Baumgartner Ragini Sinha J. Rennies B. Meyer 25 10 0 02 Nov 2021
TriBERT: Full-body Human-centric Audio-visual Representation Learning for Visual Sound Separation Tanzila Rahman Mengyu Yang Leonid Sigal ViT 29 8 0 26 Oct 2021
Personalized Speech Enhancement: New Models and Comprehensive Evaluation Sefik Emre Eskimez Takuya Yoshioka Huaming Wang Xiaofei Wang Zhuo Chen Xuedong Huang 32 62 0 18 Oct 2021
Visual Scene Graphs for Audio Source Separation Moitreya Chatterjee Jonathan Le Roux Narendra Ahuja A. Cherian 26 36 0 24 Sep 2021
Deep neural network Based Low-latency Speech Separation with Asymmetric analysis-Synthesis Window Pair Shanshan Wang Gaurav Naithani A. Politis Tuomas Virtanen 40 10 0 22 Jun 2021
Lightweight Dual-channel Target Speaker Separation for Mobile Voice Communication Yuanyuan Bao Yanze Xu Na Xu Wenjing Yang Hongfeng Li Shicong Li Y. Jia Fei Xiang Jincheng He Ming Li 30 1 0 05 Jun 2021
Sparse, Efficient, and Semantic Mixture Invariant Training: Taming In-the-Wild Unsupervised Sound Separation Scott Wisdom A. Jansen Ron J. Weiss Hakan Erdogan J. Hershey 38 26 0 01 Jun 2021
Many-Speakers Single Channel Speech Separation with Optimal Permutation Training Shaked Dovrat Eliya Nachmani Lior Wolf VLM 6 21 0 18 Apr 2021
Audio-Visual Speech Separation Using Cross-Modal Correspondence Loss Naoki Makishima Mana Ihori Akihiko Takashima Tomohiro Tanaka Shota Orihashi Ryo Masumura 30 8 0 02 Mar 2021
End-to-End Dereverberation, Beamforming, and Speech Recognition with Improved Numerical Stability and Advanced Frontend Wangyou Zhang Christoph Boeddeker Shinji Watanabe Tomohiro Nakatani Marc Delcroix K. Kinoshita Tsubasa Ochiai Naoyuki Kamo Reinhold Haeb-Umbach Y. Qian 20 32 0 23 Feb 2021
Group Communication with Context Codec for Lightweight Source Separation Yi Luo Cong Han N. Mesgarani 26 20 0 14 Dec 2020
On End-to-end Multi-channel Time Domain Speech Separation in Reverberant Environments Jisi Zhang Catalin Zorila R. Doddipatla Jon Barker 11 46 0 11 Nov 2020
Spoken Language Interaction with Robots: Research Issues and Recommendations, Report from the NSF Future Directions Workshop M. Marge C. Espy-Wilson Roger K. Moore 26 78 0 11 Nov 2020
Speaker Separation Using Speaker Inventories and Estimated Speech Peidong Wang Zhuo Chen DeLiang Wang Jinyu Li Jiawei Liu 38 11 0 20 Oct 2020
Speech Separation Based on Multi-Stage Elaborated Dual-Path Deep BiLSTM with Auxiliary Identity Loss Ziqiang Shi Rujie Liu Jiqing Han 11 7 0 06 Aug 2020
Speaker-Conditional Chain Model for Speech Separation and Extraction Jing Shi Jiaming Xu Yusuke Fujita Shinji Watanabe Bo Xu BDL 43 20 0 25 Jun 2020
Unsupervised Sound Separation Using Mixture Invariant Training Scott Wisdom Efthymios Tzinis Hakan Erdogan Ron J. Weiss K. Wilson J. Hershey 16 27 0 23 Jun 2020
Efficient Integration of Multi-channel Information for Speaker-independent Speech Separation Yuichiro Koyama Oluwafemi Azeez Bhiksha Raj 27 4 0 23 May 2020
Dual-Signal Transformation LSTM Network for Real-Time Noise Suppression Nils L. Westhausen B. Meyer 25 99 0 15 May 2020
SpEx+: A Complete Time Domain Speaker Extraction Network Meng Ge Chenglin Xu Longbiao Wang Chng Eng Siong J. Dang Haizhou Li 27 144 0 10 May 2020
Asteroid: the PyTorch-based audio source separation toolkit for researchers Manuel Pariente Samuele Cornell Joris Cosentino S. Sivasankaran Efthymios Tzinis ... Juan M. Martín-Donas David Ditter Ariel Frank Antoine Deleforge Emmanuel Vincent 27 151 0 08 May 2020
Determined BSS based on time-frequency masking and its application to harmonic vector analysis Kohei Yatabe Daichi Kitamura 32 26 0 29 Apr 2020
Separating Varying Numbers of Sources with Auxiliary Autoencoding Loss Yi Luo N. Mesgarani 21 29 0 27 Mar 2020
Voice Separation with an Unknown Number of Multiple Speakers Eliya Nachmani Yossi Adi Lior Wolf 20 175 0 29 Feb 2020
Wavesplit: End-to-End Speech Separation by Speaker Clustering Neil Zeghidour David Grangier VLM 27 261 0 20 Feb 2020
Deep Audio-Visual Learning: A Survey Hao Zhu Mandi Luo Rui Wang A. Zheng Ran He 31 156 0 14 Jan 2020
Audio-visual Recognition of Overlapped speech for the LRS2 dataset Jianwei Yu Shi-Xiong Zhang Jian Wu Shahram Ghorbani Bo Wu Shiyin Kang Shansong Liu Xunying Liu Helen Meng Dong Yu 32 72 0 06 Jan 2020
CNN-LSTM models for Multi-Speaker Source Separation using Bayesian Hyper Parameter Optimization Jeroen Zegers Hugo Van hamme BDL 28 7 0 19 Dec 2019
End-to-end training of time domain audio separation and recognition Thilo von Neumann K. Kinoshita Lukas Drude Christoph Boeddeker Marc Delcroix Tomohiro Nakatani Reinhold Haeb-Umbach 25 34 0 18 Dec 2019
Mixup-breakdown: a consistency training method for improving generalization of speech separation models Max W. Y. Lam Jun Wang Dan Su Dong Yu 33 22 0 28 Oct 2019
Filterbank design for end-to-end speech separation Manuel Pariente Samuele Cornell Antoine Deleforge Emmanuel Vincent 26 69 0 23 Oct 2019
WHAMR!: Noisy and Reverberant Single-Channel Speech Separation Matthew Maciejewski Gordon Wichern E. McQuinn Jonathan Le Roux 14 179 0 22 Oct 2019
Two-Step Sound Source Separation: Training on Learned Latent Targets Efthymios Tzinis Shrikant Venkataramani Zhepei Wang Y. C. Sübakan Paris Smaragdis 19 64 0 22 Oct 2019
Analysis of Deep Clustering as Preprocessing for Automatic Speech Recognition of Sparsely Overlapping Speech T. Menne Ilya Sklyar Ralf Schluter Hermann Ney 22 35 0 09 May 2019