Deep clustering: Discriminative embeddings for segmentation and separation

18 August 2015

Papers citing "Deep clustering: Discriminative embeddings for segmentation and separation"

50 / 357 papers shown

SDR -- Medium Rare with Fast Computations Robin Scheibler 66 17 0 13 Oct 2021
Multi-channel Narrow-band Deep Speech Separation with Full-band Permutation Invariant Training Changsheng Quan Xiaofei Li 63 16 0 12 Oct 2021
Fetal Gender Identification using Machine and Deep Learning Algorithms on Phonocardiogram Signals Reza Khanmohammadi Mitra Sadat Mirshafiee M. Ghassemi Tuka Alhanai 75 5 0 10 Oct 2021
Stepwise-Refining Speech Separation Network via Fine-Grained Encoding in High-order Latent Domain Zengwei Yao Wenjie Pei Fanglin Chen Guangming Lu David C. Zhang 74 12 0 10 Oct 2021
An Exploration of Self-Supervised Pretrained Representations for End-to-End Speech Recognition Xuankai Chang Takashi Maekaku Pengcheng Guo Jing Shi Yen-Ju Lu ... Tianzi Wang Shu-Wen Yang Yu Tsao Hung-yi Lee Shinji Watanabe SSL AI4TS 78 81 0 09 Oct 2021
Location-based training for multi-channel talker-independent speaker separation H. Taherian Ke Tan DeLiang Wang 57 10 0 08 Oct 2021
Git: Clustering Based on Graph of Intensity Topology Zhangyang Gao Haitao Lin Cheng Tan Lirong Wu Stan Z. Li 65 7 0 04 Oct 2021
FastMVAE2: On improving and accelerating the fast variational autoencoder-based source separation algorithm for determined mixtures Li Li Hirokazu Kameoka S. Makino DRL 77 8 0 28 Sep 2021
Visual Scene Graphs for Audio Source Separation Moitreya Chatterjee Jonathan Le Roux Narendra Ahuja A. Cherian 105 37 0 24 Sep 2021
Improving Deep Metric Learning by Divide and Conquer A. Sanakoyeu Pingchuan Ma Vadim Tschernezki Bjorn Ommer 95 14 0 09 Sep 2021
Convolutive Prediction for Monaural Speech Dereverberation and Noisy-Reverberant Speaker Separation Zhong-Qiu Wang Gordon Wichern Jonathan Le Roux 97 33 0 16 Aug 2021
Convolutive Prediction for Reverberant Speech Separation Zhong-Qiu Wang Gordon Wichern Jonathan Le Roux 87 12 0 16 Aug 2021
On The Compensation Between Magnitude and Phase in Speech Separation Zhong-Qiu Wang Gordon Wichern Jonathan Le Roux 79 74 0 11 Aug 2021
A Unified Model for Zero-shot Music Source Separation, Transcription and Synthesis Liwei Lin Qiuqiang Kong Junyan Jiang Gus Xia 64 26 0 07 Aug 2021
The Right to Talk: An Audio-Visual Transformer Approach Thanh-Dat Truong C. Duong T. D. Vu H. Pham Bhiksha Raj Ngan Le Khoa Luu 120 36 0 06 Aug 2021
Graph-PIT: Generalized permutation invariant training for continuous separation of arbitrary numbers of speakers Thilo von Neumann K. Kinoshita Christoph Boeddeker Marc Delcroix Reinhold Haeb-Umbach 65 23 0 30 Jul 2021
Speeding Up Permutation Invariant Training for Source Separation Thilo von Neumann Christoph Boeddeker K. Kinoshita Marc Delcroix Reinhold Haeb-Umbach 58 6 0 30 Jul 2021
Don't Separate, Learn to Remix: End-to-End Neural Remixing with Joint Optimization Haici Yang Shivani Firodiya Nicholas J. Bryan Minje Kim 69 7 0 28 Jul 2021
Multi-channel Speech Enhancement with 2-D Convolutional Time-frequency Domain Features and a Pre-trained Acoustic Model Quandong Wang Junnan Wu Zhao Yan Sichong Qian Liyong Guo Lichun Fan Weiji Zhuang Peng Gao Yujun Wang 66 0 0 23 Jul 2021
Improving Reverberant Speech Separation with Multi-stage Training and Curriculum Learning R. Aralikatti Anton Ratnarajah Zhenyu Tang Tianyi Zhou 33 2 0 19 Jul 2021
Localization Based Sequential Grouping for Continuous Speech Separation Zhong-Qiu Wang DeLiang Wang 80 12 0 14 Jul 2021
Multi-Task Audio Source Separation Lu Zhang Chenxing Li Feng Deng Xiaorui Wang 67 9 0 14 Jul 2021
Unified Autoregressive Modeling for Joint End-to-End Multi-Talker Overlapped Speech Recognition and Speaker Attribute Estimation Ryo Masumura Daiki Okamura Naoki Makishima Mana Ihori Akihiko Takashima Tomohiro Tanaka Shota Orihashi 55 7 0 04 Jul 2021
Towards Neural Diarization for Unlimited Numbers of Speakers Using Global and Local Attractors Shota Horiguchi Shinji Watanabe Leibny Paola García-Perera Yawen Xue Yuki Takashima Yohei Kawaguchi 79 38 0 04 Jul 2021
A Novel Self-Learning Framework for Bladder Cancer Grading Using Histopathological Images Gabriel García A. Esteve Adrián Colomer David Ramos Valery Naranjo 43 12 0 25 Jun 2021
Deep neural network Based Low-latency Speech Separation with Asymmetric analysis-Synthesis Window Pair Shanshan Wang Gaurav Naithani Archontis Politis Tuomas Virtanen 61 10 0 22 Jun 2021
Encoder-Decoder Based Attractors for End-to-End Neural Diarization Shota Horiguchi Yusuke Fujita Shinji Watanabe Yawen Xue Leibny Paola García-Perera 74 68 0 20 Jun 2021
Multi-Speaker ASR Combining Non-Autoregressive Conformer CTC and Conditional Speaker Chain Pengcheng Guo Xuankai Chang Shinji Watanabe Lei Xie 48 19 0 16 Jun 2021
Teacher-Student MixIT for Unsupervised and Semi-supervised Speech Separation Jisi Zhang Catalin Zorila R. Doddipatla Jon Barker 54 22 0 15 Jun 2021
Learning Audio-Visual Dereverberation Changan Chen Wei-Ju Sun David Harwath Kristen Grauman 80 32 0 14 Jun 2021
WASE: Learning When to Attend for Speaker Extraction in Cocktail Party Environments Yunzhe Hao Jiaming Xu Peng Zhang Bo Xu 32 17 0 13 Jun 2021
Semi-Supervised Training with Pseudo-Labeling for End-to-End Neural Diarization Yuki Takashima Yusuke Fujita Shota Horiguchi Shinji Watanabe Paola García Kenji Nagamatsu 82 15 0 09 Jun 2021
SpeechBrain: A General-Purpose Speech Toolkit Mirco Ravanelli Titouan Parcollet Peter William VanHarn Plantinga Aku Rouhe Samuele Cornell ... William Aris Hwidong Na Yan Gao R. Mori Yoshua Bengio 111 769 0 08 Jun 2021
How to Design a Three-Stage Architecture for Audio-Visual Active Speaker Detection in the Wild Okan Kopuklu Maja Taseska Gerhard Rigoll 3DV 79 46 0 07 Jun 2021
Lightweight Dual-channel Target Speaker Separation for Mobile Voice Communication Yuanyuan Bao Yanze Xu Na Xu Wenjing Yang Hongfeng Li Shicong Li Y. Jia Fei Xiang Jincheng He Ming Li 87 1 0 05 Jun 2021
Manifold-Aware Deep Clustering: Maximizing Angles between Embedding Vectors Based on Regular Simplex Keitaro Tanaka Ryosuke Sawata Shusuke Takahashi 36 0 0 04 Jun 2021
Should We Always Separate?: Switching Between Enhanced and Observed Signals for Overlapping Speech Recognition Hiroshi Sato Tsubasa Ochiai Marc Delcroix K. Kinoshita Takafumi Moriya Naoyuki Kamo 68 23 0 02 Jun 2021
Sparse, Efficient, and Semantic Mixture Invariant Training: Taming In-the-Wild Unsupervised Sound Separation Scott Wisdom A. Jansen Ron J. Weiss Hakan Erdogan J. Hershey 77 27 0 01 Jun 2021
Multi-Scale Temporal Convolution Network for Classroom Voice Detection Lu Ma Xintian Wang Song Yang Y. Gong Zhongqin Wu 28 1 0 31 May 2021
Many-Speakers Single Channel Speech Separation with Optimal Permutation Training Shaked Dovrat Eliya Nachmani Lior Wolf VLM 94 22 0 18 Apr 2021
Visually Guided Sound Source Separation and Localization using Self-Supervised Motion Representations Lingyu Zhu Esa Rahtu 81 27 0 17 Apr 2021
MIMO Self-attentive RNN Beamformer for Multi-speaker Speech Separation Xiyun Li Yong-mei Xu Meng Yu Shi-Xiong Zhang Jiaming Xu Bo Xu Dong Yu 52 14 0 17 Apr 2021
Cyclic Co-Learning of Sounding Object Visual Grounding and Sound Separation Yapeng Tian Di Hu Chenliang Xu ObjD 85 88 0 05 Apr 2021
Efficient Personalized Speech Enhancement through Self-Supervised Learning Aswin Sivaraman Minje Kim 67 20 0 05 Apr 2021
Target Speaker Verification with Selective Auditory Attention for Single and Multi-talker Speech Chenglin Xu Wei Rao Jibin Wu Haizhou Li 66 32 0 30 Mar 2021
On TasNet for Low-Latency Single-Speaker Speech Enhancement Morten Kolbæk Zheng-Hua Tan S. H. Jensen Jesper Jensen 81 2 0 27 Mar 2021
Looking into Your Speech: Learning Cross-modal Affinity for Audio-visual Speech Separation Jiyoung Lee Soo-Whan Chung Sunok Kim Hong-Goo Kang Kwanghoon Sohn 59 51 0 25 Mar 2021
Blind Speech Separation and Dereverberation using Neural Beamforming Lukas Pfeifenberger Franz Pernkopf 36 5 0 24 Mar 2021
Audio-Visual Speech Separation Using Cross-Modal Correspondence Loss Naoki Makishima Mana Ihori Akihiko Takashima Tomohiro Tanaka Shota Orihashi Ryo Masumura 54 8 0 02 Mar 2021
Sandglasset: A Light Multi-Granularity Self-attentive Network For Time-Domain Speech Separation Max W. Y. Lam Jun Wang Dan Su Dong Yu AI4TS 121 49 0 01 Mar 2021