Deep clustering: Discriminative embeddings for segmentation and separation

18 August 2015
J. Hershey
Zhuo Chen
Jonathan Le Roux
Shinji Watanabe
arXiv: 1508.04306

Papers citing "Deep clustering: Discriminative embeddings for segmentation and separation"

50 / 357 papers shown
SDR -- Medium Rare with Fast Computations
Robin Scheibler
66
17
0
13 Oct 2021
Multi-channel Narrow-band Deep Speech Separation with Full-band Permutation Invariant Training
Changsheng Quan
Xiaofei Li
63
16
0
12 Oct 2021
Fetal Gender Identification using Machine and Deep Learning Algorithms on Phonocardiogram Signals
Reza Khanmohammadi
Mitra Sadat Mirshafiee
M. Ghassemi
Tuka Alhanai
75
5
0
10 Oct 2021
Stepwise-Refining Speech Separation Network via Fine-Grained Encoding in High-order Latent Domain
Zengwei Yao
Wenjie Pei
Fanglin Chen
Guangming Lu
David C. Zhang
74
12
0
10 Oct 2021
An Exploration of Self-Supervised Pretrained Representations for End-to-End Speech Recognition
Xuankai Chang
Takashi Maekaku
Pengcheng Guo
Jing Shi
Yen-Ju Lu
...
Tianzi Wang
Shu-Wen Yang
Yu Tsao
Hung-yi Lee
Shinji Watanabe
SSL, AI4TS
78
81
0
09 Oct 2021
Location-based training for multi-channel talker-independent speaker separation
H. Taherian
Ke Tan
DeLiang Wang
57
10
0
08 Oct 2021
Git: Clustering Based on Graph of Intensity Topology
Zhangyang Gao
Haitao Lin
Cheng Tan
Lirong Wu
Stan Z. Li
65
7
0
04 Oct 2021
FastMVAE2: On improving and accelerating the fast variational autoencoder-based source separation algorithm for determined mixtures
Li Li
Hirokazu Kameoka
S. Makino
DRL
77
8
0
28 Sep 2021
Visual Scene Graphs for Audio Source Separation
Moitreya Chatterjee
Jonathan Le Roux
Narendra Ahuja
A. Cherian
105
37
0
24 Sep 2021
Improving Deep Metric Learning by Divide and Conquer
A. Sanakoyeu
Pingchuan Ma
Vadim Tschernezki
Bjorn Ommer
95
14
0
09 Sep 2021
Convolutive Prediction for Monaural Speech Dereverberation and Noisy-Reverberant Speaker Separation
Zhong-Qiu Wang
Gordon Wichern
Jonathan Le Roux
97
33
0
16 Aug 2021
Convolutive Prediction for Reverberant Speech Separation
Zhong-Qiu Wang
Gordon Wichern
Jonathan Le Roux
87
12
0
16 Aug 2021
On The Compensation Between Magnitude and Phase in Speech Separation
Zhong-Qiu Wang
Gordon Wichern
Jonathan Le Roux
79
74
0
11 Aug 2021
A Unified Model for Zero-shot Music Source Separation, Transcription and Synthesis
Liwei Lin
Qiuqiang Kong
Junyan Jiang
Gus Xia
64
26
0
07 Aug 2021
The Right to Talk: An Audio-Visual Transformer Approach
Thanh-Dat Truong
C. Duong
T. D. Vu
H. Pham
Bhiksha Raj
Ngan Le
Khoa Luu
120
36
0
06 Aug 2021
Graph-PIT: Generalized permutation invariant training for continuous separation of arbitrary numbers of speakers
Thilo von Neumann
K. Kinoshita
Christoph Boeddeker
Marc Delcroix
Reinhold Haeb-Umbach
65
23
0
30 Jul 2021
Speeding Up Permutation Invariant Training for Source Separation
Thilo von Neumann
Christoph Boeddeker
K. Kinoshita
Marc Delcroix
Reinhold Haeb-Umbach
58
6
0
30 Jul 2021
Don't Separate, Learn to Remix: End-to-End Neural Remixing with Joint Optimization
Haici Yang
Shivani Firodiya
Nicholas J. Bryan
Minje Kim
69
7
0
28 Jul 2021
Multi-channel Speech Enhancement with 2-D Convolutional Time-frequency Domain Features and a Pre-trained Acoustic Model
Quandong Wang
Junnan Wu
Zhao Yan
Sichong Qian
Liyong Guo
Lichun Fan
Weiji Zhuang
Peng Gao
Yujun Wang
66
0
0
23 Jul 2021
Improving Reverberant Speech Separation with Multi-stage Training and Curriculum Learning
R. Aralikatti
Anton Ratnarajah
Zhenyu Tang
Tianyi Zhou
33
2
0
19 Jul 2021
Localization Based Sequential Grouping for Continuous Speech Separation
Zhong-Qiu Wang
DeLiang Wang
80
12
0
14 Jul 2021
Multi-Task Audio Source Separation
Lu Zhang
Chenxing Li
Feng Deng
Xiaorui Wang
67
9
0
14 Jul 2021
Unified Autoregressive Modeling for Joint End-to-End Multi-Talker Overlapped Speech Recognition and Speaker Attribute Estimation
Ryo Masumura
Daiki Okamura
Naoki Makishima
Mana Ihori
Akihiko Takashima
Tomohiro Tanaka
Shota Orihashi
55
7
0
04 Jul 2021
Towards Neural Diarization for Unlimited Numbers of Speakers Using Global and Local Attractors
Shota Horiguchi
Shinji Watanabe
Leibny Paola García-Perera
Yawen Xue
Yuki Takashima
Yohei Kawaguchi
79
38
0
04 Jul 2021
A Novel Self-Learning Framework for Bladder Cancer Grading Using Histopathological Images
Gabriel García
A. Esteve
Adrián Colomer
David Ramos
Valery Naranjo
43
12
0
25 Jun 2021
Deep neural network Based Low-latency Speech Separation with Asymmetric analysis-Synthesis Window Pair
Shanshan Wang
Gaurav Naithani
Archontis Politis
Tuomas Virtanen
61
10
0
22 Jun 2021
Encoder-Decoder Based Attractors for End-to-End Neural Diarization
Shota Horiguchi
Yusuke Fujita
Shinji Watanabe
Yawen Xue
Leibny Paola García-Perera
74
68
0
20 Jun 2021
Multi-Speaker ASR Combining Non-Autoregressive Conformer CTC and Conditional Speaker Chain
Pengcheng Guo
Xuankai Chang
Shinji Watanabe
Lei Xie
48
19
0
16 Jun 2021
Teacher-Student MixIT for Unsupervised and Semi-supervised Speech Separation
Jisi Zhang
Catalin Zorila
R. Doddipatla
Jon Barker
54
22
0
15 Jun 2021
Learning Audio-Visual Dereverberation
Changan Chen
Wei-Ju Sun
David Harwath
Kristen Grauman
80
32
0
14 Jun 2021
WASE: Learning When to Attend for Speaker Extraction in Cocktail Party Environments
Yunzhe Hao
Jiaming Xu
Peng Zhang
Bo Xu
32
17
0
13 Jun 2021
Semi-Supervised Training with Pseudo-Labeling for End-to-End Neural Diarization
Yuki Takashima
Yusuke Fujita
Shota Horiguchi
Shinji Watanabe
Paola García
Kenji Nagamatsu
82
15
0
09 Jun 2021
SpeechBrain: A General-Purpose Speech Toolkit
Mirco Ravanelli
Titouan Parcollet
Peter William VanHarn Plantinga
Aku Rouhe
Samuele Cornell
...
William Aris
Hwidong Na
Yan Gao
R. Mori
Yoshua Bengio
111
769
0
08 Jun 2021
How to Design a Three-Stage Architecture for Audio-Visual Active Speaker Detection in the Wild
Okan Kopuklu
Maja Taseska
Gerhard Rigoll
3DV
79
46
0
07 Jun 2021
Lightweight Dual-channel Target Speaker Separation for Mobile Voice Communication
Yuanyuan Bao
Yanze Xu
Na Xu
Wenjing Yang
Hongfeng Li
Shicong Li
Y. Jia
Fei Xiang
Jincheng He
Ming Li
87
1
0
05 Jun 2021
Manifold-Aware Deep Clustering: Maximizing Angles between Embedding Vectors Based on Regular Simplex
Keitaro Tanaka
Ryosuke Sawata
Shusuke Takahashi
36
0
0
04 Jun 2021
Should We Always Separate?: Switching Between Enhanced and Observed Signals for Overlapping Speech Recognition
Hiroshi Sato
Tsubasa Ochiai
Marc Delcroix
K. Kinoshita
Takafumi Moriya
Naoyuki Kamo
68
23
0
02 Jun 2021
Sparse, Efficient, and Semantic Mixture Invariant Training: Taming In-the-Wild Unsupervised Sound Separation
Scott Wisdom
A. Jansen
Ron J. Weiss
Hakan Erdogan
J. Hershey
77
27
0
01 Jun 2021
Multi-Scale Temporal Convolution Network for Classroom Voice Detection
Lu Ma
Xintian Wang
Song Yang
Y. Gong
Zhongqin Wu
28
1
0
31 May 2021
Many-Speakers Single Channel Speech Separation with Optimal Permutation Training
Shaked Dovrat
Eliya Nachmani
Lior Wolf
VLM
94
22
0
18 Apr 2021
Visually Guided Sound Source Separation and Localization using Self-Supervised Motion Representations
Lingyu Zhu
Esa Rahtu
81
27
0
17 Apr 2021
MIMO Self-attentive RNN Beamformer for Multi-speaker Speech Separation
Xiyun Li
Yong-mei Xu
Meng Yu
Shi-Xiong Zhang
Jiaming Xu
Bo Xu
Dong Yu
52
14
0
17 Apr 2021
Cyclic Co-Learning of Sounding Object Visual Grounding and Sound Separation
Yapeng Tian
Di Hu
Chenliang Xu
ObjD
85
88
0
05 Apr 2021
Efficient Personalized Speech Enhancement through Self-Supervised Learning
Aswin Sivaraman
Minje Kim
67
20
0
05 Apr 2021
Target Speaker Verification with Selective Auditory Attention for Single and Multi-talker Speech
Chenglin Xu
Wei Rao
Jibin Wu
Haizhou Li
66
32
0
30 Mar 2021
On TasNet for Low-Latency Single-Speaker Speech Enhancement
Morten Kolbæk
Zheng-Hua Tan
S. H. Jensen
Jesper Jensen
81
2
0
27 Mar 2021
Looking into Your Speech: Learning Cross-modal Affinity for Audio-visual Speech Separation
Jiyoung Lee
Soo-Whan Chung
Sunok Kim
Hong-Goo Kang
Kwanghoon Sohn
59
51
0
25 Mar 2021
Blind Speech Separation and Dereverberation using Neural Beamforming
Lukas Pfeifenberger
Franz Pernkopf
36
5
0
24 Mar 2021
Audio-Visual Speech Separation Using Cross-Modal Correspondence Loss
Naoki Makishima
Mana Ihori
Akihiko Takashima
Tomohiro Tanaka
Shota Orihashi
Ryo Masumura
54
8
0
02 Mar 2021
Sandglasset: A Light Multi-Granularity Self-attentive Network For Time-Domain Speech Separation
Max W. Y. Lam
Jun Wang
Dan Su
Dong Yu
AI4TS
121
49
0
01 Mar 2021