Two-Step Sound Source Separation: Training on Learned Latent Targets

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019
22 October 2019
Efthymios Tzinis
Shrikant Venkataramani
Zhepei Wang
Y. C. Sübakan
Paris Smaragdis

Papers citing "Two-Step Sound Source Separation: Training on Learned Latent Targets"

35 / 35 papers shown
Learning Linearity in Audio Consistency Autoencoders via Implicit Regularization
Bernardo Torres
Manuel Moussallam
Gabriel Meseguer-Brocal
216
0
0
27 Oct 2025
Neural Speech Separation with Parallel Amplitude and Phase Spectrum Estimation
Fei Liu
Yang Ai
Zhen-Hua Ling
113
0
0
17 Sep 2025
Advances in Speech Separation: Techniques, Challenges, and Future Trends
Kai Li
Guo Chen
Wendi Sang
Yi Luo
Zhuo Chen
...
Shulin He
Zhong-Qiu Wang
Andong Li
Z. Wu
Xiaolin Hu
AI4TS
119
4
0
14 Aug 2025
A Reference-free Metric for Language-Queried Audio Source Separation using Contrastive Language-Audio Pretraining
Feiyang Xiao
Jian Guan
Qiaoxi Zhu
Xubo Liu
Wenbo Wang
Shuhan Qi
Kejia Zhang
Jianyuan Sun
Wenwu Wang
322
11
0
06 Jul 2024
Papez: Resource-Efficient Speech Separation with Auditory Working Memory
Hyunseok Oh
Juheon Yi
Youngki Lee
188
4
0
01 Jul 2024
SoundCount: Sound Counting from Raw Audio with Dyadic Decomposition Neural Network
Yuhang He
Zhuangzhuang Dai
Long Chen
Niki Trigoni
Andrew Markham
193
2
0
26 Dec 2023
Speech Separation based on Contrastive Learning and Deep Modularization
Peter Ochieng
SSL
270
0
0
18 May 2023
Learning Semantic-Agnostic and Spatial-Aware Representation for Generalizable Visual-Audio Navigation
IEEE Robotics and Automation Letters (RA-L), 2023
Hongchen Wang
Yuxuan Wang
Fangwei Zhong
Min-Yu Wu
Jianwei Zhang
Yizhou Wang
Hao Dong
388
10
0
21 Apr 2023
Scaling strategies for on-device low-complexity source separation with Conv-Tasnet
Mohamed Nabih Ali
Francesco Paissan
Daniele Falavigna
Alessio Brutti
149
2
0
06 Mar 2023
MossFormer: Pushing the Performance Limit of Monaural Speech Separation using Gated Single-Head Transformer with Convolution-Augmented Joint Self-Attentions
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Shengkui Zhao
Bin Ma
213
74
0
23 Feb 2023
Deep neural network techniques for monaural speech enhancement: state of the art analysis
Artificial Intelligence Review (Artif Intell Rev), 2022
P. Ochieng
269
35
0
01 Dec 2022
Latent Iterative Refinement for Modular Source Separation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Dimitrios Bralios
Efthymios Tzinis
Gordon Wichern
Paris Smaragdis
Jonathan Le Roux
BDL
169
11
0
22 Nov 2022
Speech Enhancement with Fullband-Subband Cross-Attention Network
Interspeech (Interspeech), 2022
Jun Chen
Wei Rao
Zehao Wang
Zhiyong Wu
Yannan Wang
Tao Yu
Shidong Shang
Helen M. Meng
112
19
0
10 Nov 2022
AudioScopeV2: Audio-Visual Attention Architectures for Calibrated Open-Domain On-Screen Sound Separation
European Conference on Computer Vision (ECCV), 2022
Efthymios Tzinis
Scott Wisdom
Tal Remez
J. Hershey
297
33
0
20 Jul 2022
SATTS: Speaker Attractor Text to Speech, Learning to Speak by Learning to Separate
Interspeech (Interspeech), 2022
Nabarun Goswami
Tatsuya Harada
162
5
0
13 Jul 2022
Tiny-Sepformer: A Tiny Time-Domain Transformer Network for Speech Separation
Interspeech (Interspeech), 2022
Jian Luo
Jianzong Wang
Ning Cheng
Edward Xiao
Xulong Zhang
Jing Xiao
ViT
152
13
0
28 Jun 2022
Resource-Efficient Separation Transformer
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Luca Della Libera
Cem Subakan
Mirco Ravanelli
Samuele Cornell
Frédéric Lepoutre
François Grondin
VLM
178
25
0
19 Jun 2022
FullSubNet+: Channel Attention FullSubNet with Complex Spectrograms for Speech Enhancement
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Jun Chen
Zehao Wang
Deyi Tuo
Zhiyong Wu
Shiyin Kang
Helen Meng
206
137
0
23 Mar 2022
RemixIT: Continual self-training of speech enhancement models via bootstrapped remixing
IEEE Journal on Selected Topics in Signal Processing (IEEE JSTSP), 2022
Efthymios Tzinis
Yossi Adi
V. Ithapu
Buye Xu
Paris Smaragdis
Anurag Kumar
CLL
225
65
0
17 Feb 2022
Exploring Self-Attention Mechanisms for Speech Separation
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Cem Subakan
Mirco Ravanelli
Samuele Cornell
François Grondin
Mirko Bronzi
233
37
0
06 Feb 2022
Speech Separation Using an Asynchronous Fully Recurrent Convolutional Neural Network
Neural Information Processing Systems (NeurIPS), 2021
Xiaolin Hu
Kai Li
Weiyi Zhang
Yi Luo
Jean-Marie Lemercier
Timo Gerkmann
156
61
0
04 Dec 2021
REAL-M: Towards Speech Separation on Real Mixtures
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Cem Subakan
Mirco Ravanelli
Samuele Cornell
François Grondin
161
24
0
20 Oct 2021
Stepwise-Refining Speech Separation Network via Fine-Grained Encoding in High-order Latent Domain
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2021
Zengwei Yao
Wenjie Pei
Fanglin Chen
Guangming Lu
David C. Zhang
192
14
0
10 Oct 2021
Multi-channel Speech Enhancement with 2-D Convolutional Time-frequency Domain Features and a Pre-trained Acoustic Model
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2021
Quandong Wang
Junnan Wu
Zhao Yan
Sichong Qian
Liyong Guo
Lichun Fan
Weiji Zhuang
Peng Gao
Yujun Wang
234
0
0
23 Jul 2021
Improving On-Screen Sound Separation for Open-Domain Videos with Audio-Visual Self-Attention
Efthymios Tzinis
Scott Wisdom
Tal Remez
J. Hershey
VLM
246
8
0
17 Jun 2021
Teacher-Student MixIT for Unsupervised and Semi-supervised Speech Separation
Interspeech (Interspeech), 2021
Jisi Zhang
Catalin Zorila
R. Doddipatla
Jon Barker
140
25
0
15 Jun 2021
Compute and memory efficient universal sound source separation
Journal of Signal Processing Systems (JSPS), 2021
Efthymios Tzinis
Zhepei Wang
Xilin Jiang
Paris Smaragdis
194
45
0
03 Mar 2021
What's All the FUSS About Free Universal Sound Separation Data?
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
Scott Wisdom
Hakan Erdogan
D. Ellis
Romain Serizel
Nicolas Turpault
Eduardo Fonseca
Justin Salamon
Prem Seetharaman
J. Hershey
246
88
0
02 Nov 2020
Unified Gradient Reweighting for Model Biasing with Applications to Source Separation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
Efthymios Tzinis
Dimitrios Bralios
Paris Smaragdis
296
1
0
25 Oct 2020
Attention is All You Need in Speech Separation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
Cem Subakan
Mirco Ravanelli
Samuele Cornell
Mirko Bronzi
Jianyuan Zhong
289
699
0
25 Oct 2020
Sudo rm -rf: Efficient Networks for Universal Audio Source Separation
International Workshop on Machine Learning for Signal Processing (MLSP), 2020
Efthymios Tzinis
Zhepei Wang
Paris Smaragdis
230
150
0
14 Jul 2020
Revisiting Representation Learning for Singing Voice Separation with Sinkhorn Distances
S. I. Mimilakis
Konstantinos Drossos
G. Schuller
152
2
0
06 Jul 2020
Depthwise Separable Convolutions Versus Recurrent Neural Networks for Monaural Singing Voice Separation
Pyry Pyykkönen
S. I. Mimilakis
Konstantinos Drossos
Maria Sandsten
140
4
0
06 Jul 2020
Asteroid: the PyTorch-based audio source separation toolkit for researchers
Manuel Pariente
Samuele Cornell
Joris Cosentino
S. Sivasankaran
Efthymios Tzinis
...
Juan M. Martín-Donas
David Ditter
Ariel Frank
Antoine Deleforge
Emmanuel Vincent
250
170
0
08 May 2020
Unsupervised Interpretable Representation Learning for Singing Voice Separation
European Signal Processing Conference (EUSIPCO), 2020
S. I. Mimilakis
Konstantinos Drossos
G. Schuller
252
8
0
03 Mar 2020