VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking

11 October 2018
Quan Wang
Hannah Muckenhirn
K. Wilson
Prashant Sridhar
Zelin Wu
J. Hershey
Rif A. Saurous
Ron J. Weiss
Ye Jia
Ignacio López Moreno

Papers citing "VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking"

50 / 193 papers shown
Separate What You Describe: Language-Queried Audio Source Separation
Xubo Liu
Haohe Liu
Qiuqiang Kong
Xinhao Mei
Jinzheng Zhao
Qiushi Huang
Mark D. Plumbley
Wenwu Wang
100
70
0
28 Mar 2022
Locate This, Not That: Class-Conditioned Sound Event DOA Estimation
Olga Slizovskaia
Gordon Wichern
Zhong-Qiu Wang
Jonathan Le Roux
59
4
0
08 Mar 2022
Single microphone speaker extraction using unified time-frequency Siamese-Unet
Aviad Eisenberg
Sharon Gannot
Shlomo E. Chazan
49
3
0
06 Mar 2022
Closing the Gap between Single-User and Multi-User VoiceFilter-Lite
R. Rikhye
Quan Wang
Qiao Liang
Yanzhang He
Ian McGraw
VLM
63
7
0
24 Feb 2022
SpeechPainter: Text-conditioned Speech Inpainting
Zalan Borsos
Matthew Sharifi
Marco Tagliasacchi
93
28
0
15 Feb 2022
New Insights on Target Speaker Extraction
Mohamed Elminshawi
Wolfgang Mack
Srikanth Raj Chetupalli
Soumitro Chakrabarty
Emanuel Habets
58
18
0
01 Feb 2022
SkiM: Skipping Memory LSTM for Low-Latency Real-Time Continuous Speech Separation
Chenda Li
Lei Yang
Weiqin Wang
Y. Qian
83
27
0
26 Jan 2022
Detect what you want: Target Sound Detection
Dongchao Yang
Helin Wang
Yuexian Zou
Fan Cui
Chao Weng
95
7
0
19 Dec 2021
Directed Speech Separation for Automatic Speech Recognition of Long Form Conversational Speech
Rohit Paturi
S. Srinivasan
Katrin Kirchhoff
Daniel Garcia-Romero
67
9
0
10 Dec 2021
Learning-based personal speech enhancement for teleconferencing by exploiting spatial-spectral features
Yicheng Hsu
Yonghan Lee
M. Bai
52
10
0
10 Dec 2021
Multi-Channel Multi-Speaker ASR Using 3D Spatial Feature
Yiwen Shao
Shi-Xiong Zhang
Dong Yu
88
15
0
22 Nov 2021
A Conformer-based ASR Frontend for Joint Acoustic Echo Cancellation, Speech Enhancement and Speech Separation
Tom O'Malley
A. Narayanan
Quan Wang
Alex Park
James Walker
N. Howard
59
28
0
18 Nov 2021
LiMuSE: Lightweight Multi-modal Speaker Extraction
Qinghua Liu
Yating Huang
Yunzhe Hao
Jiaming Xu
Bo Xu
67
6
0
07 Nov 2021
Target Speech Extraction: Independent Vector Extraction Guided by Supervised Speaker Identification
J. Málek
Jakub Janský
Zbyněk Koldovský
Tomás Kounovský
Jaroslav Cmejla
J. Zdánský
50
10
0
05 Nov 2021
Cross-attention conformer for context modeling in speech enhancement for ASR
A. Narayanan
Chung-Cheng Chiu
Tom O'Malley
Quan Wang
Yanzhang He
63
14
0
30 Oct 2021
One model to enhance them all: array geometry agnostic multi-channel personalized speech enhancement
H. Taherian
Sefik Emre Eskimez
Takuya Yoshioka
Huaming Wang
Zhuo Chen
Xuedong Huang
56
21
0
20 Oct 2021
Personalized Speech Enhancement: New Models and Comprehensive Evaluation
Sefik Emre Eskimez
Takuya Yoshioka
Huaming Wang
Xiaofei Wang
Zhuo Chen
Xuedong Huang
87
62
0
18 Oct 2021
Similarity-and-Independence-Aware Beamformer with Iterative Casting and Boost Start for Target Source Extraction Using Reference
Atsuo Hiroe
37
4
0
18 Oct 2021
Controllable Multichannel Speech Dereverberation based on Deep Neural Networks
Ziteng Wang
Yueyue Na
Biao Tian
Q. Fu
58
0
0
16 Oct 2021
USEV: Universal Speaker Extraction with Visual Cue
Zexu Pan
Meng Ge
Haizhou Li
70
44
0
30 Sep 2021
BeamTransformer: Microphone Array-based Overlapping Speech Detection
Siqi Zheng
Shiliang Zhang
Weilong Huang
Qian Chen
Hongbin Suo
Ming Lei
Jinwei Feng
Zhijie Yan
75
8
0
09 Sep 2021
Multi-user VoiceFilter-Lite via Attentive Speaker Embedding
R. Rikhye
Quan Wang
Qiao Liang
Yanzhang He
Ian McGraw
62
8
0
02 Jul 2021
Improving On-Screen Sound Separation for Open-Domain Videos with Audio-Visual Self-Attention
Efthymios Tzinis
Scott Wisdom
Tal Remez
J. Hershey
VLM
81
8
0
17 Jun 2021
Few-shot learning of new sound classes for target sound extraction
Marc Delcroix
Jorge Bennasar Vázquez
Tsubasa Ochiai
K. Kinoshita
S. Araki
VLM
58
11
0
14 Jun 2021
WASE: Learning When to Attend for Speaker Extraction in Cocktail Party Environments
Yunzhe Hao
Jiaming Xu
Peng Zhang
Bo Xu
32
17
0
13 Jun 2021
How to Design a Three-Stage Architecture for Audio-Visual Active Speaker Detection in the Wild
Okan Kopuklu
Maja Taseska
Gerhard Rigoll
3DV
90
46
0
07 Jun 2021
Lightweight Dual-channel Target Speaker Separation for Mobile Voice Communication
Yuanyuan Bao
Yanze Xu
Na Xu
Wenjing Yang
Hongfeng Li
Shicong Li
Y. Jia
Fei Xiang
Jincheng He
Ming Li
87
1
0
05 Jun 2021
EchoFilter: End-to-End Neural Network for Acoustic Echo Cancellation
Lu Ma
Song Yang
Y. Gong
Xintian Wang
Zhongqin Wu
36
12
0
31 May 2021
Zero-Shot Personalized Speech Enhancement through Speaker-Informed Model Selection
Aswin Sivaraman
Minje Kim
62
9
0
08 May 2021
AvaTr: One-Shot Speaker Extraction with Transformers
S. Hu
Md Rifat Arefin
V. Nguyen
Alish Dipani
Xaq Pitkow
A. Tolias
64
4
0
03 May 2021
Personalized Keyphrase Detection using Speaker and Environment Information
R. Rikhye
Quan Wang
Qiao Liang
Yanzhang He
Ding Zhao
Yiteng Huang
A. Narayanan
Ian McGraw
52
11
0
28 Apr 2021
Complex Neural Spatial Filter: Enhancing Multi-channel Target Speech Separation in Complex Domain
Rongzhi Gu
Shi-Xiong Zhang
Yuexian Zou
Dong Yu
70
34
0
26 Apr 2021
Personalized Speech Enhancement through Self-Supervised Data Augmentation and Purification
Aswin Sivaraman
Sunwoo Kim
Minje Kim
100
23
0
05 Apr 2021
Efficient Personalized Speech Enhancement through Self-Supervised Learning
Aswin Sivaraman
Minje Kim
67
20
0
05 Apr 2021
Target Speaker Verification with Selective Auditory Attention for Single and Multi-talker Speech
Chenglin Xu
Wei Rao
Jibin Wu
Haizhou Li
68
32
0
30 Mar 2021
Looking into Your Speech: Learning Cross-modal Affinity for Audio-visual Speech Separation
Jiyoung Lee
Soo-Whan Chung
Sunok Kim
Hong-Goo Kang
Kwanghoon Sohn
59
51
0
25 Mar 2021
A Neural Text-to-Speech Model Utilizing Broadcast Data Mixed with Background Music
Hanbin Bae
Jaesung Bae
Young-Sun Joo
Young-Ik Kim
Hoon-Young Cho
21
2
0
04 Mar 2021
Tune-In: Training Under Negative Environments with Interference for Attention Networks Simulating Cocktail Party Effect
Jun Wang
Max W. Y. Lam
Dan Su
Dong Yu
53
6
0
02 Mar 2021
Speech Enhancement Using Multi-Stage Self-Attentive Temporal Convolutional Networks
Ju Lin
A. Wijngaarden
Kuang-Ching Wang
M. C. Smith
78
51
0
24 Feb 2021
Dual-Path Modeling for Long Recording Speech Separation in Meetings
Chenda Li
Zhuo Chen
Yi Luo
Cong Han
Tianyan Zhou
K. Kinoshita
Marc Delcroix
Shinji Watanabe
Y. Qian
41
10
0
23 Feb 2021
Speaker and Direction Inferred Dual-channel Speech Separation
Chenxing Li
Jiaming Xu
N. Mesgarani
Bo Xu
34
8
0
08 Feb 2021
Time-Domain Speech Extraction with Spatial Information and Multi Speaker Conditioning Mechanism
Jisi Zhang
Catalin Zorila
R. Doddipatla
Jon Barker
44
13
0
07 Feb 2021
MAAS: Multi-modal Assignation for Active Speaker Detection
Juan Carlos León Alcázar
Fabian Caba Heilbron
Ali K. Thabet
Guohao Li
128
52
0
11 Jan 2021
Continuous Speech Separation Using Speaker Inventory for Long Multi-talker Recording
Cong Han
Yi Luo
Chenda Li
Tianyan Zhou
K. Kinoshita
...
Marc Delcroix
Hakan Erdogan
J. Hershey
N. Mesgarani
Zhuo Chen
58
8
0
17 Dec 2020
Self-supervised Text-independent Speaker Verification using Prototypical Momentum Contrastive Learning
Wei Xia
Chunlei Zhang
Chao Weng
Meng Yu
Dong Yu
SSL
64
80
0
13 Dec 2020
Deep Ad-hoc Beamforming Based on Speaker Extraction for Target-Dependent Speech Separation
Ziye Yang
Shanzheng Guan
Xiao-Lei Zhang
32
14
0
01 Dec 2020
Improving RNN Transducer With Target Speaker Extraction and Neural Uncertainty Estimation
Jiatong Shi
Chunlei Zhang
Chao Weng
Shinji Watanabe
Meng Yu
Dong Yu
61
12
0
26 Nov 2020
Multi-stage Speaker Extraction with Utterance and Frame-Level Reference Signals
Meng Ge
Chenglin Xu
Longbiao Wang
Chng Eng Siong
Jianwu Dang
Haizhou Li
36
44
0
19 Nov 2020
Rethinking the Separation Layers in Speech Separation Networks
Yi Luo
Zhuo Chen
Cong Han
Chenda Li
Tianyan Zhou
N. Mesgarani
36
10
0
17 Nov 2020
Informed Source Extraction With Application to Acoustic Echo Reduction
Mohamed Elminshawi
Wolfgang Mack
Emanuel Habets
50
2
0
09 Nov 2020