1810.04826
VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking
11 October 2018
Quan Wang
Hannah Muckenhirn
K. Wilson
Prashant Sridhar
Zelin Wu
J. Hershey
Rif A. Saurous
Ron J. Weiss
Ye Jia
Ignacio López Moreno
Papers citing
"VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking"
50 / 193 papers shown
Title
Target Speech Extraction with Conditional Diffusion Model
Naoyuki Kamo
Marc Delcroix
Tomohiro Nakatani
DiffM
65
22
0
08 Aug 2023
Complete and separate: Conditional separation with missing target source attribute completion
Dimitrios Bralios
Efthymios Tzinis
Paris Smaragdis
87
0
0
27 Jul 2023
Audio-Visual Speech Enhancement With Selective Off-Screen Speech Extraction
Tomoya Yoshinaga
Keitaro Tanaka
Shigeo Morishima
63
0
0
10 Jun 2023
End-to-End Joint Target and Non-Target Speakers ASR
Ryo Masumura
Naoki Makishima
Taiga Yamane
Yoshihiko Yamazaki
Saki Mizuno
...
Akihiko Takashima
Satoshi Suzuki
Takafumi Moriya
Nobukatsu Hojo
Atsushi Ando
60
5
0
04 Jun 2023
Weakly-Supervised Speech Pre-training: A Case Study on Target Speech Recognition
Wangyou Zhang
Y. Qian
89
11
0
25 May 2023
BASEN: Time-Domain Brain-Assisted Speech Enhancement Network with Convolutional Cross Attention in Multi-talker Conditions
Jie Zhang
Qingquan Xu
Qiu-shi Zhu
Zhenhua Ling
70
12
0
17 May 2023
Universal Source Separation with Weakly Labelled Data
Qiuqiang Kong
Kai Chen
Haohe Liu
Xingjian Du
Taylor Berg-Kirkpatrick
Shlomo Dubnov
Mark D. Plumbley
79
22
0
11 May 2023
Transformers in Speech Processing: A Survey
S. Latif
Aun Zaidi
Heriberto Cuayáhuitl
Fahad Shamshad
Moazzam Shoukat
Muhammad Usama
Junaid Qadir
165
48
0
21 Mar 2023
Target Sound Extraction with Variable Cross-modality Clues
Chenda Li
Yao Qian
Zhuo Chen
Dongmei Wang
Takuya Yoshioka
Shujie Liu
Y. Qian
Michael Zeng
VLM
68
14
0
15 Mar 2023
Online Binaural Speech Separation of Moving Speakers With a Wavesplit Network
Cong Han
N. Mesgarani
61
4
0
13 Mar 2023
X-SepFormer: End-to-end Speaker Extraction Network with Explicit Optimization on Speaker Confusion
Kai Liu
Z.C. Du
Xucheng Wan
Huan Zhou
98
24
0
09 Mar 2023
TS-SEP: Joint Diarization and Separation Conditioned on Estimated Speaker Embeddings
Christoph Boeddeker
Aswin Shanmugam Subramanian
Gordon Wichern
Reinhold Haeb-Umbach
Jonathan Le Roux
93
24
0
07 Mar 2023
A Framework for Unified Real-time Personalized and Non-Personalized Speech Enhancement
Zhepei Wang
Ritwik Giri
Devansh P. Shah
J. Valin
Mike Goodwin
Paris Smaragdis
67
9
0
23 Feb 2023
Neural Target Speech Extraction: An Overview
Kateřina Žmolíková
Marc Delcroix
Tsubasa Ochiai
K. Kinoshita
Jan "Honza" Černocký
Dong Yu
70
95
0
31 Jan 2023
Improving Target Speaker Extraction with Sparse LDA-transformed Speaker Embeddings
Kai Liu
Xucheng Wan
Z.C. Du
Huan Zhou
VLM
51
1
0
16 Jan 2023
Deep neural network techniques for monaural speech enhancement: state of the art analysis
P. Ochieng
117
22
0
01 Dec 2022
Array Configuration-Agnostic Personalized Speech Enhancement using Long-Short-Term Spatial Coherence
Yicheng Hsu
Yonghan Lee
M. Bai
45
3
0
16 Nov 2022
The Potential of Neural Speech Synthesis-based Data Augmentation for Personalized Speech Enhancement
Anastasia Kuznetsova
Aswin Sivaraman
Minje Kim
54
3
0
14 Nov 2022
Optimal Condition Training for Target Source Separation
Efthymios Tzinis
Gordon Wichern
Paris Smaragdis
Jonathan Le Roux
77
5
0
11 Nov 2022
Cross-Attention is all you need: Real-Time Streaming Transformers for Personalised Speech Enhancement
Shucong Zhang
Malcolm Chadwick
Alberto Gil C. P. Ramos
S. Bhattacharya
54
5
0
08 Nov 2022
Real-Time Joint Personalized Speech Enhancement and Acoustic Echo Cancellation
Sefik Emre Eskimez
Takuya Yoshioka
Alex Ju
M. Tang
Tanel Pärnamaa
Huaming Wang
63
7
0
04 Nov 2022
Spatially Selective Deep Non-linear Filters for Speaker Extraction
Kristina Tesch
Timo Gerkmann
66
17
0
04 Nov 2022
Real-Time Target Sound Extraction
Bandhav Veluri
Justin Chan
Malek Itani
Tuochao Chen
Takuya Yoshioka
Shyamnath Gollakota
112
33
0
04 Nov 2022
Adapting self-supervised models to multi-talker speech recognition using speaker embeddings
Zili Huang
Desh Raj
Leibny Paola García-Perera
Sanjeev Khudanpur
155
29
0
01 Nov 2022
ImagineNET: Target Speaker Extraction with Intermittent Visual Cue through Embedding Inpainting
Zexu Pan
Wupeng Wang
Marvin Borsdorf
Haizhou Li
85
12
0
31 Oct 2022
Hierarchical speaker representation for target speaker extraction
Shulin He
Huaiwen Zhang
Wei Rao
Kanghao Zhang
Yukai Ju
Yang-Rui Yang
Xueliang Zhang
60
7
0
28 Oct 2022
Quantitative Evidence on Overlooked Aspects of Enrollment Speaker Embeddings for Target Speaker Separation
Xiaoyu Liu
Xu Li
Joan Serrà
74
9
0
23 Oct 2022
VCSE: Time-Domain Visual-Contextual Speaker Extraction Network
Junjie Li
Meng Ge
Zexu Pan
Longbiao Wang
Jianwu Dang
55
10
0
09 Oct 2022
C3-DINO: Joint Contrastive and Non-contrastive Self-Supervised Learning for Speaker Verification
Chunlei Zhang
Dong Yu
117
20
0
15 Aug 2022
Attention and DCT based Global Context Modeling for Text-independent Speaker Recognition
Wei Xia
John H. L. Hansen
58
4
0
04 Aug 2022
AudioScopeV2: Audio-Visual Attention Architectures for Calibrated Open-Domain On-Screen Sound Separation
Efthymios Tzinis
Scott Wisdom
Tal Remez
J. Hershey
119
30
0
20 Jul 2022
Multi-channel target speech enhancement based on ERB-scaled spatial coherence features
Yicheng Hsu
Yonghan Lee
M. Bai
55
1
0
17 Jul 2022
NEC: Speaker Selective Cancellation via Neural Enhanced Ultrasound Shadowing
Hanqing Guo
Chenning Li
Lingkun Li
Zhichao Cao
Qiben Yan
Li Xiao
13
3
0
12 Jul 2022
Speaker Verification in Multi-Speaker Environments Using Temporal Feature Fusion
Ahmad Aloradi
Wolfgang Mack
Mohamed Elminshawi
Emanuel Habets
63
5
0
28 Jun 2022
Semi-supervised Time Domain Target Speaker Extraction with Attention
Zhepei Wang
Ritwik Giri
Shrikant Venkataramani
Umut Isik
J. Valin
Paris Smaragdis
Mike Goodwin
A. Krishnaswamy
59
7
0
18 Jun 2022
Simultaneous Speech Extraction for Multiple Target Speakers under the Meeting Scenarios
Bang Zeng
Weiqing Wang
Yuanyuan Bao
Ming Li
57
0
0
17 Jun 2022
Personalized Acoustic Echo Cancellation for Full-duplex Communications
Shimin Zhang
Ziteng Wang
Yukai Ju
Yihui Fu
Yueyue Na
Q. Fu
Linfu Xie
41
5
0
30 May 2022
NeuralEcho: A Self-Attentive Recurrent Neural Network For Unified Acoustic Echo Suppression And Speech Enhancement
Meng Yu
Yong-mei Xu
Chunlei Zhang
Shizhong Zhang
Dong Yu
50
11
0
20 May 2022
Streaming Noise Context Aware Enhancement For Automatic Speech Recognition in Multi-Talker Environments
Joseph Peter Caroselli
A. Narayanan
Yiteng Huang
27
1
0
17 May 2022
Learning Lip-Based Audio-Visual Speaker Embeddings with AV-HuBERT
Bowen Shi
Abdel-rahman Mohamed
Wei-Ning Hsu
SSL
69
18
0
15 May 2022
Cleanformer: A multichannel array configuration-invariant neural enhancement frontend for ASR in smart speakers
Joseph Peter Caroselli
A. Narayanan
N. Howard
Tom O'Malley
51
5
0
25 Apr 2022
Speaker-Aware Mixture of Mixtures Training for Weakly Supervised Speaker Extraction
Zifeng Zhao
Rongzhi Gu
Dongchao Yang
Jinchuan Tian
Yuexian Zou
59
2
0
15 Apr 2022
Text-Driven Separation of Arbitrary Sounds
Kevin Kilgour
Beat Gfeller
Qingqing Huang
A. Jansen
Scott Wisdom
Marco Tagliasacchi
100
34
0
12 Apr 2022
Listen only to me! How well can target speech extraction handle false alarms?
Marc Delcroix
K. Kinoshita
Tsubasa Ochiai
Kateřina Žmolíková
Hiroshi Sato
Tomohiro Nakatani
71
15
0
11 Apr 2022
SoundBeam: Target sound extraction conditioned on sound-class labels and enrollment clues for increased performance and continuous learning
Marc Delcroix
Jorge Bennasar Vázquez
Tsubasa Ochiai
K. Kinoshita
Yasunori Ohishi
S. Araki
VLM
83
34
0
08 Apr 2022
Heterogeneous Target Speech Separation
Hyunjae Cho
Wonbin Jung
Junhyeok Lee
Paris Smaragdis
Sanghyun Woo
92
26
0
07 Apr 2022
RaDur: A Reference-aware and Duration-robust Network for Target Sound Detection
Dongchao Yang
Helin Wang
Zhongjie Ye
Yuexian Zou
Wenwu Wang
57
0
0
05 Apr 2022
Improving Target Sound Extraction with Timestamp Information
Helin Wang
Dongchao Yang
Chao Weng
Jianwei Yu
Yuexian Zou
64
10
0
02 Apr 2022
Speaker Extraction with Co-Speech Gestures Cue
Zexu Pan
Xinyuan Qian
Haizhou Li
SLR
66
29
0
31 Mar 2022
A Comparative Study on Speaker-attributed Automatic Speech Recognition in Multi-party Meetings
Fan Yu
Zhihao Du
Shiliang Zhang
Yuxiao Lin
Linfu Xie
42
15
0
31 Mar 2022