Improving speaker discrimination of target speech extraction with time-domain SpeakerBeam

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
23 January 2020
Marc Delcroix
Tsubasa Ochiai
Kateřina Žmolíková
K. Kinoshita
Naohiro Tawara
Tomohiro Nakatani
S. Araki

Papers citing "Improving speaker discrimination of target speech extraction with time-domain SpeakerBeam"

50 / 71 papers shown
Binaural Target Speaker Extraction using Individualized HRTF
Yoav Ellinson
Sharon Gannot
290
1
0
25 Jul 2025
Two-stage Audio-Visual Target Speaker Extraction System for Real-Time Processing On Edge Device
Zixuan Li
Xueliang Zhang
Lei Miao
Zhipeng Yan
Ying Sun
Chong Zhu
230
0
0
28 May 2025
Unified Architecture and Unsupervised Speech Disentanglement for Speaker Embedding-Free Enrollment in Personalized Speech Enhancement
Ziling Huang
Haixin Guan
Yanhua Long
267
1
0
18 May 2025
TS-SUPERB: A Target Speech Processing Benchmark for Speech Self-Supervised Learning Models
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Junyi Peng
Takanori Ashihara
Marc Delcroix
Tsubasa Ochiai
Oldrich Plchot
Shoko Araki
J. Černocký
ELM
327
3
0
10 May 2025
Listen to Extract: Onset-Prompted Target Speaker Extraction
Pengjie Shen
Kangrui Chen
Shulin He
Pengru Chen
Shuqi Yuan
He Kong
Xueliang Zhang
Zehao Wang
395
3
0
08 May 2025
Contextual Speech Extraction: Leveraging Textual History as an Implicit Cue for Target Speech Extraction
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Minsu Kim
Rodrigo Mira
Honglie Chen
Stavros Petridis
Maja Pantic
347
3
0
13 Mar 2025
End-to-End Multi-Microphone Speaker Extraction Using Relative Transfer Functions
Aviad Eisenberg
Sharon Gannot
Shlomo E. Chazan
249
5
0
10 Feb 2025
SEF-PNet: Speaker Encoder-Free Personalized Speech Enhancement with Local and Global Contexts Aggregation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Ziling Huang
Haixin Guan
Haoran Wei
Yanhua Long
172
5
0
20 Jan 2025
Investigation of Speaker Representation for Target-Speaker Speech Processing
Spoken Language Technology Workshop (SLT), 2024
Takanori Ashihara
Takafumi Moriya
Shota Horiguchi
Junyi Peng
Tsubasa Ochiai
Marc Delcroix
Kohei Matsuura
Hiroshi Sato
267
2
0
15 Oct 2024
Two-stage Framework for Robust Speech Emotion Recognition Using Target Speaker Extraction in Human Speech Noise Conditions
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2024
Jinyi Mi
Xiaohan Shi
D. Ma
Jiajun He
Takuya Fujimura
Tomoki Toda
242
5
0
29 Sep 2024
Generative Speech Foundation Model Pretraining for High-Quality Speech Extraction and Restoration
Pin-Jui Ku
Alexander H. Liu
Roman Korostik
Sung-Feng Huang
Szu-Wei Fu
Ante Jukić
329
21
0
24 Sep 2024
WeSep: A Scalable and Flexible Toolkit Towards Generalizable Target Speaker Extraction
Interspeech (Interspeech), 2024
Shuai Wang
Ke Zhang
Shaoxiong Lin
Junjie Li
Xuefei Wang
Meng Ge
Jianwei Yu
Yanmin Qian
Haizhou Li
235
22
0
24 Sep 2024
On the effectiveness of enrollment speech augmentation for Target Speaker Extraction
Spoken Language Technology Workshop (SLT), 2024
Junjie Li
Ke Zhang
Shuai Wang
Haizhou Li
Man-Wai Mak
Kong Aik Lee
179
12
0
15 Sep 2024
DENSE: Dynamic Embedding Causal Target Speech Extraction
Yiwen Wang
Zeyu Yuan
Xihong Wu
234
1
0
10 Sep 2024
USEF-TSE: Universal Speaker Embedding Free Target Speaker Extraction
IEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2024
Bang Zeng
Ming Li
492
21
0
04 Sep 2024
SpeakerBeam-SS: Real-time Target Speaker Extraction with Lightweight Conv-TasNet and State Space Modeling
Hiroshi Sato
Takafumi Moriya
Masato Mimura
Shota Horiguchi
Tsubasa Ochiai
Takanori Ashihara
Atsushi Ando
Kentaro Shinayama
Marc Delcroix
255
14
0
01 Jul 2024
Target Speech Extraction with Pre-trained Self-supervised Learning Models
Junyi Peng
Marc Delcroix
Tsubasa Ochiai
Oldrich Plchot
Shoko Araki
J. Černocký
284
19
0
17 Feb 2024
Probing Self-supervised Learning Models with Target Speech Extraction
Junyi Peng
Marc Delcroix
Tsubasa Ochiai
Oldrich Plchot
Takanori Ashihara
Shoko Araki
J. Černocký
307
6
0
17 Feb 2024
ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models
Jee-weon Jung
Wangyou Zhang
Jiatong Shi
Zakaria Aldeneh
Takuya Higuchi
B. Theobald
Ahmed Hussen Abdelaziz
Shinji Watanabe
505
49
0
30 Jan 2024
Spatial-Temporal Activity-Informed Diarization and Separation
Yicheng Hsu
Ssuhan Chen
Mingsian R. Bai
259
0
0
30 Jan 2024
3S-TSE: Efficient Three-Stage Target Speaker Extraction for Real-Time and Low-Resource Applications
Shulin He
Jinjiang Liu
Hao Li
Yang-Rui Yang
Fei Chen
Xueliang Zhang
296
9
0
18 Dec 2023
Typing to Listen at the Cocktail Party: Text-Guided Target Speaker Extraction
IEEE Transactions on Cognitive and Developmental Systems (IEEE TCDS), 2023
Xiang Hao
Jibin Wu
Jianwei Yu
Chenglin Xu
Kay Chen Tan
441
19
0
11 Oct 2023
The second multi-channel multi-party meeting transcription challenge (M2MeT 2.0): A benchmark for speaker-attributed ASR
Automatic Speech Recognition & Understanding (ASRU), 2023
Yuhao Liang
Mohan Shi
Fan Yu
Yangze Li
Shiliang Zhang
...
Jian Wu
Zhuo Chen
Kong Aik Lee
Zhijie Yan
Hui Bu
317
10
0
24 Sep 2023
Target Speech Extraction with Conditional Diffusion Model
Interspeech (Interspeech), 2023
Naoyuki Kamo
Marc Delcroix
Tomohiro Nakatani
DiffM
281
30
0
08 Aug 2023
MC-SpEx: Towards Effective Speaker Extraction with Multi-Scale Interfusion and Conditional Speaker Modulation
Interspeech (Interspeech), 2023
Jun Chen
Wei Rao
Zehao Wang
Jiuxin Lin
Yukai Ju
Shulin He
Yannan Wang
Zhiyong Wu
296
21
0
28 Jun 2023
Beamformer-Guided Target Speaker Extraction
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Mohamed Elminshawi
Srikanth Raj Chetupalli
Emanuel Habets
179
12
0
15 Mar 2023
Target Sound Extraction with Variable Cross-modality Clues
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Chenda Li
Yao Qian
Zhuo Chen
Dongmei Wang
Takuya Yoshioka
Shujie Liu
Y. Qian
Michael Zeng
VLM
205
20
0
15 Mar 2023
Online Binaural Speech Separation of Moving Speakers With a Wavesplit Network
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Cong Han
N. Mesgarani
177
7
0
13 Mar 2023
A two-stage speaker extraction algorithm under adverse acoustic conditions using a single-microphone
European Signal Processing Conference (EUSIPCO), 2023
Aviad Eisenberg
Sharon Gannot
Shlomo E. Chazan
392
3
0
13 Mar 2023
X-SepFormer: End-to-end Speaker Extraction Network with Explicit Optimization on Speaker Confusion
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Kai Liu
Z.C. Du
Xucheng Wan
Huan Zhou
296
45
0
09 Mar 2023
A Framework for Unified Real-time Personalized and Non-Personalized Speech Enhancement
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Zhepei Wang
Ritwik Giri
Devansh P. Shah
J. Valin
Mike Goodwin
Paris Smaragdis
183
10
0
23 Feb 2023
Improving Target Speaker Extraction with Sparse LDA-transformed Speaker Embeddings
Kai Liu
Xucheng Wan
Z.C. Du
Huan Zhou
VLM
179
1
0
16 Jan 2023
Array Configuration-Agnostic Personalized Speech Enhancement using Long-Short-Term Spatial Coherence
Journal of the Acoustical Society of America (JASA), 2022
Yicheng Hsu
Yonghan Lee
M. Bai
270
5
0
16 Nov 2022
Breaking the trade-off in personalized speech enhancement with cross-task knowledge distillation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
H. Taherian
Sefik Emre Eskimez
Takuya Yoshioka
193
1
0
05 Nov 2022
Real-Time Joint Personalized Speech Enhancement and Acoustic Echo Cancellation
Interspeech (Interspeech), 2022
Sefik Emre Eskimez
Takuya Yoshioka
Alex Ju
M. Tang
Tanel Pärnamaa
Huaming Wang
277
7
0
04 Nov 2022
Hierarchical speaker representation for target speaker extraction
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Shulin He
Huaiwen Zhang
Wei Rao
Kanghao Zhang
Yukai Ju
Yang-Rui Yang
Xueliang Zhang
358
16
0
28 Oct 2022
Deformable Temporal Convolutional Networks for Monaural Noisy Reverberant Speech Separation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
William Ravenscroft
Stefan Goetze
Thomas Hain
404
13
0
27 Oct 2022
Quantitative Evidence on Overlooked Aspects of Enrollment Speaker Embeddings for Target Speaker Separation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Xiaoyu Liu
Xu Li
Joan Serrà
224
10
0
23 Oct 2022
Streaming Target-Speaker ASR with Neural Transducer
Interspeech (Interspeech), 2022
Takafumi Moriya
Hiroshi Sato
Tsubasa Ochiai
Marc Delcroix
T. Shinozaki
382
27
0
09 Sep 2022
Analysis of impact of emotions on target speech extraction and speech separation
International Workshop on Acoustic Signal Enhancement (IWAENC), 2022
Jan Švec
Kateřina Žmolíková
M. Kocour
Marc Delcroix
Tsubasa Ochiai
Ladislav Mošner
Jan "Honza" Černocký
203
7
0
15 Aug 2022
Multi-channel target speech enhancement based on ERB-scaled spatial coherence features
Yicheng Hsu
Yonghan Lee
M. Bai
175
3
0
17 Jul 2022
Semi-supervised Time Domain Target Speaker Extraction with Attention
Zhepei Wang
Ritwik Giri
Shrikant Venkataramani
Umut Isik
J. Valin
Paris Smaragdis
Mike Goodwin
A. Krishnaswamy
191
8
0
18 Jun 2022
Strategies to Improve Robustness of Target Speech Extraction to Enrollment Variations
Interspeech (Interspeech), 2022
Hiroshi Sato
Tsubasa Ochiai
Marc Delcroix
K. Kinoshita
Takafumi Moriya
Naoki Makishima
Mana Ihori
Tomohiro Tanaka
Ryo Masumura
149
6
0
16 Jun 2022
Personalized Acoustic Echo Cancellation for Full-duplex Communications
Interspeech (Interspeech), 2022
Shimin Zhang
Ziteng Wang
Yukai Ju
Yihui Fu
Yueyue Na
Q. Fu
Linfu Xie
302
5
0
30 May 2022
Speaker Reinforcement Using Target Source Extraction for Robust Automatic Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Catalin Zorila
R. Doddipatla
256
11
0
09 May 2022
Speaker-Aware Mixture of Mixtures Training for Weakly Supervised Speaker Extraction
Interspeech (Interspeech), 2022
Zifeng Zhao
Rongzhi Gu
Dongchao Yang
Jinchuan Tian
Yuexian Zou
173
2
0
15 Apr 2022
Listen only to me! How well can target speech extraction handle false alarms?
Interspeech (Interspeech), 2022
Marc Delcroix
K. Kinoshita
Tsubasa Ochiai
Kateřina Žmolíková
Hiroshi Sato
Tomohiro Nakatani
228
17
0
11 Apr 2022
SoundBeam: Target sound extraction conditioned on sound-class labels and enrollment clues for increased performance and continuous learning
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Marc Delcroix
Jorge Bennasar Vázquez
Tsubasa Ochiai
K. Kinoshita
Yasunori Ohishi
S. Araki
VLM
402
46
0
08 Apr 2022
Target Confusion in End-to-end Speaker Extraction: Analysis and Approaches
Interspeech (Interspeech), 2022
Zifeng Zhao
Dongchao Yang
Rongzhi Gu
Haoran Zhang
Yuexian Zou
215
29
0
04 Apr 2022
A Hybrid Continuity Loss to Reduce Over-Suppression for Time-domain Target Speaker Extraction
Interspeech (Interspeech), 2022
Zexu Pan
Meng Ge
Haizhou Li
316
26
0
31 Mar 2022