ResearchTrend.AI
Multi-talker Speech Separation with Utterance-level Permutation Invariant Training of Deep Recurrent Neural Networks

18 March 2017
Morten Kolbaek
Dong Yu
Zheng-Hua Tan
Jesper Jensen
arXiv: 1703.06284

Papers citing "Multi-talker Speech Separation with Utterance-level Permutation Invariant Training of Deep Recurrent Neural Networks"

50 / 127 papers shown
Listen to Extract: Onset-Prompted Target Speaker Extraction
Pengjie Shen
Kangrui Chen
Shulin He
Pengru Chen
Shuqi Yuan
He Kong
Xueliang Zhang
Zehao Wang
53
0
0
08 May 2025
Annealed Multiple Choice Learning: Overcoming limitations of Winner-takes-all with annealing
David Perera
Victor Letzelter
Théo Mariotte
Adrien Cortés
Mickaël Chen
S. Essid
Gaël Richard
74
3
0
20 Jan 2025
Microphone Array Signal Processing and Deep Learning for Speech Enhancement
Reinhold Haeb-Umbach
Tomohiro Nakatani
Marc Delcroix
Christoph Boeddeker
Tsubasa Ochiai
43
0
0
13 Jan 2025
Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC
Jiawen Kang
Lingwei Meng
Mingyu Cui
Yuejiao Wang
Xixin Wu
Xunying Liu
Helen Meng
41
2
0
19 Sep 2024
USEF-TSE: Universal Speaker Embedding Free Target Speaker Extraction
Bang Zeng
Ming Li
39
2
0
04 Sep 2024
Serialized Speech Information Guidance with Overlapped Encoding Separation for Multi-Speaker Automatic Speech Recognition
Hao Shi
Yuan Gao
Zhaoheng Ni
Tatsuya Kawahara
34
2
0
01 Sep 2024
A Large-Scale Evaluation of Speech Foundation Models
Shu-Wen Yang
Heng-Jui Chang
Zili Huang
Andy T. Liu
Cheng-I Jeff Lai
...
Kushal Lakhotia
Shang-Wen Li
Abdelrahman Mohamed
Shinji Watanabe
Hung-yi Lee
38
19
0
15 Apr 2024
ConSep: a Noise- and Reverberation-Robust Speech Separation Framework by Magnitude Conditioning
Kuan-Hsun Ho
J. Hung
Berlin Chen
42
0
0
04 Mar 2024
End-to-end Online Speaker Diarization with Target Speaker Tracking
Weiqing Wang
Ming Li
39
5
0
12 Oct 2023
On Time Domain Conformer Models for Monaural Speech Separation in Noisy Reverberant Acoustic Environments
William Ravenscroft
Stefan Goetze
Thomas Hain
28
7
0
09 Oct 2023
Conformer-based Target-Speaker Automatic Speech Recognition for Single-Channel Audio
Yang Zhang
Krishna C. Puvvada
Vitaly Lavrukhin
Boris Ginsburg
38
14
0
09 Aug 2023
Mixture Encoder for Joint Speech Separation and Recognition
Simon Berger
Peter Vieting
Christoph Boeddeker
Ralf Schlüter
Reinhold Häb-Umbach
24
6
0
21 Jun 2023
An Efficient Speech Separation Network Based on Recurrent Fusion Dilated Convolution and Channel Attention
Junyu Wang
22
1
0
09 Jun 2023
An Experimental Review of Speaker Diarization methods with application to Two-Speaker Conversational Telephone Speech recordings
L. Serafini
Samuele Cornell
Giovanni Morrone
Enrico Zovato
Alessio Brutti
S. Squartini
47
9
0
29 May 2023
A Neural State-Space Model Approach to Efficient Speech Separation
Chen Chen
Chao-Han Huck Yang
Kai Li
Yuchen Hu
Pin-Jui Ku
Chng Eng Siong
37
11
0
26 May 2023
Multi-channel Speech Separation Using Spatially Selective Deep Non-linear Filters
Kristina Tesch
Timo Gerkmann
26
16
0
24 Apr 2023
Beamformer-Guided Target Speaker Extraction
Mohamed Elminshawi
Srikanth Raj Chetupalli
Emanuel Habets
23
7
0
15 Mar 2023
Multi-Scale Feature Fusion Transformer Network for End-to-End Single Channel Speech Separation
Yinhao Xu
Jian Zhou
L. Tao
H. Kwan
30
0
0
14 Dec 2022
MIMO-DBnet: Multi-channel Input and Multiple Outputs DOA-aware Beamforming Network for Speech Separation
Yanjie Fu
Haoran Yin
Meng Ge
Longbiao Wang
Gaoyan Zhang
J. Dang
Chengyun Deng
Fei Wang
CVBM
18
2
0
07 Dec 2022
Deep neural network techniques for monaural speech enhancement: state of the art analysis
P. Ochieng
30
21
0
01 Dec 2022
TF-GridNet: Integrating Full- and Sub-Band Modeling for Speech Separation
Zhongqiu Wang
Samuele Cornell
Shukjae Choi
Younglo Lee
Byeonghak Kim
Shinji Watanabe
38
121
0
22 Nov 2022
Speaker Overlap-aware Neural Diarization for Multi-party Meeting Analysis
Zhihao Du
Shiliang Zhang
Siqi Zheng
Zhijie Yan
24
14
0
18 Nov 2022
Diffusion-based Generative Speech Source Separation
Robin Scheibler
Youna Ji
Soo-Whan Chung
J. Byun
Soyeon Choe
Min-Seok Choi
DiffM
29
41
0
31 Oct 2022
DiaCorrect: End-to-end error correction for speaker diarization
Jiangyu Han
Yuhang Cao
Heng Lu
Yanhua Long
45
0
0
31 Oct 2022
Deformable Temporal Convolutional Networks for Monaural Noisy Reverberant Speech Separation
William Ravenscroft
Stefan Goetze
Thomas Hain
33
11
0
27 Oct 2022
Audio Signal Enhancement with Learning from Positive and Unlabelled Data
N. Ito
Masashi Sugiyama
21
7
0
27 Oct 2022
Position tracking of a varying number of sound sources with sliding permutation invariant training
David Diaz-Guerra
A. Politis
Tuomas Virtanen
30
5
0
26 Oct 2022
Utterance-by-utterance overlap-aware neural diarization with Graph-PIT
K. Kinoshita
Thilo von Neumann
Marc Delcroix
Christoph Boeddeker
Reinhold Haeb-Umbach
40
4
0
28 Jul 2022
Heterogeneous Separation Consistency Training for Adaptation of Unsupervised Speech Separation
Jiangyu Han
Yanhua Long
28
6
0
23 Apr 2022
RadioSES: mmWave-Based Audioradio Speech Enhancement and Separation System
M. Z. Ozturk
Chenshu Wu
Beibei Wang
Min Wu
K. Liu
27
20
0
14 Apr 2022
Small Footprint Multi-channel ConvMixer for Keyword Spotting with Centroid Based Awareness
Dianwen Ng
Jing Pang
Yanghua Xiao
Biao Tian
Qiang Fu
Eng Siong Chng
27
2
0
11 Apr 2022
SoundBeam: Target sound extraction conditioned on sound-class labels and enrollment clues for increased performance and continuous learning
Marc Delcroix
Jorge Bennasar Vázquez
Tsubasa Ochiai
K. Kinoshita
Yasunori Ohishi
S. Araki
VLM
22
32
0
08 Apr 2022
Target Confusion in End-to-end Speaker Extraction: Analysis and Approaches
Zifeng Zhao
Dongchao Yang
Rongzhi Gu
Haoran Zhang
Yuexian Zou
23
16
0
04 Apr 2022
End-to-End Multi-speaker ASR with Independent Vector Analysis
Robin Scheibler
Wangyou Zhang
Xuankai Chang
Shinji Watanabe
Y. Qian
24
2
0
01 Apr 2022
A Hybrid Continuity Loss to Reduce Over-Suppression for Time-domain Target Speaker Extraction
Zexu Pan
Meng Ge
Haizhou Li
23
17
0
31 Mar 2022
Speaker Extraction with Co-Speech Gestures Cue
Zexu Pan
Xinyuan Qian
Haizhou Li
SLR
21
27
0
31 Mar 2022
Effective data screening technique for crowdsourced speech intelligibility experiments: Evaluation with IRM-based speech enhancement
Ayako Yamamoto
Toshio Irino
S. Araki
Kenichi Arai
A. Ogawa
K. Kinoshita
Tomohiro Nakatani
17
2
0
31 Mar 2022
SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities
Hsiang-Sheng Tsai
Heng-Jui Chang
Wen-Chin Huang
Zili Huang
Kushal Lakhotia
...
Hsuan-Jui Chen
Shang-Wen Li
Shinji Watanabe
Abdel-rahman Mohamed
Hung-yi Lee
26
109
0
14 Mar 2022
Single microphone speaker extraction using unified time-frequency Siamese-Unet
Aviad Eisenberg
Sharon Gannot
Shlomo E. Chazan
30
3
0
06 Mar 2022
Audio-visual speech separation based on joint feature representation with cross-modal attention
Jun Xiong
Peng Zhang
Lei Xie
Wei Huang
Yufei Zha
Yanni Zhang
25
3
0
05 Mar 2022
Tight integration of neural- and clustering-based diarization through deep unfolding of infinite Gaussian mixture model
K. Kinoshita
Marc Delcroix
Tomoharu Iwata
BDL
25
19
0
14 Feb 2022
MixCycle: Unsupervised Speech Separation via Cyclic Mixture Permutation Invariant Training
Ertuğ Karamatlı
S. Kırbız
SSL
36
9
0
08 Feb 2022
SkiM: Skipping Memory LSTM for Low-Latency Real-Time Continuous Speech Separation
Chenda Li
Lei Yang
Weiqin Wang
Y. Qian
32
25
0
26 Jan 2022
Learning-based personal speech enhancement for teleconferencing by exploiting spatial-spectral features
Yicheng Hsu
Yonghan Lee
M. Bai
22
10
0
10 Dec 2021
Speaker Embedding-aware Neural Diarization for Flexible Number of Speakers with Textual Information
Zhihao Du
Shiliang Zhang
Siqi Zheng
Weilong Huang
Ming Lei
BDL
16
1
0
28 Nov 2021
Switching Independent Vector Analysis and Its Extension to Blind and Spatially Guided Convolutional Beamforming Algorithms
Tomohiro Nakatani
Rintaro Ikeshita
K. Kinoshita
H. Sawada
Naoyuki Kamo
S. Araki
33
8
0
20 Nov 2021
Single-channel speech separation using Soft-minimum Permutation Invariant Training
Midia Yousefi
John H. L. Hansen
21
3
0
16 Nov 2021
Reduction of Subjective Listening Effort for TV Broadcast Signals with Recurrent Neural Networks
Nils L. Westhausen
R. Huber
Hannah Baumgartner
Ragini Sinha
J. Rennies
B. Meyer
30
10
0
02 Nov 2021
SA-SDR: A novel loss function for separation of meeting style data
Thilo von Neumann
K. Kinoshita
Christoph Boeddeker
Marc Delcroix
Reinhold Haeb-Umbach
29
20
0
29 Oct 2021
Continuous Speech Separation with Recurrent Selective Attention Network
Yixuan Zhang
Zhuo Chen
Jian Wu
Takuya Yoshioka
Peidong Wang
Zhong Meng
Jinyu Li
BDL
27
7
0
28 Oct 2021