1703.06284
Cited By
Multi-talker Speech Separation with Utterance-level Permutation Invariant Training of Deep Recurrent Neural Networks
18 March 2017
Morten Kolbaek
Dong Yu
Zheng-Hua Tan
Jesper Jensen
Papers citing
"Multi-talker Speech Separation with Utterance-level Permutation Invariant Training of Deep Recurrent Neural Networks"
50 / 127 papers shown
Title
Listen to Extract: Onset-Prompted Target Speaker Extraction
Pengjie Shen
Kangrui Chen
Shulin He
Pengru Chen
Shuqi Yuan
He Kong
Xueliang Zhang
Zehao Wang
53
0
0
08 May 2025
Annealed Multiple Choice Learning: Overcoming limitations of Winner-takes-all with annealing
David Perera
Victor Letzelter
Théo Mariotte
Adrien Cortés
Mickaël Chen
S. Essid
Gaël Richard
74
3
0
20 Jan 2025
Microphone Array Signal Processing and Deep Learning for Speech Enhancement
Reinhold Haeb-Umbach
Tomohiro Nakatani
Marc Delcroix
Christoph Boeddeker
Tsubasa Ochiai
43
0
0
13 Jan 2025
Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC
Jiawen Kang
Lingwei Meng
Mingyu Cui
Yuejiao Wang
Xixin Wu
Xunying Liu
Helen Meng
41
2
0
19 Sep 2024
USEF-TSE: Universal Speaker Embedding Free Target Speaker Extraction
Bang Zeng
Ming Li
39
2
0
04 Sep 2024
Serialized Speech Information Guidance with Overlapped Encoding Separation for Multi-Speaker Automatic Speech Recognition
Hao Shi
Yuan Gao
Zhaoheng Ni
Tatsuya Kawahara
34
2
0
01 Sep 2024
A Large-Scale Evaluation of Speech Foundation Models
Shu-Wen Yang
Heng-Jui Chang
Zili Huang
Andy T. Liu
Cheng-I Jeff Lai
...
Kushal Lakhotia
Shang-Wen Li
Abdelrahman Mohamed
Shinji Watanabe
Hung-yi Lee
38
19
0
15 Apr 2024
ConSep: a Noise- and Reverberation-Robust Speech Separation Framework by Magnitude Conditioning
Kuan-Hsun Ho
J. Hung
Berlin Chen
42
0
0
04 Mar 2024
End-to-end Online Speaker Diarization with Target Speaker Tracking
Weiqing Wang
Ming Li
39
5
0
12 Oct 2023
On Time Domain Conformer Models for Monaural Speech Separation in Noisy Reverberant Acoustic Environments
William Ravenscroft
Stefan Goetze
Thomas Hain
28
7
0
09 Oct 2023
Conformer-based Target-Speaker Automatic Speech Recognition for Single-Channel Audio
Yang Zhang
Krishna C. Puvvada
Vitaly Lavrukhin
Boris Ginsburg
38
14
0
09 Aug 2023
Mixture Encoder for Joint Speech Separation and Recognition
Simon Berger
Peter Vieting
Christoph Boeddeker
Ralf Schlüter
Reinhold Häb-Umbach
24
6
0
21 Jun 2023
An Efficient Speech Separation Network Based on Recurrent Fusion Dilated Convolution and Channel Attention
Junyu Wang
22
1
0
09 Jun 2023
An Experimental Review of Speaker Diarization methods with application to Two-Speaker Conversational Telephone Speech recordings
L. Serafini
Samuele Cornell
Giovanni Morrone
Enrico Zovato
Alessio Brutti
S. Squartini
47
9
0
29 May 2023
A Neural State-Space Model Approach to Efficient Speech Separation
Chen Chen
Chao-Han Huck Yang
Kai Li
Yuchen Hu
Pin-Jui Ku
Chng Eng Siong
37
11
0
26 May 2023
Multi-channel Speech Separation Using Spatially Selective Deep Non-linear Filters
Kristina Tesch
Timo Gerkmann
26
16
0
24 Apr 2023
Beamformer-Guided Target Speaker Extraction
Mohamed Elminshawi
Srikanth Raj Chetupalli
Emanuel Habets
23
7
0
15 Mar 2023
Multi-Scale Feature Fusion Transformer Network for End-to-End Single Channel Speech Separation
Yinhao Xu
Jian Zhou
L. Tao
H. Kwan
30
0
0
14 Dec 2022
MIMO-DBnet: Multi-channel Input and Multiple Outputs DOA-aware Beamforming Network for Speech Separation
Yanjie Fu
Haoran Yin
Meng Ge
Longbiao Wang
Gaoyan Zhang
J. Dang
Chengyun Deng
Fei Wang
CVBM
18
2
0
07 Dec 2022
Deep neural network techniques for monaural speech enhancement: state of the art analysis
P. Ochieng
30
21
0
01 Dec 2022
TF-GridNet: Integrating Full- and Sub-Band Modeling for Speech Separation
Zhongqiu Wang
Samuele Cornell
Shukjae Choi
Younglo Lee
Byeonghak Kim
Shinji Watanabe
38
121
0
22 Nov 2022
Speaker Overlap-aware Neural Diarization for Multi-party Meeting Analysis
Zhihao Du
Shiliang Zhang
Siqi Zheng
Zhijie Yan
24
14
0
18 Nov 2022
Diffusion-based Generative Speech Source Separation
Robin Scheibler
Youna Ji
Soo-Whan Chung
J. Byun
Soyeon Choe
Min-Seok Choi
DiffM
29
41
0
31 Oct 2022
DiaCorrect: End-to-end error correction for speaker diarization
Jiangyu Han
Yuhang Cao
Heng Lu
Yanhua Long
45
0
0
31 Oct 2022
Deformable Temporal Convolutional Networks for Monaural Noisy Reverberant Speech Separation
William Ravenscroft
Stefan Goetze
Thomas Hain
33
11
0
27 Oct 2022
Audio Signal Enhancement with Learning from Positive and Unlabelled Data
N. Ito
Masashi Sugiyama
21
7
0
27 Oct 2022
Position tracking of a varying number of sound sources with sliding permutation invariant training
David Diaz-Guerra
A. Politis
Tuomas Virtanen
30
5
0
26 Oct 2022
Utterance-by-utterance overlap-aware neural diarization with Graph-PIT
K. Kinoshita
Thilo von Neumann
Marc Delcroix
Christoph Boeddeker
Reinhold Haeb-Umbach
40
4
0
28 Jul 2022
Heterogeneous Separation Consistency Training for Adaptation of Unsupervised Speech Separation
Jiangyu Han
Yanhua Long
28
6
0
23 Apr 2022
RadioSES: mmWave-Based Audioradio Speech Enhancement and Separation System
M. Z. Ozturk
Chenshu Wu
Beibei Wang
Min Wu
K. Liu
27
20
0
14 Apr 2022
Small Footprint Multi-channel ConvMixer for Keyword Spotting with Centroid Based Awareness
Dianwen Ng
Jing Pang
Yanghua Xiao
Biao Tian
Qiang Fu
Eng Siong Chng
27
2
0
11 Apr 2022
SoundBeam: Target sound extraction conditioned on sound-class labels and enrollment clues for increased performance and continuous learning
Marc Delcroix
Jorge Bennasar Vázquez
Tsubasa Ochiai
K. Kinoshita
Yasunori Ohishi
S. Araki
VLM
22
32
0
08 Apr 2022
Target Confusion in End-to-end Speaker Extraction: Analysis and Approaches
Zifeng Zhao
Dongchao Yang
Rongzhi Gu
Haoran Zhang
Yuexian Zou
23
16
0
04 Apr 2022
End-to-End Multi-speaker ASR with Independent Vector Analysis
Robin Scheibler
Wangyou Zhang
Xuankai Chang
Shinji Watanabe
Y. Qian
24
2
0
01 Apr 2022
A Hybrid Continuity Loss to Reduce Over-Suppression for Time-domain Target Speaker Extraction
Zexu Pan
Meng Ge
Haizhou Li
23
17
0
31 Mar 2022
Speaker Extraction with Co-Speech Gestures Cue
Zexu Pan
Xinyuan Qian
Haizhou Li
SLR
21
27
0
31 Mar 2022
Effective data screening technique for crowdsourced speech intelligibility experiments: Evaluation with IRM-based speech enhancement
Ayako Yamamoto
Toshio Irino
S. Araki
Kenichi Arai
A. Ogawa
K. Kinoshita
Tomohiro Nakatani
17
2
0
31 Mar 2022
SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities
Hsiang-Sheng Tsai
Heng-Jui Chang
Wen-Chin Huang
Zili Huang
Kushal Lakhotia
...
Hsuan-Jui Chen
Shang-Wen Li
Shinji Watanabe
Abdel-rahman Mohamed
Hung-yi Lee
26
109
0
14 Mar 2022
Single microphone speaker extraction using unified time-frequency Siamese-Unet
Aviad Eisenberg
Sharon Gannot
Shlomo E. Chazan
30
3
0
06 Mar 2022
Audio-visual speech separation based on joint feature representation with cross-modal attention
Jun Xiong
Peng Zhang
Lei Xie
Wei Huang
Yufei Zha
Yanni Zhang
25
3
0
05 Mar 2022
Tight integration of neural- and clustering-based diarization through deep unfolding of infinite Gaussian mixture model
K. Kinoshita
Marc Delcroix
Tomoharu Iwata
BDL
25
19
0
14 Feb 2022
MixCycle: Unsupervised Speech Separation via Cyclic Mixture Permutation Invariant Training
Ertuğ Karamatlı
S. Kırbız
SSL
36
9
0
08 Feb 2022
SkiM: Skipping Memory LSTM for Low-Latency Real-Time Continuous Speech Separation
Chenda Li
Lei Yang
Weiqin Wang
Y. Qian
32
25
0
26 Jan 2022
Learning-based personal speech enhancement for teleconferencing by exploiting spatial-spectral features
Yicheng Hsu
Yonghan Lee
M. Bai
22
10
0
10 Dec 2021
Speaker Embedding-aware Neural Diarization for Flexible Number of Speakers with Textual Information
Zhihao Du
Shiliang Zhang
Siqi Zheng
Weilong Huang
Ming Lei
BDL
16
1
0
28 Nov 2021
Switching Independent Vector Analysis and Its Extension to Blind and Spatially Guided Convolutional Beamforming Algorithms
Tomohiro Nakatani
Rintaro Ikeshita
K. Kinoshita
H. Sawada
Naoyuki Kamo
S. Araki
33
8
0
20 Nov 2021
Single-channel speech separation using Soft-minimum Permutation Invariant Training
Midia Yousefi
John H. L. Hansen
21
3
0
16 Nov 2021
Reduction of Subjective Listening Effort for TV Broadcast Signals with Recurrent Neural Networks
Nils L. Westhausen
R. Huber
Hannah Baumgartner
Ragini Sinha
J. Rennies
B. Meyer
30
10
0
02 Nov 2021
SA-SDR: A novel loss function for separation of meeting style data
Thilo von Neumann
K. Kinoshita
Christoph Boeddeker
Marc Delcroix
Reinhold Haeb-Umbach
29
20
0
29 Oct 2021
Continuous Speech Separation with Recurrent Selective Attention Network
Yixuan Zhang
Zhuo Chen
Jian Wu
Takuya Yoshioka
Peidong Wang
Zhong Meng
Jinyu Li
BDL
27
7
0
28 Oct 2021