Single-Channel Multi-talker Speech Recognition with Permutation Invariant Training

19 July 2017
Y. Qian
Xuankai Chang
Dong Yu

Papers citing "Single-Channel Multi-talker Speech Recognition with Permutation Invariant Training"

21 / 21 papers shown
Target Speaker ASR with Whisper
Alexander Polok
Dominik Klement
Sanjeev Khudanpur
Kevin Duh
J. Černocký
L. Burget
181
5
0
17 Jan 2025
Resource-Efficient Adaptation of Speech Foundation Models for Multi-Speaker ASR
Weiqing Wang
Kunal Dhawan
Taejin Park
Krishna Puvvada
Ivan Medennikov
Somshubra Majumdar
He Huang
Jagadeesh Balam
Boris Ginsburg
72
2
0
02 Sep 2024
Coarse-to-Fine Recursive Speech Separation for Unknown Number of Speakers
Zhenhao Jin
Xiang Hao
Xiangdong Su
55
4
0
30 Mar 2022
Learning to Enhance or Not: Neural Network-Based Switching of Enhanced and Observed Signals for Overlapping Speech Recognition
Hiroshi Sato
Tsubasa Ochiai
Marc Delcroix
K. Kinoshita
Naoyuki Kamo
Takafumi Moriya
70
27
0
11 Jan 2022
Multi-turn RNN-T for streaming recognition of multi-party speech
Ilya Sklyar
A. Piunova
Xianrui Zheng
Yulan Liu
114
24
0
19 Dec 2021
Revisiting joint decoding based multi-talker speech recognition with DNN acoustic model
M. Kocour
Kateřina Žmolíková
Lucas Ondel
J. Svec
Marc Delcroix
Tsubasa Ochiai
L. Burget
J. Černocký
36
1
0
31 Oct 2021
Multi-Speaker ASR Combining Non-Autoregressive Conformer CTC and Conditional Speaker Chain
Pengcheng Guo
Xuankai Chang
Shinji Watanabe
Lei Xie
48
19
0
16 Jun 2021
Should We Always Separate?: Switching Between Enhanced and Observed Signals for Overlapping Speech Recognition
Hiroshi Sato
Tsubasa Ochiai
Marc Delcroix
K. Kinoshita
Takafumi Moriya
Naoyuki Kamo
68
23
0
02 Jun 2021
MIMO Self-attentive RNN Beamformer for Multi-speaker Speech Separation
Xiyun Li
Yong-mei Xu
Meng Yu
Shi-Xiong Zhang
Jiaming Xu
Bo Xu
Dong Yu
52
14
0
17 Apr 2021
Streaming end-to-end multi-talker speech recognition
Liang Lu
Naoyuki Kanda
Jinyu Li
Jiawei Liu
75
44
0
26 Nov 2020
Streaming Multi-speaker ASR with RNN-T
Ilya Sklyar
A. Piunova
Yulan Liu
80
37
0
23 Nov 2020
Sequence to Multi-Sequence Learning via Conditional Chain Mapping for Mixture Signals
Jing Shi
Xuankai Chang
Pengcheng Guo
Shinji Watanabe
Yusuke Fujita
Jiaming Xu
Bo Xu
Lei Xie
96
22
0
25 Jun 2020
Neural Spatio-Temporal Beamformer for Target Speech Separation
Yong-mei Xu
Meng Yu
Shi-Xiong Zhang
Lianwu Chen
Chao Weng
Jianming Liu
Dong Yu
82
41
0
08 May 2020
End-to-end training of time domain audio separation and recognition
Thilo von Neumann
K. Kinoshita
Lukas Drude
Christoph Boeddeker
Marc Delcroix
Tomohiro Nakatani
Reinhold Haeb-Umbach
76
34
0
18 Dec 2019
MIMO-SPEECH: End-to-End Multi-Channel Multi-Speaker Speech Recognition
Xuankai Chang
Wangyou Zhang
Y. Qian
Jonathan Le Roux
Shinji Watanabe
95
121
0
15 Oct 2019
Auxiliary Interference Speaker Loss for Target-Speaker Speech Recognition
Naoyuki Kanda
Shota Horiguchi
R. Takashima
Yusuke Fujita
Kenji Nagamatsu
Shinji Watanabe
68
34
0
26 Jun 2019
End-to-End Monaural Multi-speaker ASR System without Pretraining
Xuankai Chang
Y. Qian
Yi Liang
Deming Chen
87
77
0
05 Nov 2018
Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation
Yi Luo
N. Mesgarani
181
1,799
0
20 Sep 2018
Deep Extractor Network for Target Speaker Recovery From Single Channel Speech Mixtures
Jun Wang
Jie Chen
Dan Su
Lianwu Chen
Meng Yu
Y. Qian
Dong Yu
93
91
0
24 Jul 2018
A Purely End-to-end System for Multi-speaker Speech Recognition
Hiroshi Seki
Takaaki Hori
Shinji Watanabe
Jonathan Le Roux
J. Hershey
54
89
0
15 May 2018
Progressive Joint Modeling in Unsupervised Single-channel Overlapped Speech Recognition
Zhehuai Chen
J. Droppo
Jinyu Li
Wayne Xiong
93
65
0
21 Jul 2017