End-to-End Multi-Speaker Speech Recognition using Speaker Embeddings and Transfer Learning

Interspeech (Interspeech), 2019
13 August 2019
Pavel Denisov
Ngoc Thang Vu

Papers citing "End-to-End Multi-Speaker Speech Recognition using Speaker Embeddings and Transfer Learning"

16 / 16 papers shown
Survey of End-to-End Multi-Speaker Automatic Speech Recognition for Monaural Audio
Xinlu He
Jacob Whitehill
326
4
0
16 May 2025
Alignment-Free Training for Transducer-based Multi-Talker ASR
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Takafumi Moriya
Shota Horiguchi
Marc Delcroix
Ryo Masumura
Takanori Ashihara
Hiroshi Sato
Kohei Matsuura
Masato Mimura
262
9
0
30 Sep 2024
SURT 2.0: Advances in Transducer-based Multi-talker Speech Recognition
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Desh Raj
Daniel Povey
Sanjeev Khudanpur
VLM
393
17
0
18 Jun 2023
End-to-End Joint Target and Non-Target Speakers ASR
Interspeech (Interspeech), 2023
Ryo Masumura
Naoki Makishima
Taiga Yamane
Yoshihiko Yamazaki
Saki Mizuno
...
Akihiko Takashima
Satoshi Suzuki
Takafumi Moriya
Nobukatsu Hojo
Atsushi Ando
149
8
0
04 Jun 2023
Neural Target Speech Extraction: An Overview
IEEE Signal Processing Magazine (IEEE Signal Process. Mag.), 2023
Kateřina Žmolíková
Marc Delcroix
Tsubasa Ochiai
K. Kinoshita
Jan "Honza" Černocký
Dong Yu
238
146
0
31 Jan 2023
Speaker Adaptation for End-To-End Speech Recognition Systems in Noisy Environments
Automatic Speech Recognition & Understanding (ASRU), 2022
Dominik Wagner
Ilja Baumann
Sebastian P. Bayerl
Korbinian Riedhammer
Tobias Bocklet
347
3
0
16 Nov 2022
Combining Contrastive and Non-Contrastive Losses for Fine-Tuning Pretrained Models in Speech Analysis
Spoken Language Technology Workshop (SLT), 2022
Florian Lux
Ching-Yi Chen
Ngoc Thang Vu
122
1
0
21 Oct 2022
Streaming Target-Speaker ASR with Neural Transducer
Interspeech (Interspeech), 2022
Takafumi Moriya
Hiroshi Sato
Tsubasa Ochiai
Marc Delcroix
T. Shinozaki
377
27
0
09 Sep 2022
Closing the Gap between Single-User and Multi-User VoiceFilter-Lite
The Speaker and Language Recognition Workshop (Odyssey), 2022
R. Rikhye
Quan Wang
Qiao Liang
Yanzhang He
Ian McGraw
VLM
210
8
0
24 Feb 2022
Speaker conditioning of acoustic models using affine transformation for multi-speaker speech recognition
Automatic Speech Recognition & Understanding (ASRU), 2021
Midia Yousefi
John H.L. Hansen
197
7
0
30 Oct 2021
Investigations on Speech Recognition Systems for Low-Resource Dialectal Arabic-English Code-Switching Speech
Computer Speech and Language (CSL), 2021
Injy Hamed
Pavel Denisov
C. Li
Mohamed S. Elmahdy
Slim Abdennadher
Ngoc Thang Vu
213
42
0
29 Aug 2021
Multi-user VoiceFilter-Lite via Attentive Speaker Embedding
R. Rikhye
Quan Wang
Qiao Liang
Yanzhang He
Ian McGraw
344
12
0
02 Jul 2021
Should We Always Separate?: Switching Between Enhanced and Observed Signals for Overlapping Speech Recognition
Interspeech (Interspeech), 2021
Hiroshi Sato
Tsubasa Ochiai
Marc Delcroix
K. Kinoshita
Takafumi Moriya
Naoyuki Kamo
171
25
0
02 Jun 2021
VoiceFilter-Lite: Streaming Targeted Voice Separation for On-Device Speech Recognition
Interspeech (Interspeech), 2020
Quan Wang
Ignacio López Moreno
Mert Saglam
K. Wilson
Alan Chiao
...
Yanzhang He
Wei Li
Jason W. Pelecanos
M. Nika
A. Gruenstein
VLM
225
98
0
09 Sep 2020
Training for Speech Recognition on Coprocessors
Sebastian Baunsgaard
S. Wrede
Pınar Tözün
176
6
0
22 Mar 2020
Supervised Speaker Embedding De-Mixing in Two-Speaker Environment
Spoken Language Technology Workshop (SLT), 2020
Yanpei Shi
Thomas Hain
148
7
0
14 Jan 2020