End-to-End Multi-Speaker Speech Recognition using Speaker Embeddings and Transfer Learning

Interspeech, 2019

13 August 2019

Pavel Denisov

Ngoc Thang Vu

Papers citing "End-to-End Multi-Speaker Speech Recognition using Speaker Embeddings and Transfer Learning"

Survey of End-to-End Multi-Speaker Automatic Speech Recognition for Monaural Audio

Xinlu He

Jacob Whitehill

326

16 May 2025

Alignment-Free Training for Transducer-based Multi-Talker ASR

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

Marc Delcroix

262

30 Sep 2024

SURT 2.0: Advances in Transducer-based Multi-talker Speech Recognition

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023

Desh Raj

Daniel Povey

Sanjeev Khudanpur

393

18 Jun 2023

End-to-End Joint Target and Non-Target Speakers ASR

Interspeech, 2023

...

Atsushi Ando

149

04 Jun 2023

Neural Target Speech Extraction: An Overview

IEEE Signal Processing Magazine (IEEE Signal Process. Mag.), 2023

238

146

31 Jan 2023

Speaker Adaptation for End-To-End Speech Recognition Systems in Noisy Environments

Automatic Speech Recognition & Understanding (ASRU), 2022

347

16 Nov 2022

Combining Contrastive and Non-Contrastive Losses for Fine-Tuning Pretrained Models in Speech Analysis

Spoken Language Technology Workshop (SLT), 2022

Florian Lux

Ching-Yi Chen

Ngoc Thang Vu

122

21 Oct 2022

Streaming Target-Speaker ASR with Neural Transducer

Interspeech, 2022

377

09 Sep 2022

Closing the Gap between Single-User and Multi-User VoiceFilter-Lite

The Speaker and Language Recognition Workshop (Odyssey), 2022

210

24 Feb 2022

Speaker conditioning of acoustic models using affine transformation for multi-speaker speech recognition

Automatic Speech Recognition & Understanding (ASRU), 2021

Midia Yousefi

John H.L. Hansen

197

30 Oct 2021

Investigations on Speech Recognition Systems for Low-Resource Dialectal Arabic-English Code-Switching Speech

Computer Speech and Language (CSL), 2021

213

29 Aug 2021

Multi-user VoiceFilter-Lite via Attentive Speaker Embedding

344

02 Jul 2021

Should We Always Separate?: Switching Between Enhanced and Observed Signals for Overlapping Speech Recognition

Interspeech, 2021

171

02 Jun 2021

VoiceFilter-Lite: Streaming Targeted Voice Separation for On-Device Speech Recognition

Interspeech, 2020

...

225

09 Sep 2020

Training for Speech Recognition on Coprocessors

Sebastian Baunsgaard

S. Wrede

Pınar Tözün

176

22 Mar 2020

Supervised Speaker Embedding De-Mixing in Two-Speaker Environment

Spoken Language Technology Workshop (SLT), 2020

Yanpei Shi

Thomas Hain

148

14 Jan 2020