Streaming Multi-speaker ASR with RNN-T

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020

23 November 2020

Papers citing "Streaming Multi-speaker ASR with RNN-T"

Scaling Multi-Talker ASR with Speaker-Agnostic Activity Streams

134

04 Oct 2025

SEAL: Speaker Error Correction using Acoustic-conditioned Large Language Models

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025

295

14 Jan 2025

Alignment-Free Training for Transducer-based Multi-Talker ASR

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

Marc Delcroix

264

30 Sep 2024

AG-LSEC: Audio Grounded Lexical Speaker Error Correction

Rohit Paturi

Xiang Li

S. Srinivasan

248

25 Jun 2024

SA-SOT: Speaker-Aware Serialized Output Training for Multi-Talker ASR

Linhao Dong

236

04 Mar 2024

Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction

IEEE Transactions on Audio, Speech, and Language Processing (IEEE TASLP), 2024

208

03 Jan 2024

End-to-end Multichannel Speaker-Attributed ASR: Speaker Guided Decoder and Input Feature Analysis

296

16 Oct 2023

t-SOT FNT: Streaming Multi-talker ASR with Text-only Domain Adaptation Capability

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Zhuo Chen

208

15 Sep 2023

Conformer-based Target-Speaker Automatic Speech Recognition for Single-Channel Audio

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Yang Zhang

Krishna C. Puvvada

Vitaly Lavrukhin

Boris Ginsburg

193

09 Aug 2023

Exploring the Integration of Speech Separation and Recognition with Self-Supervised Learning Representation

IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2023

Wangyou Zhang

248

23 Jul 2023

Mixture Encoder for Joint Speech Separation and Recognition

Interspeech (Interspeech), 2023

235

21 Jun 2023

SURT 2.0: Advances in Transducer-based Multi-talker Speech Recognition

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023

Desh Raj

Daniel Povey

Sanjeev Khudanpur

393

18 Jun 2023

End-to-End Joint Target and Non-Target Speakers ASR

Interspeech (Interspeech), 2023

...

Atsushi Ando

152

04 Jun 2023

On Word Error Rate Definitions and their Efficient Computation for Multi-Speaker Speech Recognition Systems

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

238

29 Nov 2022

Simulating realistic speech overlaps improves multi-talker ASR

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

347

27 Oct 2022

VarArray Meets t-SOT: Advancing the State of the Art of Streaming Distant Conversational Speech Recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

347

12 Sep 2022

Streaming Target-Speaker ASR with Neural Transducer

Interspeech (Interspeech), 2022

378

09 Sep 2022

Comparison and Analysis of New Curriculum Criteria for End-to-End ASR

Interspeech (Interspeech), 2022

Georgios Karakasidis

Tamás Grósz

M. Kurimo

169

10 Aug 2022

Tandem Multitask Training of Speaker Diarisation and Speech Recognition for Meeting Transcription

Interspeech (Interspeech), 2022

Xianrui Zheng

Chuxu Zhang

P. Woodland

182

08 Jul 2022

Separator-Transducer-Segmenter: Streaming Recognition and Segmentation of Multi-party Speech

Interspeech (Interspeech), 2022

Ilya Sklyar

A. Piunova

Christian Osendorfer

164

10 May 2022

The RoyalFlush System of Speech Recognition for M2MeT Challenge

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

254

03 Feb 2022

Streaming Multi-Talker ASR with Token-Level Serialized Output Training

Interspeech (Interspeech), 2022

508

02 Feb 2022

Endpoint Detection for Streaming End-to-End Multi-talker ASR

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Liang Lu

Jinyu Li

Yifan Gong

274

24 Jan 2022

Multi-turn RNN-T for streaming recognition of multi-party speech

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021

375

19 Dec 2021

Recent Advances in End-to-End Automatic Speech Recognition

APSIPA Transactions on Signal and Information Processing (TASIP), 2021

Jinyu Li

509

443

02 Nov 2021

Continuous Streaming Multi-Talker ASR with Dual-path Transducers

154

17 Sep 2021

A Comparative Study of Modular and Joint Approaches for Speaker-Attributed ASR on Monaural Long-Form Audio

Jian Wu

200

06 Jul 2021

Unified Autoregressive Modeling for Joint End-to-End Multi-Talker Overlapped Speech Recognition and Speaker Attribute Estimation

292

04 Jul 2021

End-to-End Speaker-Attributed ASR with Transformer

Interspeech (Interspeech), 2021

260

05 Apr 2021

Streaming Multi-talker Speech Recognition with Joint Speaker Identification

Interspeech (Interspeech), 2021

239

05 Apr 2021

Large-Scale Pre-Training of End-to-End Multi-Talker ASR for Meeting Transcription with Single Distant Microphone

Interspeech (Interspeech), 2021

323

31 Mar 2021