Streaming Multi-speaker ASR with RNN-T

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
23 November 2020
Ilya Sklyar
A. Piunova
Yulan Liu
arXiv: 2011.11671

Papers citing "Streaming Multi-speaker ASR with RNN-T"

31 / 31 papers shown
Scaling Multi-Talker ASR with Speaker-Agnostic Activity Streams
Xiluo He
Alexander Polok
Jesus Villalba
Thomas Thebaud
Matthew Maciejewski
134
2
0
04 Oct 2025
SEAL: Speaker Error Correction using Acoustic-conditioned Large Language Models
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Anurag Kumar
Rohit Paturi
Amber Afshan
S. Srinivasan
295
2
0
14 Jan 2025
Alignment-Free Training for Transducer-based Multi-Talker ASR
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Takafumi Moriya
Shota Horiguchi
Marc Delcroix
Ryo Masumura
Takanori Ashihara
Hiroshi Sato
Kohei Matsuura
Masato Mimura
264
9
0
30 Sep 2024
AG-LSEC: Audio Grounded Lexical Speaker Error Correction
Rohit Paturi
Xiang Li
S. Srinivasan
248
3
0
25 Jun 2024
SA-SOT: Speaker-Aware Serialized Output Training for Multi-Talker ASR
Zhiyun Fan
Linhao Dong
Jun Zhang
Lu Lu
Zejun Ma
236
12
0
04 Mar 2024
Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction
IEEE Transactions on Audio, Speech, and Language Processing (IEEE TASLP), 2024
Minchan Kim
Myeonghun Jeong
Byoung Jin Choi
Semin Kim
Joun Yeop Lee
Nam Soo Kim
AI4TS
208
4
0
03 Jan 2024
End-to-end Multichannel Speaker-Attributed ASR: Speaker Guided Decoder and Input Feature Analysis
Can Cui
Imran A. Sheikh
Mostafa Sadeghi
Emmanuel Vincent
296
5
0
16 Oct 2023
t-SOT FNT: Streaming Multi-talker ASR with Text-only Domain Adaptation Capability
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Jian Wu
Naoyuki Kanda
Takuya Yoshioka
Rui Zhao
Zhuo Chen
Jinyu Li
208
6
0
15 Sep 2023
Conformer-based Target-Speaker Automatic Speech Recognition for Single-Channel Audio
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Yang Zhang
Krishna C. Puvvada
Vitaly Lavrukhin
Boris Ginsburg
193
21
0
09 Aug 2023
Exploring the Integration of Speech Separation and Recognition with Self-Supervised Learning Representation
IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2023
Yoshiki Masuyama
Xuankai Chang
Wangyou Zhang
Samuele Cornell
Zhongqiu Wang
Nobutaka Ono
Y. Qian
Shinji Watanabe
248
8
0
23 Jul 2023
Mixture Encoder for Joint Speech Separation and Recognition
Interspeech, 2023
Simon Berger
Peter Vieting
Christoph Boeddeker
Ralf Schlüter
Reinhold Haeb-Umbach
235
8
0
21 Jun 2023
SURT 2.0: Advances in Transducer-based Multi-talker Speech Recognition
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Desh Raj
Daniel Povey
Sanjeev Khudanpur
VLM
393
17
0
18 Jun 2023
End-to-End Joint Target and Non-Target Speakers ASR
Interspeech, 2023
Ryo Masumura
Naoki Makishima
Taiga Yamane
Yoshihiko Yamazaki
Saki Mizuno
...
Akihiko Takashima
Satoshi Suzuki
Takafumi Moriya
Nobukatsu Hojo
Atsushi Ando
152
8
0
04 Jun 2023
On Word Error Rate Definitions and their Efficient Computation for Multi-Speaker Speech Recognition Systems
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Thilo von Neumann
Christoph Boeddeker
K. Kinoshita
Marc Delcroix
Reinhold Haeb-Umbach
238
23
0
29 Nov 2022
Simulating realistic speech overlaps improves multi-talker ASR
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Muqiao Yang
Naoyuki Kanda
Xiaofei Wang
Jian Wu
S. Sivasankaran
Zhuo Chen
Jinyu Li
Takuya Yoshioka
347
17
0
27 Oct 2022
VarArray Meets t-SOT: Advancing the State of the Art of Streaming Distant Conversational Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Naoyuki Kanda
Jian Wu
Xiaofei Wang
Zhuo Chen
Jinyu Li
Takuya Yoshioka
347
19
0
12 Sep 2022
Streaming Target-Speaker ASR with Neural Transducer
Interspeech, 2022
Takafumi Moriya
Hiroshi Sato
Tsubasa Ochiai
Marc Delcroix
T. Shinozaki
378
27
0
09 Sep 2022
Comparison and Analysis of New Curriculum Criteria for End-to-End ASR
Interspeech, 2022
Georgios Karakasidis
Tamás Grósz
M. Kurimo
169
4
0
10 Aug 2022
Tandem Multitask Training of Speaker Diarisation and Speech Recognition for Meeting Transcription
Interspeech, 2022
Xianrui Zheng
Chao Zhang
P. Woodland
182
19
0
08 Jul 2022
Separator-Transducer-Segmenter: Streaming Recognition and Segmentation of Multi-party Speech
Interspeech, 2022
Ilya Sklyar
A. Piunova
Christian Osendorfer
164
6
0
10 May 2022
The RoyalFlush System of Speech Recognition for M2MeT Challenge
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Shuaishuai Ye
Peiyao Wang
Shunfei Chen
Xinhui Hu
Xinkang Xu
254
7
0
03 Feb 2022
Streaming Multi-Talker ASR with Token-Level Serialized Output Training
Interspeech, 2022
Naoyuki Kanda
Jian Wu
Yu Wu
Xiong Xiao
Zhong Meng
Xiaofei Wang
Yashesh Gaur
Zhuo Chen
Jinyu Li
Takuya Yoshioka
508
75
0
02 Feb 2022
Endpoint Detection for Streaming End-to-End Multi-talker ASR
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Liang Lu
Jinyu Li
Yifan Gong
274
21
0
24 Jan 2022
Multi-turn RNN-T for streaming recognition of multi-party speech
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Ilya Sklyar
A. Piunova
Xianrui Zheng
Yulan Liu
375
29
0
19 Dec 2021
Recent Advances in End-to-End Automatic Speech Recognition
APSIPA Transactions on Signal and Information Processing (TASIP), 2021
Jinyu Li
VLM
509
443
0
02 Nov 2021
Continuous Streaming Multi-Talker ASR with Dual-path Transducers
Desh Raj
Liang Lu
Zhuo Chen
Yashesh Gaur
Jinyu Li
154
19
0
17 Sep 2021
A Comparative Study of Modular and Joint Approaches for Speaker-Attributed ASR on Monaural Long-Form Audio
Naoyuki Kanda
Xiong Xiao
Jian Wu
Tianyan Zhou
Yashesh Gaur
Xiaofei Wang
Zhong Meng
Zhuo Chen
Takuya Yoshioka
200
17
0
06 Jul 2021
Unified Autoregressive Modeling for Joint End-to-End Multi-Talker Overlapped Speech Recognition and Speaker Attribute Estimation
Ryo Masumura
Daiki Okamura
Naoki Makishima
Mana Ihori
Akihiko Takashima
Tomohiro Tanaka
Shota Orihashi
292
8
0
04 Jul 2021
End-to-End Speaker-Attributed ASR with Transformer
Interspeech, 2021
Naoyuki Kanda
Guoli Ye
Yashesh Gaur
Xiaofei Wang
Zhong Meng
Zhuo Chen
Takuya Yoshioka
260
58
0
05 Apr 2021
Streaming Multi-talker Speech Recognition with Joint Speaker Identification
Interspeech, 2021
Liang Lu
Naoyuki Kanda
Jinyu Li
Jiawei Liu
239
22
0
05 Apr 2021
Large-Scale Pre-Training of End-to-End Multi-Talker ASR for Meeting Transcription with Single Distant Microphone
Interspeech, 2021
Naoyuki Kanda
Guoli Ye
Yu Wu
Yashesh Gaur
Xiaofei Wang
Zhong Meng
Zhuo Chen
Takuya Yoshioka
323
49
0
31 Mar 2021