arXiv:2011.11671
Streaming Multi-speaker ASR with RNN-T
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
23 November 2020
Ilya Sklyar
A. Piunova
Yulan Liu
Papers citing "Streaming Multi-speaker ASR with RNN-T"
31 / 31 papers shown
Scaling Multi-Talker ASR with Speaker-Agnostic Activity Streams
Xiluo He
Alexander Polok
Jesus Villalba
Thomas Thebaud
Matthew Maciejewski
134
2
0
04 Oct 2025
SEAL: Speaker Error Correction using Acoustic-conditioned Large Language Models
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Anurag Kumar
Rohit Paturi
Amber Afshan
S. Srinivasan
295
2
0
14 Jan 2025
Alignment-Free Training for Transducer-based Multi-Talker ASR
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Takafumi Moriya
Shota Horiguchi
Marc Delcroix
Ryo Masumura
Takanori Ashihara
Hiroshi Sato
Kohei Matsuura
Masato Mimura
264
9
0
30 Sep 2024
AG-LSEC: Audio Grounded Lexical Speaker Error Correction
Rohit Paturi
Xiang Li
S. Srinivasan
248
3
0
25 Jun 2024
SA-SOT: Speaker-Aware Serialized Output Training for Multi-Talker ASR
Zhiyun Fan
Linhao Dong
Jun Zhang
Lu Lu
Zejun Ma
236
12
0
04 Mar 2024
Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction
IEEE Transactions on Audio, Speech, and Language Processing (IEEE TASLP), 2024
Minchan Kim
Myeonghun Jeong
Byoung Jin Choi
Semin Kim
Joun Yeop Lee
Nam Soo Kim
AI4TS
208
4
0
03 Jan 2024
End-to-end Multichannel Speaker-Attributed ASR: Speaker Guided Decoder and Input Feature Analysis
Can Cui
Imran A. Sheikh
Mostafa Sadeghi
Emmanuel Vincent
296
5
0
16 Oct 2023
t-SOT FNT: Streaming Multi-talker ASR with Text-only Domain Adaptation Capability
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Jian Wu
Naoyuki Kanda
Takuya Yoshioka
Rui Zhao
Zhuo Chen
Jinyu Li
208
6
0
15 Sep 2023
Conformer-based Target-Speaker Automatic Speech Recognition for Single-Channel Audio
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Yang Zhang
Krishna C. Puvvada
Vitaly Lavrukhin
Boris Ginsburg
193
21
0
09 Aug 2023
Exploring the Integration of Speech Separation and Recognition with Self-Supervised Learning Representation
IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2023
Yoshiki Masuyama
Xuankai Chang
Wangyou Zhang
Samuele Cornell
Zhongqiu Wang
Nobutaka Ono
Y. Qian
Shinji Watanabe
248
8
0
23 Jul 2023
Mixture Encoder for Joint Speech Separation and Recognition
Interspeech (Interspeech), 2023
Simon Berger
Peter Vieting
Christoph Boeddeker
Ralf Schlüter
Reinhold Haeb-Umbach
235
8
0
21 Jun 2023
SURT 2.0: Advances in Transducer-based Multi-talker Speech Recognition
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Desh Raj
Daniel Povey
Sanjeev Khudanpur
VLM
393
17
0
18 Jun 2023
End-to-End Joint Target and Non-Target Speakers ASR
Interspeech (Interspeech), 2023
Ryo Masumura
Naoki Makishima
Taiga Yamane
Yoshihiko Yamazaki
Saki Mizuno
...
Akihiko Takashima
Satoshi Suzuki
Takafumi Moriya
Nobukatsu Hojo
Atsushi Ando
152
8
0
04 Jun 2023
On Word Error Rate Definitions and their Efficient Computation for Multi-Speaker Speech Recognition Systems
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Thilo von Neumann
Christoph Boeddeker
K. Kinoshita
Marc Delcroix
Reinhold Haeb-Umbach
238
23
0
29 Nov 2022
Simulating realistic speech overlaps improves multi-talker ASR
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Muqiao Yang
Naoyuki Kanda
Xiaofei Wang
Jian Wu
S. Sivasankaran
Zhuo Chen
Jinyu Li
Takuya Yoshioka
347
17
0
27 Oct 2022
VarArray Meets t-SOT: Advancing the State of the Art of Streaming Distant Conversational Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Naoyuki Kanda
Jian Wu
Xiaofei Wang
Zhuo Chen
Jinyu Li
Takuya Yoshioka
347
19
0
12 Sep 2022
Streaming Target-Speaker ASR with Neural Transducer
Interspeech (Interspeech), 2022
Takafumi Moriya
Hiroshi Sato
Tsubasa Ochiai
Marc Delcroix
T. Shinozaki
378
27
0
09 Sep 2022
Comparison and Analysis of New Curriculum Criteria for End-to-End ASR
Interspeech (Interspeech), 2022
Georgios Karakasidis
Tamás Grósz
M. Kurimo
169
4
0
10 Aug 2022
Tandem Multitask Training of Speaker Diarisation and Speech Recognition for Meeting Transcription
Interspeech (Interspeech), 2022
Xianrui Zheng
Chuxu Zhang
P. Woodland
182
19
0
08 Jul 2022
Separator-Transducer-Segmenter: Streaming Recognition and Segmentation of Multi-party Speech
Interspeech (Interspeech), 2022
Ilya Sklyar
A. Piunova
Christian Osendorfer
164
6
0
10 May 2022
The RoyalFlush System of Speech Recognition for M2MeT Challenge
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Shuaishuai Ye
Peiyao Wang
Shunfei Chen
Xinhui Hu
Xinkang Xu
254
7
0
03 Feb 2022
Streaming Multi-Talker ASR with Token-Level Serialized Output Training
Interspeech (Interspeech), 2022
Naoyuki Kanda
Jian Wu
Yu Wu
Xiong Xiao
Zhong Meng
Xiaofei Wang
Yashesh Gaur
Zhuo Chen
Jinyu Li
Takuya Yoshioka
508
75
0
02 Feb 2022
Endpoint Detection for Streaming End-to-End Multi-talker ASR
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Liang Lu
Jinyu Li
Yifan Gong
274
21
0
24 Jan 2022
Multi-turn RNN-T for streaming recognition of multi-party speech
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Ilya Sklyar
A. Piunova
Xianrui Zheng
Yulan Liu
375
29
0
19 Dec 2021
Recent Advances in End-to-End Automatic Speech Recognition
APSIPA Transactions on Signal and Information Processing (TASIP), 2021
Jinyu Li
VLM
509
443
0
02 Nov 2021
Continuous Streaming Multi-Talker ASR with Dual-path Transducers
Desh Raj
Liang Lu
Zhuo Chen
Yashesh Gaur
Jinyu Li
154
19
0
17 Sep 2021
A Comparative Study of Modular and Joint Approaches for Speaker-Attributed ASR on Monaural Long-Form Audio
Naoyuki Kanda
Xiong Xiao
Jian Wu
Tianyan Zhou
Yashesh Gaur
Xiaofei Wang
Zhong Meng
Zhuo Chen
Takuya Yoshioka
200
17
0
06 Jul 2021
Unified Autoregressive Modeling for Joint End-to-End Multi-Talker Overlapped Speech Recognition and Speaker Attribute Estimation
Ryo Masumura
Daiki Okamura
Naoki Makishima
Mana Ihori
Akihiko Takashima
Tomohiro Tanaka
Shota Orihashi
292
8
0
04 Jul 2021
End-to-End Speaker-Attributed ASR with Transformer
Interspeech (Interspeech), 2021
Naoyuki Kanda
Guoli Ye
Yashesh Gaur
Xiaofei Wang
Zhong Meng
Zhuo Chen
Takuya Yoshioka
260
58
0
05 Apr 2021
Streaming Multi-talker Speech Recognition with Joint Speaker Identification
Interspeech (Interspeech), 2021
Liang Lu
Naoyuki Kanda
Jinyu Li
Jiawei Liu
239
22
0
05 Apr 2021
Large-Scale Pre-Training of End-to-End Multi-Talker ASR for Meeting Transcription with Single Distant Microphone
Interspeech (Interspeech), 2021
Naoyuki Kanda
Guoli Ye
Yu-Huan Wu
Yashesh Gaur
Xiaofei Wang
Zhong Meng
Zhuo Chen
Takuya Yoshioka
323
49
0
31 Mar 2021