Investigation of End-To-End Speaker-Attributed ASR for Continuous
Multi-Talker Recordings

11 August 2020

Takuya Yoshioka

Papers citing "Investigation of End-To-End Speaker-Attributed ASR for Continuous Multi-Talker Recordings"

Title
Survey of End-to-End Multi-Speaker Automatic Speech Recognition for Monaural Audio Xinlu He Jacob Whitehill 21 0 0 16 May 2025
TS-SUPERB: A Target Speech Processing Benchmark for Speech Self-Supervised Learning Models Junyi Peng Takanori Ashihara Marc Delcroix Tsubasa Ochiai Oldrich Plchot Shoko Araki J. Černocký ELM 34 0 0 10 May 2025
MSA-ASR: Efficient Multilingual Speaker Attribution with frozen ASR Models Thai-Binh Nguyen Alexander Waibel 84 1 0 27 Nov 2024
Joint Speaker Features Learning for Audio-visual Multichannel Speech Separation and Recognition Guinan Li Jiajun Deng Youjun Chen Mengzhe Geng Shujie Hu ... Zengrui Jin Tianzi Wang Xurong Xie Helen Meng Xunying Liu VLM 36 0 0 14 Jun 2024
On Speaker Attribution with SURT Desh Raj Matthew Maciejewski Leibny Paola García-Perera Daniel Povey Sanjeev Khudanpur 34 3 0 28 Jan 2024
Improved Long-Form Speech Recognition by Jointly Modeling the Primary and Non-primary Speakers Guru Prakash Arumugam Shuo-yiin Chang Tara N. Sainath Rohit Prabhavalkar Quan Wang Shaan Bijwadia 34 3 0 18 Dec 2023
Summaries, Highlights, and Action items: Design, implementation and evaluation of an LLM-powered meeting recap system Sumit Asthana Sagi Hilleli Pengcheng He Aaron L Halfaker 45 11 0 28 Jul 2023
SURT 2.0: Advances in Transducer-based Multi-talker Speech Recognition Desh Raj Daniel Povey Sanjeev Khudanpur VLM 36 9 0 18 Jun 2023
CASA-ASR: Context-Aware Speaker-Attributed ASR Mohan Shi Zhihao Du Qian Chen Fan Yu Yangze Li Shiliang Zhang Jie Zhang Lirong Dai 36 8 0 21 May 2023
End-to-end multi-talker audio-visual ASR using an active speaker attention module R. Rose Olivier Siohan 26 3 0 01 Apr 2022
Multi-turn RNN-T for streaming recognition of multi-party speech Ilya Sklyar A. Piunova Xianrui Zheng Yulan Liu 26 22 0 19 Dec 2021
Directed Speech Separation for Automatic Speech Recognition of Long Form Conversational Speech Rohit Paturi S. Srinivasan Katrin Kirchhoff Daniel Garcia-Romero 27 9 0 10 Dec 2021
A Comparative Study of Modular and Joint Approaches for Speaker-Attributed ASR on Monaural Long-Form Audio Naoyuki Kanda Xiong Xiao Jian Wu Tianyan Zhou Yashesh Gaur Xiaofei Wang Zhong Meng Zhuo Chen Takuya Yoshioka 24 14 0 06 Jul 2021
A Review of Speaker Diarization: Recent Advances with Deep Learning Tae Jin Park Naoyuki Kanda Dimitrios Dimitriadis Kyu Jeong Han Shinji Watanabe Shrikanth Narayanan VLM 274 328 0 24 Jan 2021
Tackling real noisy reverberant meetings with all-neural source separation, counting, and diarization system K. Kinoshita Marc Delcroix S. Araki Tomohiro Nakatani 197 30 0 09 Mar 2020
Auto-Tuning Spectral Clustering for Speaker Diarization Using Normalized Maximum Eigengap Tae Jin Park Kyu Jeong Han Manoj Kumar Shrikanth Narayanan 128 116 0 05 Mar 2020
End-to-End Neural Speaker Diarization with Self-attention Yusuke Fujita Naoyuki Kanda Shota Horiguchi Yawen Xue Kenji Nagamatsu Shinji Watanabe 190 238 0 13 Sep 2019
End-to-End Neural Speaker Diarization with Permutation-Free Objectives Yusuke Fujita Naoyuki Kanda Shota Horiguchi Kenji Nagamatsu Shinji Watanabe 169 247 0 12 Sep 2019
VoxCeleb2: Deep Speaker Recognition Joon Son Chung Arsha Nagrani Andrew Zisserman 266 2,242 0 14 Jun 2018