ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1909.08103
  4. Cited By
Simultaneous Speech Recognition and Speaker Diarization for Monaural
  Dialogue Recordings with Target-Speaker Acoustic Models

Simultaneous Speech Recognition and Speaker Diarization for Monaural Dialogue Recordings with Target-Speaker Acoustic Models

17 September 2019
Naoyuki Kanda
Shota Horiguchi
Yusuke Fujita
Yawen Xue
Kenji Nagamatsu
Shinji Watanabe
ArXiv (abs)PDFHTML

Papers citing "Simultaneous Speech Recognition and Speaker Diarization for Monaural Dialogue Recordings with Target-Speaker Acoustic Models"

14 / 14 papers shown
Title
Guided Speaker Embedding
Guided Speaker Embedding
Shota Horiguchi
Takafumi Moriya
Atsushi Ando
Takanori Ashihara
Hiroshi Sato
Naohiro Tawara
Marc Delcroix
124
1
0
03 Jan 2025
VarArray Meets t-SOT: Advancing the State of the Art of Streaming
  Distant Conversational Speech Recognition
VarArray Meets t-SOT: Advancing the State of the Art of Streaming Distant Conversational Speech Recognition
Naoyuki Kanda
Jian Wu
Xiaofei Wang
Zhuo Chen
Jinyu Li
Takuya Yoshioka
88
18
0
12 Sep 2022
Improving the Naturalness of Simulated Conversations for End-to-End
  Neural Diarization
Improving the Naturalness of Simulated Conversations for End-to-End Neural Diarization
Natsuo Yamashita
Shota Horiguchi
Takeshi Homma
74
18
0
24 Apr 2022
Graph-PIT: Generalized permutation invariant training for continuous
  separation of arbitrary numbers of speakers
Graph-PIT: Generalized permutation invariant training for continuous separation of arbitrary numbers of speakers
Thilo von Neumann
K. Kinoshita
Christoph Boeddeker
Marc Delcroix
Reinhold Haeb-Umbach
65
23
0
30 Jul 2021
A Comparative Study of Modular and Joint Approaches for
  Speaker-Attributed ASR on Monaural Long-Form Audio
A Comparative Study of Modular and Joint Approaches for Speaker-Attributed ASR on Monaural Long-Form Audio
Naoyuki Kanda
Xiong Xiao
Jian Wu
Tianyan Zhou
Yashesh Gaur
Xiaofei Wang
Zhong Meng
Zhuo Chen
Takuya Yoshioka
73
14
0
06 Jul 2021
Encoder-Decoder Based Attractors for End-to-End Neural Diarization
Encoder-Decoder Based Attractors for End-to-End Neural Diarization
Shota Horiguchi
Yusuke Fujita
Shinji Watanabe
Yawen Xue
Leibny Paola García-Perera
74
68
0
20 Jun 2021
End-to-end Neural Diarization: From Transformer to Conformer
End-to-end Neural Diarization: From Transformer to Conformer
Yi Y. Liu
Eunjung Han
Chul Lee
A. Stolcke
133
41
0
14 Jun 2021
Semi-Supervised Training with Pseudo-Labeling for End-to-End Neural
  Diarization
Semi-Supervised Training with Pseudo-Labeling for End-to-End Neural Diarization
Yuki Takashima
Yusuke Fujita
Shota Horiguchi
Shinji Watanabe
Paola García
Kenji Nagamatsu
87
15
0
09 Jun 2021
Audio-Visual Speech Separation Using Cross-Modal Correspondence Loss
Audio-Visual Speech Separation Using Cross-Modal Correspondence Loss
Naoki Makishima
Mana Ihori
Akihiko Takashima
Tomohiro Tanaka
Shota Orihashi
Ryo Masumura
54
8
0
02 Mar 2021
A Review of Speaker Diarization: Recent Advances with Deep Learning
A Review of Speaker Diarization: Recent Advances with Deep Learning
Tae Jin Park
Naoyuki Kanda
Dimitrios Dimitriadis
Kyu Jeong Han
Shinji Watanabe
Shrikanth Narayanan
VLM
382
337
0
24 Jan 2021
Hypothesis Stitcher for End-to-End Speaker-attributed ASR on Long-form
  Multi-talker Recordings
Hypothesis Stitcher for End-to-End Speaker-attributed ASR on Long-form Multi-talker Recordings
Xuankai Chang
Naoyuki Kanda
Yashesh Gaur
Xiaofei Wang
Zhong Meng
Takuya Yoshioka
RALM
66
15
0
06 Jan 2021
BW-EDA-EEND: Streaming End-to-End Neural Speaker Diarization for a
  Variable Number of Speakers
BW-EDA-EEND: Streaming End-to-End Neural Speaker Diarization for a Variable Number of Speakers
Eunjung Han
Chul Lee
A. Stolcke
118
42
0
05 Nov 2020
Target-Speaker Voice Activity Detection: a Novel Approach for
  Multi-Speaker Diarization in a Dinner Party Scenario
Target-Speaker Voice Activity Detection: a Novel Approach for Multi-Speaker Diarization in a Dinner Party Scenario
Ivan Medennikov
M. Korenevsky
Tatiana Prisyach
Yuri Y. Khokhlov
Mariya Korenevskaya
...
Anton Mitrofanov
A. Andrusenko
Ivan Podluzhny
A. Laptev
A. Romanenko
61
205
0
14 May 2020
Continuous speech separation: dataset and analysis
Continuous speech separation: dataset and analysis
Zhuo Chen
Takuya Yoshioka
Liang Lu
Tianyan Zhou
Zhong Meng
Yi Luo
Jian Wu
Xiong Xiao
Jinyu Li
109
217
0
30 Jan 2020
1