ResearchTrend.AI

Integrating Audio, Visual, and Semantic Information for Enhanced Multimodal Speaker Diarization

22 August 2024
Luyao Cheng
Hui Wang
Siqi Zheng
Yafeng Chen
Rongjie Huang
Qinglin Zhang
Qian Chen
Xihao Li

Papers citing "Integrating Audio, Visual, and Semantic Information for Enhanced Multimodal Speaker Diarization"

4 / 4 papers shown
Title
CAM++: A Fast and Efficient Network for Speaker Verification Using Context-Aware Masking
Haibo Wang
Siqi Zheng
Yafeng Chen
Luyao Cheng
Qian Chen
42
69
0
01 Mar 2023
Auto-Tuning Spectral Clustering for Speaker Diarization Using Normalized Maximum Eigengap
Tae Jin Park
Kyu Jeong Han
Manoj Kumar
Shrikanth Narayanan
122
114
0
05 Mar 2020
End-to-End Neural Speaker Diarization with Self-attention
Yusuke Fujita
Naoyuki Kanda
Shota Horiguchi
Yawen Xue
Kenji Nagamatsu
Shinji Watanabe
179
237
0
13 Sep 2019
End-to-End Neural Speaker Diarization with Permutation-Free Objectives
Yusuke Fujita
Naoyuki Kanda
Shota Horiguchi
Kenji Nagamatsu
Shinji Watanabe
155
243
0
12 Sep 2019