ResearchTrend.AI
Separate in the Speech Chain: Cross-Modal Conditional Audio-Visual Target Speech Extraction

19 April 2024
Zhaoxi Mu
Xinyu Yang
arXiv: 2404.12725

Papers citing "Separate in the Speech Chain: Cross-Modal Conditional Audio-Visual Target Speech Extraction"

7 / 7 papers shown
SepALM: Audio Language Models Are Error Correctors for Robust Speech Separation
Zhaoxi Mu
Xinyu Yang
Gang Wang
AuLLM
KELM
VLM
53
0
0
06 May 2025
CoGenAV: Versatile Audio-Visual Representation Learning via Contrastive-Generative Synchronization
Detao Bai
Zhiheng Ma
Xihan Wei
Liefeng Bo
40
0
0
06 May 2025
Distance Based Single-Channel Target Speech Extraction
Runwu Shi
Benjamin Yen
Kazuhiro Nakadai
23
0
0
31 Dec 2024
Cross-attention Inspired Selective State Space Models for Target Sound Extraction
Donghang Wu
Yiwen Wang
Xihong Wu
T. Qu
Mamba
26
3
0
07 Sep 2024
AV-CrossNet: an Audiovisual Complex Spectral Mapping Network for Speech Separation By Leveraging Narrow- and Cross-Band Modeling
Vahid Ahmadi Kalkhorani
Cheng Yu
Anurag Kumar
Ke Tan
Buye Xu
DeLiang Wang
29
0
0
17 Jun 2024
VisualVoice: Audio-Visual Speech Separation with Cross-Modal Consistency
Ruohan Gao
Kristen Grauman
CVBM
185
196
0
08 Jan 2021
VoxCeleb2: Deep Speaker Recognition
Joon Son Chung
Arsha Nagrani
Andrew Zisserman
214
1,954
0
14 Jun 2018