ResearchTrend.AI
Multi-Source Spatial Knowledge Understanding for Immersive Visual Text-to-Speech

18 October 2024
Shuwei He
Rui Liu
Hong Li

Papers citing "Multi-Source Spatial Knowledge Understanding for Immersive Visual Text-to-Speech"

7 / 7 papers shown
Beyond Video-to-SFX: Video to Audio Synthesis with Environmentally Aware Speech
Xinlei Niu
Jianbo Ma
Dylan Harper-Harris
Xiangyu Zhang
Charles Patrick Martin
Jing Zhang
DiffMVGen
19 Sep 2025
MEAN-RIR: Multi-Modal Environment-Aware Network for Robust Room Impulse Response Estimation
Jiajian Chen
Jiakang Chen
Hang Chen
Qing Wang
Yu Gao
Jun Du
05 Sep 2025
VS-Singer: Vision-Guided Stereo Singing Voice Synthesis with Consistency Schrödinger Bridge
Zijing Zhao
Kai Wang
Hao-Ming Huang
Ying Hu
Liang He
J. Yang
19 Jun 2025
ReverbMiipher: Generative Speech Restoration meets Reverberation Characteristics Controllability
Wataru Nakata
Yuma Koizumi
Shigeki Karita
Robin Scheibler
Haruko Ishikawa
Adriana Guevara-Rukoz
Heiga Zen
M. Bacchiani
08 May 2025
HELPNet: Hierarchical Perturbations Consistency and Entropy-guided Ensemble for Scribble Supervised Medical Image Segmentation
Xiao Zhang
Shaoxuan Wu
Peilin Zhang
Zhuo Jin
Xiaosong Xiong
Qirong Bu
Jingkun Chen
Jun Feng
25 Dec 2024
Intra- and Inter-modal Context Interaction Modeling for Conversational Speech Synthesis
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Zhenqi Jia
Rui Liu
25 Dec 2024
Multi-modal and Multi-scale Spatial Environment Understanding for Immersive Visual Text-to-Speech
AAAI Conference on Artificial Intelligence (AAAI), 2024
Rui Liu
Shuwei He
Yifan Hu
Hong Li
VLM
16 Dec 2024