arXiv: 2410.14101

Multi-Source Spatial Knowledge Understanding for Immersive Visual Text-to-Speech

18 October 2024


Papers citing "Multi-Source Spatial Knowledge Understanding for Immersive Visual Text-to-Speech"

7 / 7 papers shown

Beyond Video-to-SFX: Video to Audio Synthesis with Environmentally Aware Speech

Dylan Harper-Harris, Charles Patrick Martin

146

0

0

19 Sep 2025

MEAN-RIR: Multi-Modal Environment-Aware Network for Robust Room Impulse Response Estimation

136

2

0

05 Sep 2025

VS-Singer: Vision-Guided Stereo Singing Voice Synthesis with Consistency Schrödinger Bridge

205

0

0

19 Jun 2025

ReverbMiipher: Generative Speech Restoration meets Reverberation Characteristics Controllability

Robin Scheibler, Haruko Ishikawa, Adriana Guevara-Rukoz

481

2

0

08 May 2025

HELPNet: Hierarchical Perturbations Consistency and Entropy-guided Ensemble for Scribble Supervised Medical Image Segmentation

310

9

0

25 Dec 2024

Intra- and Inter-modal Context Interaction Modeling for Conversational Speech Synthesis

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

215

6

0

25 Dec 2024

Multi-modal and Multi-scale Spatial Environment Understanding for Immersive Visual Text-to-Speech

AAAI Conference on Artificial Intelligence (AAAI), 2024

513

8

0

16 Dec 2024
