From Faces to Voices: Learning Hierarchical Representations for High-quality Video-to-Speech

arXiv: 2503.16956

21 March 2025
Ji-Hoon Kim
Jeongsoo Choi
Jaehun Kim
Chaeyoung Jung
Joon Son Chung
    CVBM

Papers citing "From Faces to Voices: Learning Hierarchical Representations for High-quality Video-to-Speech"

1 / 1 papers shown
AlignDiT: Multimodal Aligned Diffusion Transformer for Synchronized Speech Generation
J. Choi
Ji-Hoon Kim
Kim Sung-Bin
Tae-Hyun Oh
Joon Son Chung
DiffM
48
0
0
29 Apr 2025