arXiv:2503.16956
From Faces to Voices: Learning Hierarchical Representations for High-quality Video-to-Speech
21 March 2025
Ji-Hoon Kim
Jeongsoo Choi
Jaehun Kim
Chaeyoung Jung
Joon Son Chung
CVBM
Papers citing "From Faces to Voices: Learning Hierarchical Representations for High-quality Video-to-Speech"
1 / 1 papers shown
Title
AlignDiT: Multimodal Aligned Diffusion Transformer for Synchronized Speech Generation
J. Choi
Ji-Hoon Kim
Kim Sung-Bin
Tae-Hyun Oh
Joon Son Chung
DiffM
48
0
0
29 Apr 2025