
Title |
|---|
From Faces to Voices: Learning Hierarchical Representations for High-quality Video-to-Speech. Computer Vision and Pattern Recognition (CVPR), 2025 |
Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think. International Conference on Learning Representations (ICLR), 2024 |
F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching. Annual Meeting of the Association for Computational Linguistics (ACL), 2024 |