SepALM: Audio Language Models Are Error Correctors for Robust Speech Separation. International Joint Conference on Artificial Intelligence (IJCAI), 2025
Emilia: A Large-Scale, Extensive, Multilingual, and Diverse Dataset for Speech Generation. IEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2025
AnyEnhance: A Unified Generative Model with Prompt-Guidance and Self-Critic for Voice Enhancement. IEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2025
EmoDubber: Towards High Quality and Emotion Controllable Movie Dubbing. Computer Vision and Pattern Recognition (CVPR), 2024
F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching. Annual Meeting of the Association for Computational Linguistics (ACL), 2024