JELLY: Joint Emotion Recognition and Context Reasoning with LLMs for Conversational Speech Synthesis. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
ZSVC: Zero-shot Style Voice Conversion with Disentangled Latent Diffusion Models and Adversarial Training. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
EmoSphere++: Emotion-Controllable Zero-Shot Text-to-Speech via Emotion-Adaptive Spherical Vector. IEEE Transactions on Affective Computing (IEEE Trans. Affective Comput.), 2024