OmniCharacter: Towards Immersive Role-Playing Agents with Seamless Speech-Language Personality Interaction. Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Investigating and Enhancing Vision-Audio Capability in Omnimodal Large Language Models. Annual Meeting of the Association for Computational Linguistics (ACL), 2025
OneLLM: One Framework to Align All Modalities with Language. Computer Vision and Pattern Recognition (CVPR), 2023
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions. Computer Vision and Pattern Recognition (CVPR), 2024
DEEM: Diffusion Models Serve as the Eyes of Large Language Models for Image Perception. Run Luo, Yunshui Li, Longze Chen, Wanwei He, Ting-En Lin, ..., Zikai Song, Xiaobo Xia, Tongliang Liu, Min Yang, Binyuan Hui