OpenOmni: Advancing Open-Source Omnimodal Large Language Models with Progressive Multimodal Alignment and Real-Time Self-Aware Emotional Speech Synthesis

8 January 2025

Papers citing "OpenOmni: Advancing Open-Source Omnimodal Large Language Models with Progressive Multimodal Alignment and Real-Time Self-Aware Emotional Speech Synthesis"

12 / 12 papers shown

OpenS2S: Advancing Fully Open-Source End-to-End Empathetic Large Speech Language Model

...

177

07 Jul 2025

OmniCharacter: Towards Immersive Role-Playing Agents with Seamless Speech-Language Personality Interaction

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

...

322

26 May 2025

WavReward: Spoken Dialogue Models With Generalist Reward Evaluators

...

412

14 May 2025

VCM: Vision Concept Modeling Based on Implicit Contrastive Learning with Vision-Language Instruction Fine-Tuning

526

28 Apr 2025

Investigating and Enhancing Vision-Audio Capability in Omnimodal Large Language Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

394

27 Feb 2025

Nexus: An Omni-Perceptive And -Interactive Model for Language, Audio, And Vision

...

600

26 Feb 2025

OneLLM: One Framework to Align All Modalities with Language

Computer Vision and Pattern Recognition (CVPR), 2023

577

198

10 Jan 2025

OmniFlatten: An End-to-end GPT Model for Seamless Voice Conversation

...

Zhihao Du

Shiliang Zhang

351

23 Oct 2024

EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions

Computer Vision and Pattern Recognition (CVPR), 2024

Kai Chen

Zhili Liu

...

Jun Yao

447

26 Sep 2024

OmniBench: Towards The Future of Universal Omni-Language Models

...

612

23 Sep 2024

VLMEvalKit: An Open-Source Toolkit for Evaluating Large Multi-Modality Models

...

734

363

16 Jul 2024

DEEM: Diffusion Models Serve as the Eyes of Large Language Models for Image Perception

Run Luo

Yunshui Li

Longze Chen

Wanwei He

Ting-En Lin

...

Xiaobo Xia

Min Yang

483

24 May 2024