Omni-View: Unlocking How Generation Facilitates Understanding in Unified 3D Model Based on Multiview Images

10 November 2025
Jiakui Hu
Shanshan Zhao
Qing-Guo Chen
Xuerui Qiu
Jialun Liu
Zhao Xu
Weihua Luo
Kaifu Zhang
Yanye Lu
Main: 10 pages · 1 figure · 7 tables · Bibliography: 3 pages · Appendix: 4 pages
Abstract

This paper presents Omni-View, which extends unified multimodal understanding and generation to 3D scenes based on multiview images, exploring the principle that "generation facilitates understanding". Omni-View consists of an understanding model, a texture module, and a geometry module; it jointly models scene understanding, novel view synthesis, and geometry estimation, enabling synergistic interaction between 3D scene understanding and generation tasks. By design, it leverages the spatiotemporal modeling capabilities of the texture module, which is responsible for appearance synthesis, alongside the explicit geometric constraints provided by the dedicated geometry module, thereby enriching the model's holistic understanding of 3D scenes. Trained with a two-stage strategy, Omni-View achieves a state-of-the-art score of 55.4 on the VSI-Bench benchmark, outperforming existing specialized 3D understanding models, while simultaneously delivering strong performance in both novel view synthesis and 3D scene generation.
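
The abstract describes a three-part design: an understanding model plus a texture module (appearance / novel view synthesis) and a geometry module (explicit geometric constraints), all operating on multiview images. As a rough illustration only, below is a minimal PyTorch sketch of how three such branches might share one fused multiview representation; every class name, head, and dimension here is a hypothetical stand-in, not the authors' released architecture.

```python
# Hypothetical sketch of a three-branch unified 3D model, loosely following the
# abstract's description. All names and shapes are illustrative assumptions.
import torch
import torch.nn as nn

class UnifiedMultiviewSketch(nn.Module):
    def __init__(self, dim=768, nhead=8, layers=4):
        super().__init__()
        # "Understanding model" stand-in: fuses tokens from all views
        # into a single scene-level representation.
        self.understanding = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=dim, nhead=nhead, batch_first=True),
            num_layers=layers,
        )
        # "Texture module" stand-in: predicts per-token RGB for appearance synthesis.
        self.texture_head = nn.Linear(dim, 3)
        # "Geometry module" stand-in: predicts per-token depth as an explicit 3D constraint.
        self.geometry_head = nn.Linear(dim, 1)

    def forward(self, view_tokens):
        # view_tokens: (batch, num_views * tokens_per_view, dim) multiview image features
        fused = self.understanding(view_tokens)
        rgb = self.texture_head(fused)      # appearance branch (novel view synthesis)
        depth = self.geometry_head(fused)   # geometry branch (geometry estimation)
        return fused, rgb, depth            # fused features also serve understanding tasks

tokens = torch.randn(2, 8 * 16, 768)        # 2 scenes, 8 views, 16 tokens per view
fused, rgb, depth = UnifiedMultiviewSketch()(tokens)
print(fused.shape, rgb.shape, depth.shape)
```

The sketch captures only the shared-representation idea the abstract emphasizes: appearance and geometry supervision both flow back into the same fused features that the understanding tasks consume, which is how generation can plausibly facilitate understanding.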
