WinT3R: Window-Based Streaming Reconstruction with Camera Token Pool

5 September 2025
Zizun Li
Jianjun Zhou
Yifan Wang
Haoyu Guo
Wenzheng Chang
Y. Zhou
Haoyi Zhu
Junyi Chen
Chunhua Shen
Tong He
    VGen3DV
Main: 9 Pages
5 Figures
Bibliography: 5 Pages
6 Tables
Abstract

We present WinT3R, a feed-forward reconstruction model capable of online prediction of precise camera poses and high-quality point maps. Previous methods suffer from a trade-off between reconstruction quality and real-time performance. To address this, we first introduce a sliding window mechanism that ensures sufficient information exchange among frames within the window, thereby improving the quality of geometric predictions without a large computational cost. In addition, we leverage a compact representation of cameras and maintain a global camera token pool, which enhances the reliability of camera pose estimation without sacrificing efficiency. These designs enable WinT3R to achieve state-of-the-art performance in terms of online reconstruction quality, camera pose estimation, and reconstruction speed, as validated by extensive experiments on diverse datasets. Code and model are publicly available at this https URL.
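
The two ideas named in the abstract, processing a stream in sliding windows and keeping a global pool of compact per-frame camera tokens, can be illustrated with a toy sketch. This is not the WinT3R implementation: the abstract does not state whether windows overlap, what a camera token contains, or how the pool is queried, so the window size, the stride, the `toy_camera_token` stand-in, and every name below are assumptions made purely for illustration.

```python
from collections import deque
from typing import Dict, Iterable, Iterator, List, Tuple


def sliding_windows(frames: Iterable[int], size: int, stride: int) -> Iterator[List[int]]:
    """Yield windows of up to `size` frames from a stream, advancing by `stride`.

    Overlapping windows (stride < size) are an assumption, not a detail
    taken from the abstract.
    """
    if size <= 0 or not 0 < stride <= size:
        raise ValueError("need size > 0 and 0 < stride <= size")
    buf: deque = deque(maxlen=size)
    pending = 0       # frames received since the last emitted window
    emitted = False   # whether the first full window has been emitted
    for frame in frames:
        buf.append(frame)
        pending += 1
        ready = (pending == stride) if emitted else (len(buf) == size)
        if ready:
            yield list(buf)
            emitted = True
            pending = 0
    if pending:
        # Flush trailing frames that did not complete a full stride.
        yield list(buf)


def toy_camera_token(frame_id: int) -> List[float]:
    """Hypothetical stand-in for a learned compact camera representation."""
    return [float(frame_id)]


class CameraTokenPool:
    """Hypothetical global pool holding one compact token per frame seen so far."""

    def __init__(self) -> None:
        self._tokens: Dict[int, List[float]] = {}

    def add(self, frame_id: int, token: List[float]) -> None:
        # Keep the first token stored for a frame; later windows that
        # revisit the same frame do not overwrite it in this sketch.
        self._tokens.setdefault(frame_id, token)

    def __len__(self) -> int:
        return len(self._tokens)


def run_stream(n_frames: int, size: int, stride: int) -> List[Tuple[List[int], int]]:
    """Process a stream window by window and record the pool size after each."""
    pool = CameraTokenPool()
    log = []
    for window in sliding_windows(range(n_frames), size, stride):
        for frame_id in window:
            pool.add(frame_id, toy_camera_token(frame_id))
        log.append((window, len(pool)))
    return log
```

In this sketch, each window costs a bounded amount of work regardless of stream length, while the pool grows by only one small token per frame, which is the kind of quality and efficiency balance the abstract describes in general terms.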
