SynthVerse: A Large-Scale Diverse Synthetic Dataset for Point Tracking

Weiguang Zhao
Haoran Xu
Xingyu Miao
Qin Zhao
Rui Zhang
Kaizhu Huang
Ning Gao
Peizhou Cao
Mingze Sun
Mulin Yu
Tao Lu
Linning Xu
Junting Dong
Jiangmiao Pang
Main: 9 Pages
6 Figures
Bibliography: 3 Pages
6 Tables
Abstract

Point tracking aims to follow visual points through complex motion, occlusion, and viewpoint changes, and has advanced rapidly with modern foundation models. Yet progress toward general point tracking remains constrained by limited high-quality data, as existing datasets often provide insufficient diversity and imperfect trajectory annotations. To address this gap, we introduce SynthVerse, a large-scale, diverse synthetic dataset specifically designed for point tracking. SynthVerse includes several domains and object types missing from existing synthetic datasets, such as animated-film-style content, embodied manipulation, scene navigation, and articulated objects. By covering a broader range of object categories and providing high-quality dynamic motions and interactions, it substantially expands dataset diversity and enables more robust training and evaluation for general point tracking. In addition, we establish a highly diverse point tracking benchmark to systematically evaluate state-of-the-art methods under broader domain shifts. Extensive experiments and analyses demonstrate that training with SynthVerse yields consistent improvements in generalization, and they reveal limitations of existing trackers under diverse settings.
