ResearchTrend.AI


RoboTransfer: Geometry-Consistent Video Diffusion for Robotic Visual Policy Transfer

29 May 2025
Liu Liu
Xiaofeng Wang
Guosheng Zhao
Keyu Li
Wenkang Qin
Jiaxiong Qiu
Zheng Hua Zhu
Guan Huang
Zhizhong Su
Main: 12 pages · 15 figures · Bibliography: 4 pages · 4 tables
Abstract

Imitation learning has become a fundamental approach in robotic manipulation. However, collecting large-scale real-world robot demonstrations is prohibitively expensive, and while simulators offer a cost-effective alternative, the sim-to-real gap makes them extremely challenging to scale. Therefore, we introduce RoboTransfer, a diffusion-based video generation framework for robotic data synthesis. Unlike previous methods, RoboTransfer integrates multi-view geometry with explicit control over scene components, such as background and object attributes. By incorporating cross-view feature interactions and global depth/normal conditions, RoboTransfer ensures geometric consistency across views. The framework supports fine-grained control, including background edits and object swaps. Experiments demonstrate that RoboTransfer generates multi-view videos with enhanced geometric consistency and visual fidelity. Moreover, policies trained on data generated by RoboTransfer achieve a 33.3% relative improvement in success rate in the DIFF-OBJ setting and a substantial 251% relative improvement in the more challenging DIFF-ALL scenario. Explore more demos on our project page: this https URL
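The abstract mentions two mechanisms: cross-view feature interactions and conditioning on global depth/normal maps. The paper's actual architecture is not detailed here, so the following is only a minimal illustrative sketch of those two ideas in NumPy; the function names, toy tensor shapes, and channel layout are all hypothetical and do not come from the paper.

```python
import numpy as np

def cross_view_attention(feats):
    """Toy cross-view interaction: tokens from all views attend to each other.

    feats: array of shape (V, N, D) — V views, N tokens per view, D channels.
    Flattening the view axis lets attention mix information across views,
    which is one simple way to encourage multi-view consistency.
    """
    V, N, D = feats.shape
    tokens = feats.reshape(V * N, D)
    scores = tokens @ tokens.T / np.sqrt(D)           # scaled dot-product scores
    scores -= scores.max(axis=-1, keepdims=True)      # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over all tokens
    return (weights @ tokens).reshape(V, N, D)

def condition_inputs(rgb, depth, normal):
    """Toy geometry conditioning: concatenate depth/normal as extra channels.

    rgb:    (V, H, W, 3), depth: (V, H, W), normal: (V, H, W, 3)
    Returns (V, H, W, 7) — the kind of channel-wise conditioning a
    diffusion denoiser could consume alongside the noisy frames.
    """
    return np.concatenate([rgb, depth[..., None], normal], axis=-1)
```

In a real geometry-conditioned video diffusion model these pieces would sit inside the denoising network and operate on latent features rather than raw pixels; this sketch only shows the data flow of the two conditioning ideas.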
