Structured World Models from Human Videos

21 August 2023
Russell Mendonca
Shikhar Bahl
Deepak Pathak
    LM&Ro
arXiv: 2308.10901

Papers citing "Structured World Models from Human Videos"

50 / 99 papers shown
ManualVLA: A Unified VLA Model for Chain-of-Thought Manual Generation and Robotic Manipulation
Chenyang Gu
Jiaming Liu
Hao Chen
Runzhong Huang
Qingpo Wuwu
...
Ying Li
Renrui Zhang
Peng Jia
Pheng-Ann Heng
Shanghang Zhang
LM&Ro
157
1
0
01 Dec 2025
SmallWorlds: Assessing Dynamics Understanding of World Models in Isolated Environments
Xinyi Li
Zaishuo Xia
Weyl Lu
Chenjie Hao
Yubei Chen
137
0
0
28 Nov 2025
Reinforcing Action Policies by Prophesying
Jiahui Zhang
Ze Huang
Chun Gu
Zipei Ma
Li Zhang
233
1
0
25 Nov 2025
In-N-On: Scaling Egocentric Manipulation with in-the-wild and on-task Data
Xiongyi Cai
Ri-Zhao Qiu
Geng Chen
Lai Wei
Isabella Liu
Tianshu Huang
Xuxin Cheng
Xiaolong Wang
EgoV
356
1
0
19 Nov 2025
Robot Learning from a Physical World Model
Jiageng Mao
Sicheng He
Hao-Ning Wu
Yang You
Shuyang Sun
...
Huizhong Chen
Leonidas Guibas
Vitor Campagnolo Guizilini
Zhengyu Ma
Yue Wang
VGen, PINN
421
0
0
10 Nov 2025
TwinVLA: Data-Efficient Bimanual Manipulation with Twin Single-Arm Vision-Language-Action Models
Hokyun Im
Euijin Jeong
Jianlong Fu
Andrey Kolobov
Youngwoon Lee
84
0
0
07 Nov 2025
Learning Interactive World Model for Object-Centric Reinforcement Learning
Fan Feng
Phillip Lippe
Sara Magliacane
OffRL, OCL
312
0
0
04 Nov 2025
XR-1: Towards Versatile Vision-Language-Action Models via Learning Unified Vision-Motion Representations
Shichao Fan
K. Wu
Zhengping Che
X. Wang
Di Wu
...
M. M. Li
Qingjie Liu
Shanghang Zhang
Min Wan
Yong Dai
247
1
0
04 Nov 2025
Clone Deterministic 3D Worlds
Zaishuo Xia
Yukuan Lu
Xinyi Li
Yifan Xu
Yubei Chen
155
0
0
30 Oct 2025
Scalable Vision-Language-Action Model Pretraining for Robotic Manipulation with Real-Life Human Activity Videos
Qixiu Li
Yu Deng
Yaobo Liang
L. Luo
Lei Zhou
...
Hao Chen
Lily Sun
Dong Chen
J. Yang
B. Guo
129
7
0
24 Oct 2025
MLA: A Multisensory Language-Action Model for Multimodal Understanding and Forecasting in Robotic Manipulation
Zhuoyang Liu
Jiaming Liu
Jiadong Xu
Nuowei Han
Chenyang Gu
...
Kai Chin Hsieh
K. Wu
Zhengping Che
Yong Dai
Shanghang Zhang
LM&Ro
124
4
0
30 Sep 2025
IGFuse: Interactive 3D Gaussian Scene Reconstruction via Multi-Scans Fusion
Wenhao Hu
Zesheng Li
Haonan Zhou
Liu Liu
Xuexiang Wen
Zhizhong Su
Xi Li
Gaoang Wang
3DGS
130
0
0
18 Aug 2025
Visuomotor Grasping with World Models for Surgical Robots
Hongbin Lin
Bin Li
K. W. S. Au
153
1
0
15 Aug 2025
Large Model Empowered Embodied AI: A Survey on Decision-Making and Embodied Learning
Wenlong Liang
Rui Zhou
Yang Ma
Bing Zhang
Songlin Li
Yijia Liao
Ping Kuang
LM&Ro, 3DV, AI4CE
169
8
0
14 Aug 2025
villa-X: Enhancing Latent Action Modeling in Vision-Language-Action Models
Xiaoyu Chen
Hangxing Wei
Pushi Zhang
Chuheng Zhang
Kaixin Wang
...
Yucen Wang
Xinquan Xiao
Li Zhao
Jianyu Chen
Jiang Bian
LM&Ro
361
15
0
31 Jul 2025
GR-3 Technical Report
Chilam Cheang
S. Chen
Zhongren Cui
Yingdong Hu
Liqun Huang
...
Hongtao Wu
Xin Xiao
Yuyang Xiao
Jiafeng Xu
Yichu Yang
320
46
0
21 Jul 2025
Latent Policy Steering with Embodiment-Agnostic Pretrained World Models
Yiqi Wang
Mrinal Verghese
Jeff Schneider
251
4
0
17 Jul 2025
A Survey: Learning Embodied Intelligence from Physical Simulators and World Models
Xiaoxiao Long
Qingrui Zhao
Kaiwen Zhang
Zihao Zhang
Dingrui Wang
...
Jia Pan
Qiu Shen
Ruigang Yang
X. Cao
Qionghai Dai
LM&Ro, AI4CE
301
19
0
01 Jul 2025
Goal-VLA: Image-Generative VLMs as Object-Centric World Models Empowering Zero-shot Robot Manipulation
Haonan Chen
Jingxiang Guo
Bangjun Wang
Tianrui Zhang
Xuchuan Huang
Boren Zheng
Yiwen Hou
Chenrui Tie
Jiajun Deng
Lin Shao
VGen, LM&Ro, SyDa
174
2
0
30 Jun 2025
SafeMimic: Towards Safe and Autonomous Human-to-Robot Imitation for Mobile Manipulation
Arpit Bahety
Arnav Balaji
Ben Abbatematteo
Roberto Martín-Martín
145
3
0
18 Jun 2025
WorldPrediction: A Benchmark for High-level World Modeling and Long-horizon Procedural Planning
Delong Chen
Willy Chung
Yejin Bang
Ziwei Ji
Pascale Fung
VGen, LM&Ro
250
6
0
04 Jun 2025
What Do Latent Action Models Actually Learn?
International Conference on Learning Representations (ICLR), 2024
Chuheng Zhang
Tim Pearce
Pushi Zhang
Kaixin Wang
Xiaoyu Chen
Wei Shen
Li Zhao
Jiang Bian
174
7
0
27 May 2025
OSVI-WM: One-Shot Visual Imitation for Unseen Tasks using World-Model-Guided Trajectory Generation
Raktim Gautam Goswami
Prashanth Krishnamurthy
Yann LeCun
Farshad Khorrami
VGen, OffRL
268
5
0
26 May 2025
WorldEval: World Model as Real-World Robot Policies Evaluator
Yaxuan Li
Yichen Zhu
Junjie Wen
Chaomin Shen
Yi Xu
OffRL, VGen
194
0
0
25 May 2025
Imagine Beyond! Distributionally Robust Auto-Encoding for State Space Coverage in Online Reinforcement Learning
Nicolas Castanet
Olivier Sigaud
Sylvain Lamprier
OffRL
443
0
0
23 May 2025
TeleOpBench: A Simulator-Centric Benchmark for Dual-Arm Dexterous Teleoperation
Hangyu Li
Qin Zhao
Haoran Xu
Xinyu Jiang
Qingwei Ben
...
Jia Zeng
Hanqing Wang
Bo Dai
Junting Dong
Jiangmiao Pang
467
4
0
19 May 2025
ReWiND: Language-Guided Rewards Teach Robot Policies without New Demonstrations
Jiahui Zhang
Yusen Luo
Abrar Anwar
Sumedh Anand Sontakke
Joseph J Lim
Jesse Thomason
Erdem Biyik
Jesse Zhang
OffRL, LM&Ro
420
18
0
16 May 2025
UniVLA: Learning to Act Anywhere with Task-centric Latent Actions
Robotics (RAS), 2025
Qingwen Bu
Yanting Yang
Jisong Cai
Shenyuan Gao
Guanghui Ren
Maoqing Yao
Ping Luo
Hongyang Li
889
102
0
09 May 2025
PIN-WM: Learning Physics-INformed World Models for Non-Prehensile Manipulation
Wenxuan Li
Hang Zhao
Zhiyuan Yu
Yu Du
Qin Zou
Ruizhen Hu
K. Xu
SSL
412
7
0
23 Apr 2025
Novel Diffusion Models for Multimodal 3D Hand Trajectory Prediction
Junyi Ma
Wentao Bao
Jingyi Xu
Guanzhong Sun
Xieyuanli Chen
Hesheng Wang
233
4
0
10 Apr 2025
ZeroMimic: Distilling Robotic Manipulation Skills from Web Videos
IEEE International Conference on Robotics and Automation (ICRA), 2025
Junyao Shi
Zhuolun Zhao
Tianyou Wang
Ian Pedroza
Amy Luo
Jie Wang
Jason Ma
Dinesh Jayaraman
LM&Ro
283
13
0
31 Mar 2025
PartRM: Modeling Part-Level Dynamics with Large Cross-State Reconstruction Model
Computer Vision and Pattern Recognition (CVPR), 2025
Mingju Gao
Yike Pan
Huan-ang Gao
Zongzheng Zhang
Wenyi Li
Hao Dong
Hao Tang
Li Yi
Hao Zhao
VGen
255
6
0
25 Mar 2025
AdaWorld: Learning Adaptable World Models with Latent Actions
Shenyuan Gao
Siyuan Zhou
Yilun Du
Jun Zhang
Chuang Gan
VGen
555
35
0
24 Mar 2025
HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model
Jiaming Liu
Hao Chen
Pengju An
Zhuoyang Liu
Renrui Zhang
...
Chengkai Hou
Mengdi Zhao
KC alex Zhou
Pheng-Ann Heng
Shanghang Zhang
622
95
0
13 Mar 2025
LuciBot: Automated Robot Policy Learning from Generated Videos
Xiaowen Qiu
Yian Wang
Jiting Cai
Zhehuan Chen
Chunru Lin
Tsun-Hsuan Wang
Chuang Gan
LM&Ro, VGen
316
2
0
12 Mar 2025
Other Vehicle Trajectories Are Also Needed: A Driving World Model Unifies Ego-Other Vehicle Trajectories in Video Latent Space
Jian Zhu
Zhengyu Jia
Tian Gao
Jiaxin Deng
Shidi Li
Han Li
Fu Liu
Xianpeng Lang
Xiaolong Sun
VGen
966
4
0
12 Mar 2025
Cross-Embodiment Robotic Manipulation Synthesis via Guided Demonstrations through CycleVAE and Human Behavior Transformer
Apan Dastider
Hao Fang
Mingjie Lin
166
0
0
11 Mar 2025
Toward Stable World Models: Measuring and Addressing World Instability in Generative Environments
Soonwoo Kwon
Jin-Young Kim
Hyojun Go
Kyungjune Baek
272
2
0
11 Mar 2025
Four Principles for Physically Interpretable World Models
Jordan Peper
Zhenjiang Mao
Yuang Geng
Siyuan Pan
Ivan Ruchkin
427
5
0
04 Mar 2025
Exo-ViHa: A Cross-Platform Exoskeleton System with Visual and Haptic Feedback for Efficient Dexterous Skill Learning
Xintao Chao
Shilong Mu
Yushan Liu
Shoujie Li
Chuqiao Lyu
Xiao-Ping Zhang
Wenbo Ding
299
1
0
03 Mar 2025
Magma: A Foundation Model for Multimodal AI Agents
Computer Vision and Pattern Recognition (CVPR), 2025
Jianwei Yang
Reuben Tan
Qianhui Wu
Ruijie Zheng
Baolin Peng
...
Seonghyeon Ye
Joel Jang
Yuquan Deng
Lars Liden
Jianfeng Gao
VLM, AI4TS
355
93
0
18 Feb 2025
Learning from Massive Human Videos for Universal Humanoid Pose Control
Jiageng Mao
Siheng Zhao
Siqi Song
Tianheng Shi
Junjie Ye
Mingtong Zhang
Haoran Geng
Jitendra Malik
Vitor Campagnolo Guizilini
Yue Wang
338
23
0
18 Dec 2024
Reinforcement Learning from Wild Animal Videos
Elliot Chane-Sane
Constant Roux
O. Stasse
Nicolas Mansard
952
1
0
05 Dec 2024
CogACT: A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation
Qixiu Li
Yaobo Liang
Zeyu Wang
Lin Luo
Xi Chen
...
Jianmin Bao
Dong Chen
Yuanchun Shi
Jiaolong Yang
B. Guo
LM&Ro
354
187
0
29 Nov 2024
Understanding World or Predicting Future? A Comprehensive Survey of World Models
ACM Computing Surveys (ACM CSUR), 2024
Jingtao Ding
Yunke Zhang
Yu Shang
Yuheng Zhang
Zefang Zong
...
Fengli Xu
Yong Li
Chen Gao
VGen, SyDa
517
17
0
21 Nov 2024
Grounding Video Models to Actions through Goal Conditioned Exploration
International Conference on Learning Representations (ICLR), 2024
Yunhao Luo
Yilun Du
LM&Ro, VGen
405
21
0
11 Nov 2024
Autoregressive Models in Vision: A Survey
Jing Xiong
Gongye Liu
Lun Huang
Chengyue Wu
Taiqiang Wu
...
Hao Fei
Guillermo Sapiro
Jiebo Luo
Ping Luo
Ngai Wong
VGen
489
38
0
08 Nov 2024
DINO-WM: World Models on Pre-trained Visual Features enable Zero-shot Planning
G. Zhou
Hengkai Pan
Yann LeCun
Lerrel Pinto
VGen, LM&Ro, OffRL
375
103
0
07 Nov 2024
STEER: Flexible Robotic Manipulation via Dense Language Grounding
IEEE International Conference on Robotics and Automation (ICRA), 2024
Laura Smith
A. Irpan
Montserrat Gonzalez Arenas
Sean Kirmani
Dmitry Kalashnikov
Dhruv Shah
Ted Xiao
LLMSV
295
7
0
05 Nov 2024
Pre-trained Visual Dynamics Representations for Efficient Policy Learning
European Conference on Computer Vision (ECCV), 2024
Hao Luo
Bohan Zhou
Zongqing Lu
267
4
0
05 Nov 2024