Learning to Act from Actionless Videos through Dense Correspondences

International Conference on Learning Representations (ICLR), 2023

12 October 2023

Jiayuan Mao

Papers citing "Learning to Act from Actionless Videos through Dense Correspondences"

50 / 69 papers shown

Video2Act: A Dual-System Video Diffusion Policy with Robotic Spatio-Motional Modeling

298

02 Dec 2025

IGen: Scalable Data Generation for Robot Learning from Open-World Images

...

145

01 Dec 2025

TraceGen: World Modeling in 3D Trace Space Enables Learning from Cross-Embodiment Videos

...

107

26 Nov 2025

ViPRA: Video Prediction for Robot Actions

230

11 Nov 2025

Simulating the Visual World with Artificial Intelligence: A Roadmap

464

11 Nov 2025

Robot Learning from a Physical World Model

...

Vitor Campagnolo Guizilini

Zhengyu Ma

Yue Wang

VGen PINN

421

10 Nov 2025

A Step Toward World Models: A Survey on Robotic Manipulation

745

31 Oct 2025

World-in-World: World Models in a Closed-Loop World

...

234

20 Oct 2025

Implicit State Estimation via Video Replanning

120

20 Oct 2025

MoMaps: Semantics-Aware Scene Motion Generation with Motion Maps

121

13 Oct 2025

When a Robot is More Capable than a Human: Learning from Constrained Demonstrators

10 Oct 2025

An approach for systematic decomposition of complex llm tasks

148

09 Oct 2025

Vision-Language-Action Models for Robotics: A Review Towards Real-World Applications

IEEE Access, 2025

261

08 Oct 2025

Luth: Efficient French Specialization for Small Language Models and Cross-Lingual Transfer

Maxence Lasbordes

Sinoué Gad

134

07 Oct 2025

MultiModal Action Conditioned Video Generation

Yichen Li

Antonio Torralba

VGen

184

02 Oct 2025

PoseDiff: A Unified Diffusion Model Bridging Robot Pose Estimation and Video-to-Action Control

176

29 Sep 2025

Robot Learning from Any Images

...

Vitor Campagnolo Guizilini

Yue Wang

168

26 Sep 2025

Pixel Motion Diffusion is What We Need for Robot Control

140

26 Sep 2025

From Watch to Imagine: Steering Long-horizon Manipulation via Human Demonstration and Future Envisionment

186

26 Sep 2025

WoW: Towards a World omniscient World model Through Embodied Interaction

...

164

26 Sep 2025

VLBiMan: Vision-Language Anchored One-Shot Demonstration Enables Generalizable Bimanual Robotic Manipulation

Huayi Zhou

Kui Jia

LM&Ro

191

26 Sep 2025

What Happens Next? Anticipating Future Motion by Generating Point Trajectories

113

25 Sep 2025

Pure Vision Language Action (VLA) Models: A Comprehensive Survey

295

23 Sep 2025

Generative Visual Foresight Meets Task-Agnostic Pose Estimation in Robotic Table-Top Manipulation

192

30 Aug 2025

Learning Primitive Embodied World Models: Towards Scalable Robotic Learning

...

409

28 Aug 2025

Spatial Policy: Guiding Visuomotor Robotic Manipulation with Spatial-Aware Modeling and Reasoning

...

21 Aug 2025

Precise Action-to-Video Generation Through Visual Action Prompts

125

18 Aug 2025

GenFlowRL: Shaping Rewards with Generative Object-Centric Flow in Visual Reinforcement Learning

14 Aug 2025

Boosting Action-Information via a Variational Bottleneck on Unlabelled Robot Videos

Haoyu Zhang

Long Cheng

SSL

105

12 Aug 2025

VLM-SFD: VLM-Assisted Siamese Flow Diffusion Framework for Dual-Arm Cooperative Manipulation

IEEE Robotics and Automation Letters (IEEE RA-L), 2025

140

16 Jun 2025

Self-Adapting Improvement Loops for Robotic Learning

162

07 Jun 2025

3DFlowAction: Learning Cross-Embodiment Manipulation from 3D Flow World Model

387

06 Jun 2025

Learning Video Generation for Robotic Manipulation with Collaborative Trajectory Control

254

02 Jun 2025

Towards a Generalizable Bimanual Foundation Policy via Flow-based Video Prediction

1.1K

30 May 2025

Revisiting Multi-Agent World Modeling from a Diffusion-Inspired Perspective

341

27 May 2025

TeViR: Text-to-Video Reward with Diffusion Models for Efficient Reinforcement Learning

267

26 May 2025

RLVR-World: Training World Models with Reinforcement Learning

496

20 May 2025

DreamGen: Unlocking Generalization in Robot Learning through Video World Models

...

392

19 May 2025

Extracting Visual Plans from Unlabeled Videos via Symbolic Guidance

323

13 May 2025

LaDi-WM: A Latent Diffusion-based World Model for Predictive Manipulation

525

13 May 2025

Pixel Motion as Universal Representation for Robot Control

391

12 May 2025

VISTA: Generative Visual Imagination for Vision-and-Language Navigation

575

09 May 2025

CLAM: Continuous Latent Action Models for Robot Learning from Unlabeled Demonstrations

459

08 May 2025

Solving New Tasks by Adapting Internet Video Knowledge

International Conference on Learning Representations (ICLR), 2025

235

21 Apr 2025

FlowLoss: Dynamic Flow-Conditioned Loss Strategy for Video Diffusion Models

346

20 Apr 2025

Diffusion Models for Robotic Manipulation: A Survey

Frontiers in Robotics and AI (Front. Robot. AI), 2025

512

11 Apr 2025

AdaWorld: Learning Adaptable World Models with Latent Actions

555

24 Mar 2025

Unified Video Action Model

685

28 Feb 2025

Self-Consistent Model-based Adaptation for Visual Reinforcement Learning

International Joint Conference on Artificial Intelligence (IJCAI), 2025

225

17 Feb 2025

VILP: Imitation Learning with Latent Video Planning

IEEE Robotics and Automation Letters (IEEE RA-L), 2025

280

03 Feb 2025