UniVLA: Learning to Act Anywhere with Task-centric Latent Actions. Robotics (RAS), 2025.
Solving New Tasks by Adapting Internet Video Knowledge. International Conference on Learning Representations (ICLR), 2025.
CoT-VLA: Visual Chain-of-Thought Reasoning for Vision-Language-Action Models. Computer Vision and Pattern Recognition (CVPR), 2025.
Latent Action Pretraining from Videos. International Conference on Learning Representations (ICLR), 2024.
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer. International Conference on Learning Representations (ICLR), 2024. Zhuoyi Yang, Jiayan Teng, Wendi Zheng, Ming Ding, Shiyu Huang, ..., Weihan Wang, Yean Cheng, Xiaotao Gu, Yuxiao Dong, Jie Tang.