ThinkAct: Vision-Language-Action Reasoning via Reinforced Visual Latent Planning

22 July 2025
Chi-Pin Huang, Yueh-Hua Wu, Min-Hung Chen, Yu-Chun Wang, Fu-En Yang
LM&Ro, LRM
ArXiv (abs) · PDF · HTML · HuggingFace (31 upvotes)

Papers citing "ThinkAct: Vision-Language-Action Reasoning via Reinforced Visual Latent Planning"

23 / 23 papers shown

RynnVLA-002: A Unified Vision-Language-Action and World Model
Jun Cen, Siteng Huang, Yuqian Yuan, Kehan Li, Hangjie Yuan, ..., Xin Li, Hao Luo, Fan Wang, Deli Zhao, H. Chen
VGen, SyDa
281 · 0 · 0 · 21 Nov 2025

VLA-Pruner: Temporal-Aware Dual-Level Visual Token Pruning for Efficient Vision-Language-Action Inference
Ziyan Liu, Y. Chen, Hongyi Cai, Tao Lin, Shuo Yang, Zheng Liu, Bo Zhao
VLM
251 · 0 · 0 · 20 Nov 2025

SRPO: Self-Referential Policy Optimization for Vision-Language-Action Models
Senyu Fei, Siyin Wang, Li Ji, Ao Li, Shiduo Zhang, Liming Liu, Jinlong Hou, Jingjing Gong, Xianzhong Zhao, Xipeng Qiu
90 · 0 · 0 · 19 Nov 2025

LongInsightBench: A Comprehensive Benchmark for Evaluating Omni-Modal Models on Human-Centric Long-Video Understanding
Zhaoyang Han, Qihan Lin, Hao Liang, Bowen Chen, Zhou Liu, Wentao Zhang
VLM
171 · 0 · 0 · 20 Oct 2025

X-VLA: Soft-Prompted Transformer as Scalable Cross-Embodiment Vision-Language-Action Model
Jinliang Zheng, Jianxiong Li, Zhihao Wang, Dongxiu Liu, Xirui Kang, ..., Ya-Qin Zhang, Jiangmiao Pang, Jingjing Liu, Tai Wang, Xianyuan Zhan
LM&Ro
212 · 6 · 0 · 11 Oct 2025

Bridge Thinking and Acting: Unleashing Physical Potential of VLM with Generalizable Action Expert
Mingyu Liu, Zheng Huang, Xiaoyi Lin, Huanyi Zheng, Canyu Zhao, Zongze Du, Y. Wang, Haoyi Zhu, Hao Chen, Chunhua Shen
125 · 0 · 0 · 04 Oct 2025

LIBERO-PRO: Towards Robust and Fair Evaluation of Vision-Language-Action Models Beyond Memorization
Xueyang Zhou, Yangming Xu, Guiyao Tie, Yongchao Chen, Guowen Zhang, Duanfeng Chu, Pan Zhou, Lichao Sun
AAML
168 · 6 · 0 · 04 Oct 2025

FailSafe: Reasoning and Recovery from Failures in Vision-Language-Action Models
Zijun Lin, Jiafei Duan, Haoquan Fang, Dieter Fox, Ranjay Krishna, Cheston Tan, Bihan Wen
273 · 1 · 0 · 02 Oct 2025

Hybrid Training for Vision-Language-Action Models
Pietro Mazzaglia, Cansu Sancaktar, Markus Peschl, Daniel Dijkman
LM&Ro, LRM
116 · 1 · 0 · 01 Oct 2025

MoWM: Mixture-of-World-Models for Embodied Planning via Latent-to-Pixel Feature Modulation
Yu Shang, Yangcheng Yu, Xin Zhang, Xin Jin, Haisheng Su, Wei Wu, Yong Li
VGen
147 · 1 · 0 · 26 Sep 2025

UML-CoT: Structured Reasoning and Planning with Unified Modeling Language for Robotic Room Cleaning
Hongyu Chen, Guangrun Wang
LM&Ro, LRM
121 · 0 · 0 · 26 Sep 2025

ThinkFake: Reasoning in Multimodal Large Language Models for AI-Generated Image Detection
Tai-Ming Huang, Wei-Tung Lin, Kai-Lung Hua, Wen-Huang Cheng, Junichi Yamagishi, Jun-Cheng Chen
OffRL, LRM
116 · 3 · 0 · 24 Sep 2025

Pure Vision Language Action (VLA) Models: A Comprehensive Survey
Dapeng Zhang, Jin Sun, Chenghui Hu, Xiaoyan Wu, Zhenlong Yuan, R. Zhou, Fei Shen, Qingguo Zhou
LM&Ro
241 · 15 · 0 · 23 Sep 2025

PEEK: Guiding and Minimal Image Representations for Zero-Shot Generalization of Robot Manipulation Policies
Jesse Zhang, Marius Memmel, Kevin Kim, Dieter Fox, Jesse Thomason, Fabio Ramos, Erdem Bıyık, Abhishek Gupta, Anqi Li
LM&Ro
97 · 1 · 0 · 22 Sep 2025

VLA-Adapter: An Effective Paradigm for Tiny-Scale Vision-Language-Action Model
Yihao Wang, Pengxiang Ding, Lingxiao Li, Can Cui, Zirui Ge, ..., Yifan Tang, Wenhui Wang, Ru Zhang, Jianyi Liu, Donglin Wang
252 · 22 · 0 · 11 Sep 2025

FPC-VLA: A Vision-Language-Action Framework with a Supervisor for Failure Prediction and Correction
Yifan Yang, Zhixiang Duan, Tianshi Xie, Fuyu Cao, Pinxi Shen, ..., Piaopiao Jin, Guokang Sun, Shaoqing Xu, Yangwei You, Jingtai Liu
163 · 3 · 0 · 04 Sep 2025

Planning with Reasoning using Vision Language World Model
Delong Chen, Theo Moutakanni, Willy Chung, Yejin Bang, Ziwei Ji, Allen Bolourchi, Pascale Fung
VGen, VLM
207 · 9 · 0 · 02 Sep 2025

EO-1: Interleaved Vision-Text-Action Pretraining for General Robot Control
Delin Qu, Haoming Song, Qizhi Chen, Zhaoqing Chen, Xianqiang Gao, ..., Maoqing Yao, Haoran Yang, Jiacheng Bao, Jiangwei Zhong, Dong Wang
LM&Ro
294 · 5 · 0 · 28 Aug 2025

FlowVLA: Visual Chain of Thought-based Motion Reasoning for Vision-Language-Action Models
Zhide Zhong, Haodong Yan, Junfeng Li, Xiangchen Liu, Xin Gong, ..., Wenxuan Song, Jiayi Chen, Xinhu Zheng, Hesheng Wang, Haoang Li
LRM, VGen
196 · 3 · 0 · 25 Aug 2025

Embodied-R1: Reinforced Embodied Reasoning for General Robotic Manipulation
Yifu Yuan, Haiqin Cui, Yaoting Huang, Yibin Chen, Fei Ni, Zibin Dong, Pengyi Li, Yan Zheng, Jianye Hao
LM&Ro
176 · 14 · 0 · 19 Aug 2025

MolmoAct: Action Reasoning Models that can Reason in Space
Jason Lee, Jiafei Duan, Haoquan Fang, Yuquan Deng, Shuo Liu, ..., Karen Farley, Eli VanderBilt, Ali Farhadi, Dieter Fox, Ranjay Krishna
LM&Ro, LRM
389 · 46 · 0 · 11 Aug 2025

Fast ECoT: Efficient Embodied Chain-of-Thought via Thoughts Reuse
Zhekai Duan, Yuan Zhang, Shikai Geng, Gaowen Liu, Joschka Boedecker, Chris Xiaoxuan Lu
LRM
247 · 9 · 0 · 09 Jun 2025

Continual Learning for Multiple Modalities
Hyundong Jin, Eunwoo Kim
CLL
422 · 0 · 0 · 11 Mar 2025