Multi-Step Visual Reasoning with Visual Tokens Scaling and Verification

8 June 2025
Tianyi Bai
Zengjie Hu
Fupeng Sun
Jiantao Qiu
Yizhen Jiang
Guangxin He
Bohan Zeng
Conghui He
Binhang Yuan
Wentao Zhang
OffRL, LRM

Papers citing "Multi-Step Visual Reasoning with Visual Tokens Scaling and Verification"

8 / 8 papers shown
LAST: LeArning to Think in Space and Time for Generalist Vision-Language Models
Shuai Wang
D. Zhang
Tianyi Bai
Shitong Shao
Jiebo Luo
Jiaheng Wei
VLM
134
1
0
24 Nov 2025
Octopus: Agentic Multimodal Reasoning with Six-Capability Orchestration
Yifu Guo
Zishan Xu
Zhiyuan Yao
Y. Lu
Jiaye Lin
Sen Hu
Zhenheng Tang
Y. Li
Huacan Wang
LRM
169
0
0
19 Nov 2025
Jarvis: Towards Personalized AI Assistant via Personal KV-Cache Retrieval
Binxiao Xu
Junyu Feng
Ruichuan An
Yulin Luo
Shilin Yan
Hao Liang
Ming Lu
Wentao Zhang
207
0
0
26 Oct 2025
MorphoBench: A Benchmark with Difficulty Adaptive to Model Reasoning
Xukai Wang
Xuanbo Liu
Mingrui Chen
Haitian Zhong
Xuanlin Yang
...
Xu-Yao Zhang
Qiang Liu
Zhouchen Lin
Wentao Zhang
Bin Dong
ELM, LRM
164
1
0
16 Oct 2025
Beyond Seeing: Evaluating Multimodal LLMs on Tool-Enabled Image Perception, Transformation, and Reasoning
Xingang Guo
Utkarsh Tyagi
Advait Gosai
Paula Vergara
Ernesto Gabriel Hernández Montoya
...
Bin Hu
Yunzhong He
Bing Liu
Bing Liu
Rakshith S Srinivasa
VLM, LRM
325
2
0
14 Oct 2025
From Perception to Cognition: A Survey of Vision-Language Interactive Reasoning in Multimodal Large Language Models
Chenyue Zhou
Mingxuan Wang
Yanbiao Ma
Chenxu Wu
Wanyi Chen
...
Guoli Jia
Lingling Li
Z. Lu
Y. Lu
Wenhan Luo
LRM
435
9
0
29 Sep 2025
Simple o3: Towards Interleaved Vision-Language Reasoning
Ye Wang
Qianglong Chen
Zejun Li
Siyuan Wang
Shijie Guo
Zhirui Zhang
Zhongyu Wei
MLLM, LRM, VLM
152
12
0
16 Aug 2025
EditWorld: Simulating World Dynamics for Instruction-Following Image Editing
Ling Yang
Bo-Wen Zeng
Jiaming Liu
Hong Li
Minghao Xu
Wentao Zhang
Shuicheng Yan
DiffM
201
30
0
23 May 2024