2509.07969
Mini-o3: Scaling Up Reasoning Patterns and Interaction Turns for Visual Search
9 September 2025
Xin Lai
Junyi Li
Wei Li
Tao Liu
Tianjian Li
Hengshuang Zhao
LRM
VLM
ArXiv (abs)
PDF
HTML
HuggingFace (57 upvotes)
Github (33★)
Papers citing "Mini-o3: Scaling Up Reasoning Patterns and Interaction Turns for Visual Search"
14 / 14 papers shown
ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning
Shengyuan Ding
Xinyu Fang
Ziyu Liu
Yuhang Zang
Yuhang Cao
...
Jianze Liang
Bin Wang
Conghui He
Dahua Lin
Jiaqi Wang
LRM
194
0
0
04 Dec 2025
AdaptVision: Efficient Vision-Language Models via Adaptive Visual Acquisition
Zichuan Lin
Y. Liu
Yang Yang
Lvfang Tao
Deheng Ye
VLM
99
0
0
03 Dec 2025
Thinking with Programming Vision: Towards a Unified View for Thinking with Images
Zirun Guo
Minjie Hong
Feng Zhang
Kai Jia
Tao Jin
OffRL
LRM
VLM
204
0
0
03 Dec 2025
JarvisEvo: Towards a Self-Evolving Photo Editing Agent with Synergistic Editor-Evaluator Optimization
Yunlong Lin
Linqing Wang
Kunjie Lin
Zixu Lin
Kaixiong Gong
...
Yuyang Peng
Wenxun Dai
Xinghao Ding
C. Wang
Qinglin Lu
244
0
0
28 Nov 2025
Qwen3-VL Technical Report
Shuai Bai
Yuxuan Cai
Ruizhe Chen
Keqin Chen
Xionghui Chen
...
Jingren Zhou
F. I. S. Kevin Zhou
J. Zhou
Yuanzhi Zhu
Ke Zhu
VLM
1.6K
54
0
26 Nov 2025
Boosting Reasoning in Large Multimodal Models via Activation Replay
Yun Xing
Xiaobin Hu
Qingdong He
Jiangning Zhang
Shuicheng Yan
Shijian Lu
Yu-Gang Jiang
OffRL
LRM
237
1
0
25 Nov 2025
The Image as Its Own Reward: Reinforcement Learning with Adversarial Reward for Image Generation
Weijia Mao
Hao Chen
Zhenheng Yang
Mike Zheng Shou
EGVM
272
0
0
25 Nov 2025
Thinking in 360°: Humanoid Visual Search in the Wild
Heyang Yu
Yinan Han
Xiangyu Zhang
B. Yin
Bowen Chang
...
Jing Zhang
Marco Pavone
Chen Feng
Saining Xie
Yiming Li
VGen
334
1
0
25 Nov 2025
GeoVista: Web-Augmented Agentic Visual Reasoning for Geolocalization
Y. Wang
Z. Liu
Ziyi Wang
Pengfei Liu
Han Hu
Yongming Rao
LRM
401
0
0
19 Nov 2025
DeepEyesV2: Toward Agentic Multimodal Model
IEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2025
Jack Hong
Chenxiao Zhao
ChengLin Zhu
Weiheng Lu
Guohai Xu
Xing Yu
130
5
0
07 Nov 2025
TIR-Bench: A Comprehensive Benchmark for Agentic Thinking-with-Images Reasoning
Ming Li
Jike Zhong
Shitian Zhao
H. Zhang
Shaoheng Lin
Yuxiang Lai
Chen Wei
Konstantinos Psounis
Kaipeng Zhang
EGVM
LRM
VLM
462
3
0
03 Nov 2025
ViPER: Empowering the Self-Evolution of Visual Perception Abilities in Vision-Language Model
J. Zhang
Song Jin
Chuanqi Cheng
Yuhan Liu
Yankai Lin
...
Yufei Zhang
F. Jiang
G. Yin
Wei Lin
Rui Yan
VLM
213
4
0
28 Oct 2025
A Survey on Agentic Multimodal Large Language Models
Huanjin Yao
Ruifei Zhang
Jiaxing Huang
Jingyi Zhang
Yibo Wang
...
Ruolin Zhu
Yongcheng Jing
Shunyu Liu
Guanbin Li
Dacheng Tao
LM&Ro
AIFin
AI4TS
LRM
AI4CE
250
5
0
13 Oct 2025
Pathology-CoT: Learning Visual Chain-of-Thought Agent from Expert Whole Slide Image Diagnosis Behavior
Sheng Wang
Ruiming Wu
Charles Herndon
Yihang Liu
Shunsuke Koga
Jeanne Shen
Zhi Huang
159
4
0
06 Oct 2025