Visual Program Distillation: Distilling Tools and Programmatic Reasoning into Vision-Language Models
Computer Vision and Pattern Recognition (CVPR), 2023
5 December 2023
Yushi Hu
Otilia Stretcu
Chun-Ta Lu
Krishnamurthy Viswanathan
Kenji Hata
Enming Luo
Ranjay Krishna
Ariel Fuxman
VLM, LRM, MLLM

Papers citing "Visual Program Distillation: Distilling Tools and Programmatic Reasoning into Vision-Language Models"

27 / 27 papers shown
Self-Improving VLM Judges Without Human Annotations
Inna Wanyin Lin
Yushi Hu
Shuyue Stella Li
Scott Geng
Pang Wei Koh
Luke Zettlemoyer
Tim Althoff
Marjan Ghazvininejad
VLM, LRM
51
3
0
02 Dec 2025
LAST: LeArning to Think in Space and Time for Generalist Vision-Language Models
Shuai Wang
D. Zhang
Tianyi Bai
Shitong Shao
Jiebo Luo
Jiaheng Wei
VLM
213
1
0
24 Nov 2025
DuoTeach: Dual Role Self-Teaching for Coarse-to-Fine Decision Coordination in Vision-Language Models
Wei Yang
Yiran Zhu
Zilin Li
Xunjia Zhang
Hongtao Wang
176
0
0
23 Nov 2025
Online In-Context Distillation for Low-Resource Vision Language Models
Zhiqi Kang
Rahaf Aljundi
Vaggelis Dorovatas
Karteek Alahari
VLM
167
1
0
20 Oct 2025
Pursuing Minimal Sufficiency in Spatial Reasoning
Yejie Guo
Yunzhong Hou
Wufei Ma
Meng Tang
Ming-Hsuan Yang
LRM
145
1
0
19 Oct 2025
Video-STAR: Reinforcing Open-Vocabulary Action Recognition with Tools
Zhenlong Yuan
Xiangyan Qu
Chengxuan Qian
Rui Chen
Jing Tang
...
Xiangxiang Chu
Dapeng Zhang
Yiwei Wang
Y. Cai
Shuo Li
VLM, LRM
204
21
0
09 Oct 2025
ExPO-HM: Learning to Explain-then-Detect for Hateful Meme Detection
Jingbiao Mei
Mingsheng Sun
Jinghong Chen
Pengda Qin
Yuhong Li
Da Chen
Bill Byrne
200
3
0
08 Oct 2025
Look Less, Reason More: Rollout-Guided Adaptive Pixel-Space Reasoning
Xuchen Li
Xuzhao Li
Jiahui Gao
Renjie Pi
Shiyu Hu
Wentao Zhang
VLM, LRM
275
8
0
02 Oct 2025
From Perception to Cognition: A Survey of Vision-Language Interactive Reasoning in Multimodal Large Language Models
Chenyue Zhou
Mingxuan Wang
Yanbiao Ma
Chenxu Wu
Wanyi Chen
...
Guoli Jia
Lingling Li
Z. Lu
Y. Lu
Wenhan Luo
LRM
638
14
0
29 Sep 2025
Learning in an Echo Chamber: Online Learning with Replay Adversary
Daniil Dmitriev
Harald Eskelund Franck
Carolin Heinzler
Amartya Sanyal
123
1
0
29 Sep 2025
Reinforced Visual Perception with Tools
Zetong Zhou
Dongping Chen
Zixian Ma
Zhihan Hu
Mingyang Fu
Sinan Wang
Yao Wan
Zhou Zhao
Ranjay Krishna
OffRL, VLM, LRM
202
20
0
01 Sep 2025
Explain Before You Answer: A Survey on Compositional Visual Reasoning
Fucai Ke
Joy Hsu
Zhixi Cai
Zixian Ma
Xin Zheng
...
P. D. Haghighi
Gholamreza Haffari
Ranjay Krishna
Jiajun Wu
H. Rezatofighi
ReLM, CoGe, LRM
419
15
0
24 Aug 2025
Uni-cot: Towards Unified Chain-of-Thought Reasoning Across Text and Vision
Luozheng Qin
Jia Gong
Yuqing Sun
Tianjiao Li
Mengping Yang
Xiaomeng Yang
Chao Qu
Zhiyu Tan
Hao Li
MLLM, LRM
337
0
0
07 Aug 2025
Trade-offs in Image Generation: How Do Different Dimensions Interact?
Sicheng Zhang
Binzhu Xie
Zhonghao Yan
Yuli Zhang
Donghao Zhou
Xiaofei Chen
Shi Qiu
Jiaqi Liu
Guoyang Xie
Zhichao Lu
256
3
0
29 Jul 2025
Augmented Vision-Language Models: A Systematic Review
Anthony C Davis
Burhan Sadiq
Tianmin Shu
Chien-Ming Huang
VLM, LRM
228
0
0
24 Jul 2025
MathOPEval: A Fine-grained Evaluation Benchmark for Visual Operations of MLLMs in Mathematical Reasoning
Xiaoyuan Li
Moxin Li
Wenjie Wang
Rui Men
Yichang Zhang
Fuli Feng
Dayiheng Liu
LRM
310
3
0
24 Jul 2025
Multi-Step Visual Reasoning with Visual Tokens Scaling and Verification
Tianyi Bai
Zengjie Hu
Fupeng Sun
Jiantao Qiu
Yizhen Jiang
Guangxin He
Bohan Zeng
Conghui He
Binhang Yuan
Wentao Zhang
OffRL, LRM
230
17
0
08 Jun 2025
Pixel Reasoner: Incentivizing Pixel-Space Reasoning with Curiosity-Driven Reinforcement Learning
Alex Su
Haozhe Wang
Weiming Ren
Fangzhen Lin
Lei Ma
MLLM, OffRL, LRM, VLM
381
165
0
21 May 2025
MASSV: Multimodal Adaptation and Self-Data Distillation for Speculative Decoding of Vision-Language Models
Mugilan Ganesan
Siyang Song
Ankur Aggarwal
Nish Sinnadurai
Sean Lie
Vithursan Thangarasa
VLM
501
0
0
15 May 2025
Visually Interpretable Subtask Reasoning for Visual Question Answering
Yu Cheng
A. Goel
Hakan Bilen
LRM
286
2
0
12 May 2025
DWIM: Towards Tool-aware Visual Reasoning via Discrepancy-aware Workflow Generation & Instruct-Masking Tuning
Fucai Ke
Vijay Kumar B G
Xingjian Leng
Zhixi Cai
Zaid Khan
Weiqing Wang
P. D. Haghighi
H. Rezatofighi
Manmohan Chandraker
627
7
0
25 Mar 2025
OWLViz: An Open-World Benchmark for Visual Question Answering
T. Nguyen
Dang Nguyen
Hoang Nguyen
Thuan Luong
Long Hoang Dang
Viet Dac Lai
VLM
367
0
0
04 Mar 2025
HoVLE: Unleashing the Power of Monolithic Vision-Language Models with Holistic Vision-Language Embedding
Computer Vision and Pattern Recognition (CVPR), 2024
Chenxin Tao
Shiqian Su
X. Zhu
Chenyu Zhang
Zhe Chen
...
Wenhai Wang
Lewei Lu
Gao Huang
Yu Qiao
Jifeng Dai
MLLM, VLM
598
7
0
20 Dec 2024
VL-RewardBench: A Challenging Benchmark for Vision-Language Generative Reward Models
Computer Vision and Pattern Recognition (CVPR), 2024
Lei Li
Y. X. Wei
Zhihui Xie
Xuqing Yang
Yifan Song
...
Tianyu Liu
Sujian Li
Bill Yuchen Lin
Dianbo Sui
Qiang Liu
VLM, CoGe
657
75
0
26 Nov 2024
CATP-LLM: Empowering Large Language Models for Cost-Aware Tool Planning
Duo Wu
Jiangming Wang
Yuan Meng
Yanning Zhang
Le Sun
Zhi Wang
1.3K
8
0
25 Nov 2024
Leopard: A Vision Language Model For Text-Rich Multi-Image Tasks
Mengzhao Jia
Wenhao Yu
Kaixin Ma
Tianqing Fang
Z. Zhang
Siru Ouyang
Hongming Zhang
Meng Jiang
Dong Yu
VLM
431
16
0
02 Oct 2024
Cantor: Inspiring Multimodal Chain-of-Thought of MLLM
Timin Gao
Peixian Chen
Mengdan Zhang
Chaoyou Fu
Chunjiang Ge
...
Shengchuan Zhang
Xiawu Zheng
Xing Sun
Liujuan Cao
Rongrong Ji
MLLM, LRM
403
59
0
24 Apr 2024