Visual Programming: Compositional visual reasoning without training

Computer Vision and Pattern Recognition (CVPR), 2022
18 November 2022
Tanmay Gupta
Aniruddha Kembhavi
ReLM, VLM, LRM

Papers citing "Visual Programming: Compositional visual reasoning without training"

50 / 381 papers shown
CrowdSelect: Synthetic Instruction Data Selection with Multi-LLM Wisdom
Yisen Li
Lingfeng Yang
Wenxuan Shen
Pan Zhou
Yao Wan
Weiwei Lin
Benlin Liu
267
4
0
03 Mar 2025
Program Synthesis Dialog Agents for Interactive Decision-Making
Matthew Toles
Nikhil Balwani
Rattandeep Singh
Valentina Giulia Sartori Rodriguez
Zhou Yu
407
0
0
26 Feb 2025
MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs
International Conference on Learning Representations (ICLR), 2025
Jiarui Zhang
Mahyar Khayatkhoei
P. Chhikara
Filip Ilievski
LRM
289
78
0
24 Feb 2025
NAVIG: Natural Language-guided Analysis with Vision Language Models for Image Geo-localization
Zheyuan Zhang
Runze Li
Tasnim Kabir
Jordan Boyd-Graber
203
4
0
21 Feb 2025
MoVer: Motion Verification for Motion Graphics Animations
ACM Transactions on Graphics (TOG), 2025
Jiaju Ma
Maneesh Agrawala
VGen
313
7
0
19 Feb 2025
Towards Top-Down Reasoning: An Explainable Multi-Agent Approach for Visual Question Answering
Zeqing Wang
Wentao Wan
Qiqing Lao
Runmeng Chen
Minjie Lang
Keze Wang
Liang Lin
Guanbin Li
LRM
445
5
0
17 Feb 2025
DiSciPLE: Learning Interpretable Programs for Scientific Visual Discovery
Computer Vision and Pattern Recognition (CVPR), 2025
Utkarsh Mall
Cheng Perng Phoo
Mia Chiquier
Bharath Hariharan
Kavita Bala
Carl Vondrick
455
3
0
17 Feb 2025
Language-to-Space Programming for Training-Free 3D Visual Grounding
Boyu Mi
Hanqing Wang
Tai Wang
Yilun Chen
Jiangmiao Pang
584
1
0
03 Feb 2025
VIKSER: Visual Knowledge-Driven Self-Reinforcing Reasoning Framework
Chunbai Zhang
Yang Zhou
Yan Peng
LRM, ReLM
416
1
0
02 Feb 2025
Position: AI Scaling: From Up to Down and Out
Yunke Wang
Yanxi Li
Chang Xu
HAI
519
1
0
02 Feb 2025
PuzzleGPT: Emulating Human Puzzle-Solving Ability for Time and Location Prediction
North American Chapter of the Association for Computational Linguistics (NAACL), 2025
Hammad A. Ayyubi
Xuande Feng
Junzhang Liu
Xudong Lin
Zhecan Wang
Shih-Fu Chang
168
1
0
24 Jan 2025
Combining Knowledge Graph and LLMs for Enhanced Zero-shot Visual Question Answering
Qian Tao
Xiaoyang Fan
Yong Xu
Xingquan Zhu
Yufei Tang
229
0
0
22 Jan 2025
Neuro-Symbolic AI in 2024: A Systematic Review
Brandon C. Colelough
William Regli
NAI
678
38
0
09 Jan 2025
AutoPresent: Designing Structured Visuals from Scratch
Computer Vision and Pattern Recognition (CVPR), 2025
Jiaxin Ge
Zora Z. Wang
Xuhui Zhou
Yi-Hao Peng
Sanjay Subramanian
...
Maarten Sap
Alane Suhr
Daniel Fried
Graham Neubig
Trevor Darrell
278
8
0
01 Jan 2025
GAIS: A Novel Approach to Instance Selection with Graph Attention Networks
Zahiriddin Rustamov
Ayham Zaitouny
Rafat Damseh
Nazar Zaki
283
3
0
26 Dec 2024
Relational Programming with Foundation Models
AAAI Conference on Artificial Intelligence (AAAI), 2024
Ziyang Li
Jiani Huang
Jason Liu
Felix Zhu
Eric Zhao
William Dodds
Neelay Velingker
Rajeev Alur
Mayur Naik
313
10
0
19 Dec 2024
CAD-Assistant: Tool-Augmented VLLMs as Generic CAD Task Solvers
Dimitrios Mallis
Ahmet Serdar Karadeniz
Sebastian Cavada
Danila Rukhovich
Niki Maria Foteinopoulou
K. Cherenkova
Anis Kacem
Djamila Aouada
604
15
0
18 Dec 2024
Empowering LLMs to Understand and Generate Complex Vector Graphics
Computer Vision and Pattern Recognition (CVPR), 2024
Ximing Xing
Juncheng Hu
Guotao Liang
Jing Zhang
Dong Xu
Qian Yu
531
30
0
15 Dec 2024
Olympus: A Universal Task Router for Computer Vision Tasks
Computer Vision and Pattern Recognition (CVPR), 2024
Yuanze Lin
Yunsheng Li
Dongdong Chen
Weijian Xu
Ronald Clark
Juil Sock
VLM, ObjD
1.2K
3
0
12 Dec 2024
Language Model as Visual Explainer
Neural Information Processing Systems (NeurIPS), 2024
Xingyi Yang
Xinchao Wang
VLM
208
1
0
08 Dec 2024
Learning to Correction: Explainable Feedback Generation for Visual Commonsense Reasoning Distractor
ACM Multimedia (MM), 2024
Jiali Chen
Xusen Hei
Yuqi Xue
Yuancheng Wei
Jiayuan Xie
Yi Cai
Qing Li
MLLM, LRM
323
11
0
08 Dec 2024
TANGO: Training-free Embodied AI Agents for Open-world Tasks
Computer Vision and Pattern Recognition (CVPR), 2024
Filippo Ziliotto
Tommaso Campari
Luciano Serafini
Lamberto Ballan
LLMAG, LM&Ro, MLLM, LRM
331
13
0
05 Dec 2024
LossAgent: Towards Any Optimization Objectives for Image Processing with LLM Agents
Bingchen Li
Xin Li
Yiting Lu
Zhibo Chen
598
1
0
05 Dec 2024
CATP-LLM: Empowering Large Language Models for Cost-Aware Tool Planning
Duo Wu
Jiangming Wang
Yuan Meng
Yanning Zhang
Le Sun
Zhi Wang
1.2K
3
0
25 Nov 2024
GIFT: A Framework for Global Interpretable Faithful Textual Explanations of Vision Classifiers
Éloi Zablocki
Valentin Gerard
Amaia Cardiel
Eric Gaussier
Matthieu Cord
Eduardo Valle
455
0
0
23 Nov 2024
Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension
Yongdong Luo
Xiawu Zheng
Guilin Li
Haojia Lin
...
Jinfa Huang
Jiayi Ji
Jiebo Luo
Rongrong Ji
VLM
679
68
0
20 Nov 2024
Retinal Vessel Segmentation via Neuron Programming
Tingting Wu
Ruyi Min
Peixuan Song
Hengtao Guo
Tieyong Zeng
Feng-Lei Fan
270
0
0
17 Nov 2024
Generalist Virtual Agents: A Survey on Autonomous Agents Across Digital Platforms
Minghe Gao
Wendong Bu
Bingchen Miao
Yang Wu
Yunfei Li
Juncheng Billy Li
Siliang Tang
Qi Wu
Yueting Zhuang
Meng Wang
LM&Ro
313
7
0
17 Nov 2024
AutoVFX: Physically Realistic Video Editing from Natural Language Instructions
International Conference on 3D Vision (3DV), 2024
Hao-Yu Hsu
Zhi-Hao Lin
Albert Zhai
Hongchi Xia
Shenlong Wang
VGen
243
21
0
04 Nov 2024
TI-PREGO: Chain of Thought and In-Context Learning for Online Mistake Detection in PRocedural EGOcentric Videos
Leonardo Plini
Luca Scofano
Edoardo De Matteis
Guido Maria D'Amely di Melendugno
Alessandro Flaborea
Andrea Sanchietti
G. Farinella
Fabio Galasso
Antonino Furnari
LRM, EgoV
367
7
0
04 Nov 2024
VisualPredicator: Learning Abstract World Models with Neuro-Symbolic Predicates for Robot Planning
International Conference on Learning Representations (ICLR), 2024
Yichao Liang
Nishanth Kumar
Hao Tang
Adrian Weller
J. Tenenbaum
Tom Silver
Joao Henriques
Kevin Ellis
350
27
0
30 Oct 2024
Natural Language Inference Improves Compositionality in Vision-Language Models
International Conference on Learning Representations (ICLR), 2024
Paola Cascante-Bonilla
Yu Hou
Yang Trista Cao
Hal Daumé III
Rachel Rudinger
ReLM, CoGe, VLM
333
5
0
29 Oct 2024
What Factors Affect Multi-Modal In-Context Learning? An In-Depth Exploration
Neural Information Processing Systems (NeurIPS), 2024
L. Qin
Qiguang Chen
Hao Fei
Zhi Chen
Min Li
Wanxiang Che
207
26
0
27 Oct 2024
GRS: Generating Robotic Simulation Tasks from Real-World Images
Alex Zook
Fan-Yun Sun
Josef Spjut
Valts Blukis
Stan Birchfield
Jonathan Tremblay
427
8
0
20 Oct 2024
GeoCoder: Solving Geometry Problems by Generating Modular Code through Vision-Language Models
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Aditya Sharma
Aman Dalmia
Mehran Kazemi
Christopher J. Pal
Amal Zouaq
LRM
159
8
0
17 Oct 2024
Trust but Verify: Programmatic VLM Evaluation in the Wild
Viraj Prabhu
Senthil Purushwalkam
An Yan
Caiming Xiong
Ran Xu
MLLM
163
2
0
17 Oct 2024
Augmenting In-Context-Learning in LLMs via Automatic Data Labeling and Refinement
J. Shtok
Amit Alfassy
Foad Abo Dahood
Eliyahu Schwartz
Sivan Doveh
Assaf Arbelle
LRM, ReLM
218
0
0
14 Oct 2024
GameTraversalBenchmark: Evaluating Planning Abilities Of Large Language Models Through Traversing 2D Game Maps
Neural Information Processing Systems (NeurIPS), 2024
Muhammad Umair Nasir
Steven D. James
Julian Togelius
ELM, LRM
189
9
0
10 Oct 2024
VoxelPrompt: A Vision Agent for End-to-End Medical Image Analysis
Andrew Hoopes
Neel Dey
V. Butoi
John Guttag
Adrian V. Dalca
MedIm, LM&MA
403
3
0
10 Oct 2024
DataEnvGym: Data Generation Agents in Teacher Environments with Student Feedback
International Conference on Learning Representations (ICLR), 2024
Zaid Khan
Elias Stengel-Eskin
Jaemin Cho
Joey Tianyi Zhou
VGen
421
8
0
08 Oct 2024
Domain-Oriented Time Series Inference Agents for Reasoning and Automated Analysis
Wen Ye
Wei Yang
Defu Cao
Yizhou Zhang
Lumingyuan Tang
Jie Cai
Yan Liu
AI4TS, BDL, CoGe
466
1
0
05 Oct 2024
Grounding Language in Multi-Perspective Referential Communication
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Zineng Tang
Lingjun Mao
Alane Suhr
281
6
0
04 Oct 2024
ComfyGen: Prompt-Adaptive Workflows for Text-to-Image Generation
Rinon Gal
Adi Haviv
Yuval Alaluf
Amit H. Bermano
Daniel Cohen-Or
Gal Chechik
DiffM
186
8
0
02 Oct 2024
A Survey on Complex Tasks for Goal-Directed Interactive Agents
Mareike Hartmann
Alexander Koller
LM&Ro, LLMAG
293
1
0
27 Sep 2024
Visual Data Diagnosis and Debiasing with Concept Graphs
Neural Information Processing Systems (NeurIPS), 2024
Rwiddhi Chakraborty
Yinong Wang
Jialu Gao
Runkai Zheng
Cheng Zhang
Fernando de la Torre
234
6
0
26 Sep 2024
Proof of Thought : Neurosymbolic Program Synthesis allows Robust and Interpretable Reasoning
Debargha Ganguly
Srinivasan Iyengar
Vipin Chaudhary
Shivkumar Kalyanaraman
LRM
188
7
0
25 Sep 2024
Discovering Object Attributes by Prompting Large Language Models with Perception-Action APIs
IEEE International Conference on Robotics and Automation (ICRA), 2024
A. Mavrogiannis
Dehao Yuan
Yiannis Aloimonos
LM&Ro
311
2
0
23 Sep 2024
From Linguistic Giants to Sensory Maestros: A Survey on Cross-Modal Reasoning with Large Language Models
Shengsheng Qian
Zuyi Zhou
Dizhan Xue
Bing Wang
Changsheng Xu
LRM
422
5
0
19 Sep 2024
NEUSIS: A Compositional Neuro-Symbolic Framework for Autonomous Perception, Reasoning, and Planning in Complex UAV Search Missions
IEEE Robotics and Automation Letters (RA-L), 2024
Zhixi Cai
Cristian Rojas Cardenas
Kevin Leo
Chenyuan Zhang
Kal Backman
...
Yuan-Fang Li
Mor Vered
Peter Stuckey
M. G. D. L. Banda
Hamid Rezatofighi
251
13
0
16 Sep 2024
Symbolic Regression with a Learned Concept Library
Neural Information Processing Systems (NeurIPS), 2024
Arya Grayeli
Atharva Sehgal
Omar Costilla-Reyes
Miles Cranmer
Swarat Chaudhuri
222
45
0
14 Sep 2024