2211.11559
Cited By
Visual Programming: Compositional visual reasoning without training
Computer Vision and Pattern Recognition (CVPR), 2022
18 November 2022
Tanmay Gupta
Aniruddha Kembhavi
ReLM
VLM
LRM
Papers citing
"Visual Programming: Compositional visual reasoning without training"
25 / 375 papers shown
AmadeusGPT: a natural language interface for interactive animal behavioral analysis
Neural Information Processing Systems (NeurIPS), 2023
Shaokai Ye
Jessy Lauer
Mu Zhou
Alexander Mathis
Mackenzie W. Mathis
MLLM
LLMAG
282
25
0
10 Jul 2023
Look, Remember and Reason: Grounded reasoning in videos with language models
International Conference on Learning Representations (ICLR), 2023
Apratim Bhattacharyya
Sunny Panchal
Mingu Lee
Reza Pourreza
Pulkit Madan
Roland Memisevic
LRM
357
12
0
30 Jun 2023
A Survey on Multimodal Large Language Models
National Science Review (NSR), 2023
Shukang Yin
Chaoyou Fu
Sirui Zhao
Ke Li
Xing Sun
Tong Xu
Enhong Chen
MLLM
LRM
389
943
0
23 Jun 2023
Investigating Prompting Techniques for Zero- and Few-Shot Visual Question Answering
Rabiul Awal
Le Zhang
Aishwarya Agrawal
LRM
325
18
0
16 Jun 2023
Tell Me Where to Go: A Composable Framework for Context-Aware Embodied Robot Navigation
Conference on Robot Learning (CoRL), 2023
Harel Biggie
Ajay Narasimha Mopidevi
Dusty Woods
Christoffer Heckman
LM&Ro
196
13
0
15 Jun 2023
AssistGPT: A General Multi-modal Assistant that can Plan, Execute, Inspect, and Learn
Difei Gao
Lei Ji
Luowei Zhou
Kevin Lin
Joya Chen
Zihan Fan
Mike Zheng Shou
MLLM
331
105
0
14 Jun 2023
AVIS: Autonomous Visual Information Seeking with Large Language Model Agent
Neural Information Processing Systems (NeurIPS), 2023
Ziniu Hu
Ahmet Iscen
Chen Sun
Kai-Wei Chang
Yizhou Sun
David A. Ross
Cordelia Schmid
Alireza Fathi
278
12
0
13 Jun 2023
Leveraging Large Language Models for Scalable Vector Graphics-Driven Image Understanding
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Mu Cai
Zeyi Huang
Yuheng Li
Utkarsh Ojha
Haohan Wang
Yong Jae Lee
VLM
142
4
0
09 Jun 2023
Modular Visual Question Answering via Code Generation
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Sanjay Subramanian
Medhini Narasimhan
Kushal Khangaonkar
Kevin Kaichuang Yang
Arsha Nagrani
Cordelia Schmid
Andy Zeng
Trevor Darrell
Dan Klein
183
60
0
08 Jun 2023
CAVEN: An Embodied Conversational Agent for Efficient Audio-Visual Navigation in Noisy Environments
AAAI Conference on Artificial Intelligence (AAAI), 2023
Xiulong Liu
Sudipta Paul
Moitreya Chatterjee
A. Cherian
183
11
0
06 Jun 2023
Towards Language-guided Interactive 3D Generation: LLMs as Layout Interpreter with Generative Feedback
Yiqi Lin
Hao Wu
Ruichen Wang
H. Lu
Xiaodong Lin
Hui Xiong
Lin Wang
3DV
155
16
0
25 May 2023
LayoutGPT: Compositional Visual Planning and Generation with Large Language Models
Neural Information Processing Systems (NeurIPS), 2023
Weixi Feng
Wanrong Zhu
Tsu-Jui Fu
Varun Jampani
Arjun Reddy Akula
Xuehai He
Sugato Basu
Xin Eric Wang
William Yang Wang
MLLM
426
277
0
24 May 2023
Visual Programming for Text-to-Image Generation and Evaluation
Jaemin Cho
Abhaysinh Zala
Mohit Bansal
MLLM
304
54
0
24 May 2023
IdealGPT: Iteratively Decomposing Vision and Language Reasoning via Large Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Haoxuan You
Rui Sun
Zhecan Wang
Long Chen
Gengyu Wang
Hammad A. Ayyubi
Kai-Wei Chang
Shih-Fu Chang
VLM
MLLM
LRM
347
55
0
24 May 2023
CREATOR: Tool Creation for Disentangling Abstract and Concrete Reasoning of Large Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Cheng Qian
Chi Han
Yi R. Fung
Yujia Qin
Zhiyuan Liu
Heng Ji
LRM
286
58
0
23 May 2023
LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models
Long Lian
Boyi Li
Adam Yala
Trevor Darrell
254
211
0
23 May 2023
Interactive Data Synthesis for Systematic Vision Adaptation via LLMs-AIGCs Collaboration
Qifan Yu
Juncheng Li
Wentao Ye
Siliang Tang
Yueting Zhuang
180
14
0
22 May 2023
Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model
Siyuan Huang
Zhengkai Jiang
Hao Dong
Yu Qiao
Peng Gao
Hongsheng Li
LM&Ro
208
128
0
18 May 2023
LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model
Peng Gao
Jiaming Han
Renrui Zhang
Ziyi Lin
Shijie Geng
...
Pan Lu
Conghui He
Xiangyu Yue
Hongsheng Li
Yu Qiao
MLLM
242
690
0
28 Apr 2023
Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models
Neural Information Processing Systems (NeurIPS), 2023
Pan Lu
Baolin Peng
Hao Cheng
Michel Galley
Kai-Wei Chang
Ying Nian Wu
Song-Chun Zhu
Jianfeng Gao
KELM
MLLM
LRM
269
406
0
19 Apr 2023
Visual Instruction Tuning
Neural Information Processing Systems (NeurIPS), 2023
Haotian Liu
Chunyuan Li
Qingyang Wu
Yong Jae Lee
SyDa
VLM
MLLM
974
7,168
0
17 Apr 2023
HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face
Neural Information Processing Systems (NeurIPS), 2023
Yongliang Shen
Kaitao Song
Xu Tan
Dongsheng Li
Weiming Lu
Yueting Zhuang
MLLM
618
1,188
0
30 Mar 2023
ViperGPT: Visual Inference via Python Execution for Reasoning
IEEE International Conference on Computer Vision (ICCV), 2023
Dídac Surís
Sachit Menon
Carl Vondrick
MLLM
LRM
ReLM
272
591
0
14 Mar 2023
Prismer: A Vision-Language Model with Multi-Task Experts
Shikun Liu
Linxi Fan
Edward Johns
Zhiding Yu
Chaowei Xiao
Anima Anandkumar
VLM
MLLM
288
33
0
04 Mar 2023
Variational Cross-Graph Reasoning and Adaptive Structured Semantics Learning for Compositional Temporal Grounding
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Juncheng Li
Siliang Tang
Linchao Zhu
Wenqiao Zhang
Yi Yang
Tat-Seng Chua
Fei Wu
Yueting Zhuang
BDL
189
22
0
22 Jan 2023