2211.11559
Cited By
Visual Programming: Compositional visual reasoning without training
Computer Vision and Pattern Recognition (CVPR), 2022
18 November 2022
Tanmay Gupta
Aniruddha Kembhavi
ReLM
VLM
LRM
Papers citing "Visual Programming: Compositional visual reasoning without training"
50 / 379 papers shown
LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation, Generation and Editing
Wei-Ge Chen
Irina Spiridonova
Jianwei Yang
Jianfeng Gao
Chun-yue Li
MLLM
VLM
154
46
0
01 Nov 2023
ControlLLM: Augment Language Models with Tools by Searching on Graphs
European Conference on Computer Vision (ECCV), 2023
Zhaoyang Liu
Zeqiang Lai
Zhangwei Gao
Erfei Cui
Ziheng Li
...
Lewei Lu
Qifeng Chen
Yu Qiao
Jifeng Dai
Wenhai Wang
MLLM
325
55
0
26 Oct 2023
Symbolic Planning and Code Generation for Grounded Dialogue
Justin T. Chiu
Wenting Zhao
Derek Chen
Saujas Vaduguru
Alexander M. Rush
Daniel Fried
LLMAG
118
10
0
26 Oct 2023
Graph Agent: Explicit Reasoning Agent for Graphs
Qinyong Wang
Zhenxiang Gao
Rong Xu
AI4CE
112
10
0
25 Oct 2023
Woodpecker: Hallucination Correction for Multimodal Large Language Models
Science China Information Sciences (Sci China Inf Sci), 2023
Xinglong Mao
Chaoyou Fu
Zhengye Zhang
Tong Xu
Hao Wang
Dianbo Sui
Chunjiang Ge
Ke Li
Xingguo Sun
Enhong Chen
VLM
MLLM
270
192
0
24 Oct 2023
WebWISE: Web Interface Control and Sequential Exploration with Large Language Models
Heyi Tao
TV Sethuraman
Michal Shlapentokh-Rothman
Derek Hoiem
LLMAG
269
8
0
24 Oct 2023
What's Left? Concept Grounding with Logic-Enhanced Foundation Models
Neural Information Processing Systems (NeurIPS), 2023
Joy Hsu
Jiayuan Mao
Joshua B. Tenenbaum
Jiajun Wu
VLM
ReLM
LRM
356
37
0
24 Oct 2023
Towards Perceiving Small Visual Details in Zero-shot Visual Question Answering with Multimodal LLMs
Jiarui Zhang
Mahyar Khayatkhoei
P. Chhikara
Filip Ilievski
213
5
0
24 Oct 2023
Large Language Models are Visual Reasoning Coordinators
Neural Information Processing Systems (NeurIPS), 2023
Liangyu Chen
Bo Li
Sheng Shen
Jingkang Yang
Chunyuan Li
Kurt Keutzer
Trevor Darrell
Ziwei Liu
VLM
LRM
253
88
0
23 Oct 2023
Branch-Solve-Merge Improves Large Language Model Evaluation and Generation
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Swarnadeep Saha
Omer Levy
Asli Celikyilmaz
Mohit Bansal
Jason Weston
Xian Li
MoMe
302
99
0
23 Oct 2023
API-Assisted Code Generation for Question Answering on Varied Table Structures
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Yihan Cao
Shuyi Chen
Ryan Liu
Zhiruo Wang
Daniel Fried
LMTD
188
24
0
23 Oct 2023
MoqaGPT: Zero-Shot Multi-modal Open-domain Question Answering with Large Language Model
Le Zhang
Yihong Wu
Fengran Mo
Jian-Yun Nie
Aishwarya Agrawal
MLLM
RALM
207
7
0
20 Oct 2023
Steve-Eye: Equipping LLM-based Embodied Agents with Visual Perception in Open Worlds
Sipeng Zheng
Jiazheng Liu
Yicheng Feng
Zongqing Lu
239
45
0
20 Oct 2023
Frozen Transformers in Language Models Are Effective Visual Encoder Layers
Ziqi Pang
Ziyang Xie
Yunze Man
Yu-Xiong Wang
357
46
0
19 Oct 2023
Neurosymbolic Grounding for Compositional World Models
Atharva Sehgal
Arya Grayeli
Jennifer J. Sun
Swarat Chaudhuri
306
11
0
19 Oct 2023
Octopus: Embodied Vision-Language Programmer from Environmental Feedback
European Conference on Computer Vision (ECCV), 2023
Jingkang Yang
Yuhao Dong
Shuai Liu
Yue Liu
Ziyue Wang
...
Haoran Tan
Jiamu Kang
Yuanhan Zhang
Kaiyang Zhou
Ziwei Liu
LM&Ro
245
77
0
12 Oct 2023
Idea2Img: Iterative Self-Refinement with GPT-4V(ision) for Automatic Image Design and Generation
Zhengyuan Yang
Jianfeng Wang
Linjie Li
Kevin Qinghong Lin
Chung-Ching Lin
Zicheng Liu
Lijuan Wang
LRM
MLLM
DiffM
93
28
0
12 Oct 2023
Towards Robust Multi-Modal Reasoning via Model Selection
International Conference on Learning Representations (ICLR), 2023
Xiangyan Liu
Rongxue Li
Wei Ji
Tao Lin
LLMAG
LRM
245
8
0
12 Oct 2023
OpenLEAF: Open-Domain Interleaved Image-Text Generation and Evaluation
Jie An
Zhengyuan Yang
Linjie Li
Jianfeng Wang
Kevin Qinghong Lin
Zicheng Liu
Lijuan Wang
Jiebo Luo
206
14
0
11 Oct 2023
Don't Fine-Tune, Decode: Syntax Error-Free Tool Use via Constrained Decoding
Kexun Zhang
Hongqiao Chen
Lei Li
Wenjie Wang
252
7
0
10 Oct 2023
Large Language Models can Learn Rules
Zhaocheng Zhu
Yuan Xue
Xinyun Chen
Denny Zhou
Jian Tang
Dale Schuurmans
Hanjun Dai
LRM
ReLM
269
85
0
10 Oct 2023
What If the TV Was Off? Examining Counterfactual Reasoning Abilities of Multi-modal Language Models
Computer Vision and Pattern Recognition (CVPR), 2023
Letian Zhang
Xiaotong Zhai
Zhongkai Zhao
Yongshuo Zong
Xin Wen
Bingchen Zhao
LRM
148
7
0
10 Oct 2023
ViCor: Bridging Visual Understanding and Commonsense Reasoning with Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Kai-Qing Zhou
Kwonjoon Lee
Teruhisa Misu
Xin Eric Wang
LRM
245
8
0
09 Oct 2023
InstructDET: Diversifying Referring Object Detection with Generalized Instructions
International Conference on Learning Representations (ICLR), 2023
Ronghao Dang
Jiangyan Feng
Haodong Zhang
Chongjian Ge
Lin Song
...
Chengju Liu
Qi Chen
Feng Zhu
Rui Zhao
Yibing Song
ObjD
359
16
0
08 Oct 2023
Expedited Training of Visual Conditioned Language Generation via Redundancy Reduction
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Yiren Jian
Tingkai Liu
Yunzhe Tao
Chunhui Zhang
Soroush Vosoughi
HX Yang
VLM
285
20
0
05 Oct 2023
Large Language Model Cascades with Mixture of Thoughts Representations for Cost-efficient Reasoning
International Conference on Learning Representations (ICLR), 2023
Murong Yue
Jie Zhao
Min Zhang
Liang Du
Ziyu Yao
LRM
291
114
0
04 Oct 2023
GRID: A Platform for General Robot Intelligence Development
Sai H. Vemprala
Shuhang Chen
Abhinav Shukla
Dinesh Narayanan
Ashish Kapoor
227
11
0
02 Oct 2023
Guiding Instruction-based Image Editing via Multimodal Large Language Models
International Conference on Learning Representations (ICLR), 2023
Johannes Frey
Wenze Hu
Xianzhi Du
William Yang Wang
Yinfei Yang
Zhe Gan
231
136
0
29 Sep 2023
VideoDirectorGPT: Consistent Multi-scene Video Generation via LLM-Guided Planning
Han Lin
Abhaysinh Zala
Jaemin Cho
Joey Tianyi Zhou
LM&Ro
VGen
DiffM
393
109
0
26 Sep 2023
DreamLLM: Synergistic Multimodal Comprehension and Creation
International Conference on Learning Representations (ICLR), 2023
Runpei Dong
Chunrui Han
Yuang Peng
Zekun Qi
Zheng Ge
...
Hao-Ran Wei
Xiangwen Kong
Xiangyu Zhang
Kaisheng Ma
Li Yi
MLLM
242
271
0
20 Sep 2023
PolicyGPT: Automated Analysis of Privacy Policies with Large Language Models
Chenhao Tang
Zheng Liu
Chong Ma
Zihao Wu
Yiwei Li
...
Dajiang Zhu
Shijie Zhao
Xiang Li
Tianming Liu
Lei Fan
AILaw
111
26
0
19 Sep 2023
Language Models as Black-Box Optimizers for Vision-Language Models
Computer Vision and Pattern Recognition (CVPR), 2023
Shihong Liu
Zhiqiu Lin
Samuel Yu
Ryan Lee
Tiffany Ling
Deepak Pathak
Deva Ramanan
VLM
317
39
0
12 Sep 2023
Hypothesis Search: Inductive Reasoning with Language Models
International Conference on Learning Representations (ICLR), 2023
Ruocheng Wang
E. Zelikman
Gabriel Poesia
Yewen Pu
Nick Haber
Noah D. Goodman
ReLM
LRM
354
150
0
11 Sep 2023
Compositional Learning of Visually-Grounded Concepts Using Reinforcement
Zijun Lin
Haidi Azaman
M Ganesh Kumar
Cheston Tan
CoGe
OffRL
276
3
0
08 Sep 2023
Language Prompt for Autonomous Driving
AAAI Conference on Artificial Intelligence (AAAI), 2023
Dongming Wu
Wencheng Han
Tiancai Wang
Yingfei Liu
Cheng-zhong Xu
Jianbing Shen
VLM
377
121
0
08 Sep 2023
A Survey on Interpretable Cross-modal Reasoning
Dizhan Xue
Shengsheng Qian
Zuyi Zhou
Changsheng Xu
LRM
353
5
0
05 Sep 2023
PointLLM: Empowering Large Language Models to Understand Point Clouds
European Conference on Computer Vision (ECCV), 2023
Runsen Xu
Xiaolong Wang
Tai Wang
Yilun Chen
Jiangmiao Pang
Dahua Lin
MLLM
320
277
0
31 Aug 2023
BeeFlow: Behavior Tree-based Serverless Workflow Modeling and Scheduling for Resource-Constrained Edge Clusters
Journal of Systems Architecture (JSA), 2023
Ke Luo
Ouyang Tao
Zhi Zhou
Xu Chen
139
2
0
31 Aug 2023
Enhancing Subtask Performance of Multi-modal Large Language Model
Yongqiang Zhao
Zhenyu Li
Feng Zhang
Xinhai Xu
Donghong Liu
LRM
73
1
0
31 Aug 2023
Rational Decision-Making Agent with Internalized Utility Judgment
Yining Ye
Xin Cong
Shizuo Tian
Yujia Qin
Chong Liu
Y. Lin
Zhiyuan Liu
Maosong Sun
LLMAG
222
10
0
24 Aug 2023
Tree-of-Mixed-Thought: Combining Fast and Slow Thinking for Multi-hop Visual Reasoning
Pengbo Hu
Jingxian Qi
Xingyu Li
Hong Li
Xinqi Wang
Bing Quan
Ruiyu Wang
Yi Zhou
LRM
LLMAG
174
19
0
18 Aug 2023
Link-Context Learning for Multimodal LLMs
Computer Vision and Pattern Recognition (CVPR), 2023
Yan Tai
Weichen Fan
Zhao Zhang
Feng Zhu
Rui Zhao
Ziwei Liu
ReLM
LRM
127
25
0
15 Aug 2023
VisIT-Bench: A Benchmark for Vision-Language Instruction Following Inspired by Real-World Use
Yonatan Bitton
Hritik Bansal
Jack Hessel
Rulin Shao
Wanrong Zhu
Anas Awadalla
Josh Gardner
Rohan Taori
L. Schmidt
VLM
371
97
0
12 Aug 2023
NEOLAF, an LLM-powered neural-symbolic cognitive architecture
Richard Tong
Cassie Chen Cao
Timothy Xueqian Lee
Guodong Zhao
Ray Wan
...
Xiangen Hu
Robin Schmucker
Jinsheng Pan
Julian Quevedo
Yu Lu
105
1
0
08 Aug 2023
TPTU: Large Language Model-based AI Agents for Task Planning and Tool Usage
Jingqing Ruan
Yihong Chen
Bin Zhang
Zhiwei Xu
Tianpeng Bao
...
Shiwei Shi
Hangyu Mao
Ziyue Li
Xingyu Zeng
Rui Zhao
LLMAG
LM&Ro
269
49
0
07 Aug 2023
Tool Documentation Enables Zero-Shot Tool-Usage with Large Language Models
Cheng-Yu Hsieh
Sibei Chen
Chun-Liang Li
Yasuhisa Fujii
Alexander Ratner
Chen-Yu Lee
Ranjay Krishna
Tomas Pfister
LLMAG
SyDa
260
58
0
01 Aug 2023
ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs
International Conference on Learning Representations (ICLR), 2023
Yujia Qin
Shi Liang
Yining Ye
Kunlun Zhu
Lan Yan
...
Jie Zhou
Mark B. Gerstein
Dahai Li
Zhiyuan Liu
Maosong Sun
CLL
ALM
LLMAG
ELM
LM&MA
545
1,055
0
31 Jul 2023
AntGPT: Can Large Language Models Help Long-term Action Anticipation from Videos?
International Conference on Learning Representations (ICLR), 2023
Qi Zhao
Shijie Wang
Ce Zhang
Changcheng Fu
Minh Quan Do
Nakul Agarwal
Kwonjoon Lee
Chen Sun
LM&Ro
315
79
0
31 Jul 2023
TextManiA: Enriching Visual Feature by Text-driven Manifold Augmentation
IEEE International Conference on Computer Vision (ICCV), 2023
Moon Ye-Bin
Jisoo Kim
Hong-Kyu Kim
Kilho Son
Tae-Hyun Oh
253
11
0
27 Jul 2023
WavJourney: Compositional Audio Creation with Large Language Models
IEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2023
Xubo Liu
Zhongkai Zhu
Haohe Liu
Yiitan Yuan
Meng Cui
...
Jinhua Liang
Yin Cao
Qiuqiang Kong
Mark D. Plumbley
Wenwu Wang
AuLLM
216
35
0
26 Jul 2023