2312.13503
InfoVisDial: An Informative Visual Dialogue Dataset by Bridging Large Multimodal and Language Models
21 December 2023
Bingbing Wen
Zhengyuan Yang
Jianfeng Wang
Zhe Gan
Bill Howe
Lijuan Wang
MLLM
Papers citing
"InfoVisDial: An Informative Visual Dialogue Dataset by Bridging Large Multimodal and Language Models"
5 / 5 papers shown
Title
Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners
Zhenhailong Wang
Manling Li
Ruochen Xu
Luowei Zhou
Jie Lei
...
Chenguang Zhu
Derek Hoiem
Shih-Fu Chang
Mohit Bansal
Heng Ji
MLLM
VLM
159
134
0
22 May 2022
An Empirical Study of GPT-3 for Few-Shot Knowledge-Based VQA
Zhengyuan Yang
Zhe Gan
Jianfeng Wang
Xiaowei Hu
Yumao Lu
Zicheng Liu
Lijuan Wang
161
401
0
10 Sep 2021
Guided Generation of Cause and Effect
Zhongyang Li
Xiao Ding
Ting Liu
J. E. Hu
Benjamin Van Durme
155
79
0
21 Jul 2021
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
Soravit Changpinyo
P. Sharma
Nan Ding
Radu Soricut
VLM
273
845
0
17 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
293
2,875
0
11 Feb 2021