Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2303.06594
Cited By
ChatGPT Asks, BLIP-2 Answers: Automatic Questioning Towards Enriched Visual Descriptions
12 March 2023
Deyao Zhu
Jun Chen
Kilichbek Haydarov
Xiaoqian Shen
Wenxuan Zhang
Mohamed Elhoseiny
MLLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"ChatGPT Asks, BLIP-2 Answers: Automatic Questioning Towards Enriched Visual Descriptions"
15 / 15 papers shown
Title
HaloQuest: A Visual Hallucination Dataset for Advancing Multimodal Reasoning
Zhecan Wang
Garrett Bingham
Adams Wei Yu
Quoc V. Le
Thang Luong
Golnaz Ghiasi
MLLM
LRM
35
9
0
22 Jul 2024
A Misleading Gallery of Fluid Motion by Generative Artificial Intelligence
Ali Kashefi
VGen
43
5
0
24 May 2024
Can LLMs Generate Human-Like Wayfinding Instructions? Towards Platform-Agnostic Embodied Instruction Synthesis
Vishnu Sashank Dorbala
Sanjoy Chowdhury
Dinesh Manocha
LM&Ro
25
0
0
18 Mar 2024
A Cognitive Evaluation Benchmark of Image Reasoning and Description for Large Vision-Language Models
Xiujie Song
Mengyue Wu
Ke Zhu
Chunhao Zhang
Yanyi Chen
LRM
ELM
31
3
0
28 Feb 2024
CIC: A Framework for Culturally-Aware Image Captioning
Youngsik Yun
Jihie Kim
VLM
17
5
0
08 Feb 2024
Beyond Segmentation: Road Network Generation with Multi-Modal LLMs
Sumedh Rasal
Sanjay K. Boddhu
22
5
0
15 Oct 2023
Foundational Models Defining a New Era in Vision: A Survey and Outlook
Muhammad Awais
Muzammal Naseer
Salman Khan
Rao Muhammad Anwer
Hisham Cholakkal
M. Shah
Ming Yang
F. Khan
VLM
18
117
0
25 Jul 2023
NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models
Gengze Zhou
Yicong Hong
Qi Wu
ELM
LM&Ro
LLMAG
LRM
23
139
0
26 May 2023
IdealGPT: Iteratively Decomposing Vision and Language Reasoning via Large Language Models
Haoxuan You
Rui Sun
Zhecan Wang
Long Chen
Gengyu Wang
Hammad A. Ayyubi
Kai-Wei Chang
Shih-Fu Chang
VLM
MLLM
LRM
39
43
0
24 May 2023
IC3: Image Captioning by Committee Consensus
David M. Chan
Austin Myers
Sudheendra Vijayanarasimhan
David A. Ross
John F. Canny
19
17
0
02 Feb 2023
Large Language Models are Zero-Shot Reasoners
Takeshi Kojima
S. Gu
Machel Reid
Yutaka Matsuo
Yusuke Iwasawa
ReLM
LRM
293
4,077
0
24 May 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
303
11,881
0
04 Mar 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
390
4,124
0
28 Jan 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,448
0
28 Jan 2022
Guiding Visual Question Generation
Nihir Vedd
Zixu Wang
Marek Rei
Yishu Miao
Lucia Specia
81
23
0
15 Oct 2021
1