ResearchTrend.AI
Combo of Thinking and Observing for Outside-Knowledge VQA


10 May 2023
Q. Si
Yuchen Mo
Zheng Lin
Huishan Ji
Weiping Wang

Papers citing "Combo of Thinking and Observing for Outside-Knowledge VQA"

10 / 10 papers shown
OMGM: Orchestrate Multiple Granularities and Modalities for Efficient Multimodal Retrieval
Wei Yang
Jingjing Fu
R. Wang
Jinyu Wang
Lei Song
Jiang Bian
19
0
0
10 May 2025
Retrieval-Augmented Visual Question Answering via Built-in Autoregressive Search Engines
Xinwei Long
Zhiyuan Ma
Ermo Hua
Kaiyan Zhang
Biqing Qi
Bowen Zhou
RALM
46
0
0
23 Feb 2025
GeReA: Question-Aware Prompt Captions for Knowledge-based Visual Question Answering
Ziyu Ma
Shutao Li
Bin Sun
Jianfei Cai
Zuxiang Long
Fuyan Ma
21
1
0
04 Feb 2024
Object Attribute Matters in Visual Question Answering
Peize Li
Q. Si
Peng Fu
Zheng Lin
Yan Wang
25
0
0
20 Dec 2023
Testing the Depth of ChatGPT's Comprehension via Cross-Modal Tasks Based on ASCII-Art: GPT3.5's Abilities in Regard to Recognizing and Generating ASCII-Art Are Not Totally Lacking
David Bayani
MLLM
26
5
0
28 Jul 2023
Compressing And Debiasing Vision-Language Pre-Trained Models for Visual Question Answering
Q. Si
Yuanxin Liu
Zheng Lin
Peng Fu
Weiping Wang
VLM
29
1
0
26 Oct 2022
Gender and Racial Bias in Visual Question Answering Datasets
Yusuke Hirota
Yuta Nakashima
Noa Garcia
FaML
127
46
0
17 May 2022
An Empirical Study of GPT-3 for Few-Shot Knowledge-Based VQA
Zhengyuan Yang
Zhe Gan
Jianfeng Wang
Xiaowei Hu
Yumao Lu
Zicheng Liu
Lijuan Wang
169
402
0
10 Sep 2021
VinVL: Revisiting Visual Representations in Vision-Language Models
Pengchuan Zhang
Xiujun Li
Xiaowei Hu
Jianwei Yang
Lei Zhang
Lijuan Wang
Yejin Choi
Jianfeng Gao
ObjD
VLM
252
157
0
02 Jan 2021
Knowledge-Routed Visual Question Reasoning: Challenges for Deep Representation Embedding
Qingxing Cao
Bailin Li
Xiaodan Liang
Keze Wang
Liang Lin
44
35
0
14 Dec 2020