WeaQA: Weak Supervision via Captions for Visual Question Answering

Findings (Findings), 2020
4 December 2020
Pratyay Banerjee
Tejas Gokhale
Yezhou Yang
Chitta Baral

Papers citing "WeaQA: Weak Supervision via Captions for Visual Question Answering"

25 / 25 papers shown
SCRA-VQA: Summarized Caption-Rerank for Augmented Large Language Models in Visual Question Answering
Yan Zhang
Jiaqing Lin
Miao Zhang
Kui Xiao
Xiaoju Hou
Yue Zhao
Ruoyao Xiao
136
0
0
25 Sep 2025
When Big Models Train Small Ones: Label-Free Model Parity Alignment for Efficient Visual Question Answering using Small VLMs
A. S. Penamakuri
Navlika Singh
Piyush Arora
Anand Mishra
VLM
187
1
0
20 Sep 2025
SHAPE : Self-Improved Visual Preference Alignment by Iteratively Generating Holistic Winner
Kejia Chen
Jiawen Zhang
Jiacong Hu
Jiazhen Yang
Jian Lou
Zunlei Feng
Weilong Dai
389
2
0
06 Mar 2025
MedCoT: Medical Chain of Thought via Hierarchical Expert
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Jiaxiang Liu
Yuan Wang
Jiawei Du
Qiufeng Wang
Zuozhu Liu
LRM
601
60
0
18 Dec 2024
Which Viewpoint Shows it Best? Language for Weakly Supervising View Selection in Multi-view Instructional Videos
Computer Vision and Pattern Recognition (CVPR), 2024
Sagnik Majumder
Tushar Nagarajan
Ziad Al-Halah
Reina Pradhan
Kristen Grauman
506
0
0
13 Nov 2024
R-LLaVA: Improving Med-VQA Understanding through Visual Region of Interest
Xupeng Chen
Zhixin Lai
Kangrui Ruan
Shichu Chen
Jiaxiang Liu
Zuozhu Liu
871
23
0
27 Oct 2024
Multiscale Matching Driven by Cross-Modal Similarity Consistency for Audio-Text Retrieval
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Qian Wang
Jia-Chen Gu
Zhen-Hua Ling
275
5
0
15 Mar 2024
CIC: A Framework for Culturally-Aware Image Captioning
Youngsik Yun
Jihie Kim
VLM
555
11
0
08 Feb 2024
Improving Zero-shot Visual Question Answering via Large Language Models with Reasoning Question Prompts
ACM Multimedia (ACM MM), 2023
Yunshi Lan
Xiang Li
Xin Liu
Yang Li
Wei Qin
Weining Qian
LRM, ReLM
497
41
0
15 Nov 2023
Exploring Question Decomposition for Zero-Shot VQA
Neural Information Processing Systems (NeurIPS), 2023
Zaid Khan
B. Vijaykumar
S. Schulter
Manmohan Chandraker
Yun Fu
ReLM
261
20
0
25 Oct 2023
Tackling VQA with Pretrained Foundation Models without Further Training
Alvin De Jun Tan
Bingquan Shen
MLLM
239
2
0
27 Sep 2023
Towards Addressing the Misalignment of Object Proposal Evaluation for Vision-Language Tasks via Semantic Grounding
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Joshua Forster Feinglass
Yezhou Yang
239
2
0
01 Sep 2023
Weakly Supervised Visual Question Answer Generation
Charani Alampalle
Shamanthak Hegde
Soumya Jahagirdar
Shankar Gangisetty
217
0
0
11 Jun 2023
Q: How to Specialize Large Vision-Language Models to Data-Scarce VQA Tasks? A: Self-Train on Unlabeled Images!
Computer Vision and Pattern Recognition (CVPR), 2023
Zaid Khan
B. Vijaykumar
S. Schulter
Xiang Yu
Y. Fu
Manmohan Chandraker
VLM, MLLM
340
26
0
06 Jun 2023
From Images to Textual Prompts: Zero-shot VQA with Frozen Large Language Models
Computer Vision and Pattern Recognition (CVPR), 2022
Jiaxian Guo
Junnan Li
Dongxu Li
A. M. H. Tiong
Boyang Albert Li
Dacheng Tao
Steven C. H. Hoi
VLM, MLLM
560
174
0
21 Dec 2022
Plug-and-Play VQA: Zero-shot VQA by Conjoining Large Pretrained Models with Zero Training
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
A. M. H. Tiong
Junnan Li
Boyang Albert Li
Silvio Savarese
Guosheng Lin
MLLM
321
140
0
17 Oct 2022
MaXM: Towards Multilingual Visual Question Answering
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Soravit Changpinyo
Linting Xue
Michal Yarom
Ashish V. Thapliyal
Idan Szpektor
J. Amelot
Xi Chen
Radu Soricut
318
8
0
12 Sep 2022
Learning to Answer Visual Questions from Web Videos
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Antoine Yang
Antoine Miech
Josef Sivic
Ivan Laptev
Cordelia Schmid
ViT
431
42
0
10 May 2022
All You May Need for VQA are Image Captions
North American Chapter of the Association for Computational Linguistics (NAACL), 2022
Soravit Changpinyo
Doron Kukliansky
Idan Szpektor
Xi Chen
Nan Ding
Radu Soricut
300
85
0
04 May 2022
Improving Biomedical Information Retrieval with Neural Retrievers
AAAI Conference on Artificial Intelligence (AAAI), 2022
Man Luo
Arindam Mitra
Tejas Gokhale
Chitta Baral
314
42
0
19 Jan 2022
Zero-shot and Few-shot Learning with Knowledge Graphs: A Comprehensive Survey
Proceedings of the IEEE (Proc. IEEE), 2021
Jiaoyan Chen
Yuxia Geng
Zhuo Chen
Jeff Z. Pan
Yuan He
Wen Zhang
Ian Horrocks
Hua-zeng Chen
711
77
0
18 Dec 2021
Language bias in Visual Question Answering: A Survey and Taxonomy
Desen Yuan
267
18
0
16 Nov 2021
Unsupervised Natural Language Inference Using PHL Triplet Generation
Neeraj Varshney
Pratyay Banerjee
Tejas Gokhale
Chitta Baral
403
10
0
16 Oct 2021
Weakly-Supervised Visual-Retriever-Reader for Knowledge-based Question Answering
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Man Luo
Yankai Zeng
Pratyay Banerjee
Chitta Baral
RALM
337
94
0
09 Sep 2021
Weakly Supervised Relative Spatial Reasoning for Visual Question Answering
Pratyay Banerjee
Tejas Gokhale
Yezhou Yang
Chitta Baral
LRM
194
19
0
04 Sep 2021