arXiv:2012.02356
WeaQA: Weak Supervision via Captions for Visual Question Answering
Findings (Findings), 2020
4 December 2020
Pratyay Banerjee
Tejas Gokhale
Yezhou Yang
Chitta Baral
Papers citing "WeaQA: Weak Supervision via Captions for Visual Question Answering"
25 / 25 papers shown
SCRA-VQA: Summarized Caption-Rerank for Augmented Large Language Models in Visual Question Answering
Yan Zhang
Jiaqing Lin
Miao Zhang
Kui Xiao
Xiaoju Hou
Yue Zhao
Ruoyao Xiao
136
0
0
25 Sep 2025
When Big Models Train Small Ones: Label-Free Model Parity Alignment for Efficient Visual Question Answering using Small VLMs
A. S. Penamakuri
Navlika Singh
Piyush Arora
Anand Mishra
VLM
187
1
0
20 Sep 2025
SHAPE : Self-Improved Visual Preference Alignment by Iteratively Generating Holistic Winner
Kejia Chen
Jiawen Zhang
Jiacong Hu
Jiazhen Yang
Jian Lou
Zunlei Feng
Weilong Dai
389
2
0
06 Mar 2025
MedCoT: Medical Chain of Thought via Hierarchical Expert
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Jiaxiang Liu
Yuan Wang
Jiawei Du
Qiufeng Wang
Zuozhu Liu
LRM
601
60
0
18 Dec 2024
Which Viewpoint Shows it Best? Language for Weakly Supervising View Selection in Multi-view Instructional Videos
Computer Vision and Pattern Recognition (CVPR), 2024
Sagnik Majumder
Tushar Nagarajan
Ziad Al-Halah
Reina Pradhan
Kristen Grauman
506
0
0
13 Nov 2024
R-LLaVA: Improving Med-VQA Understanding through Visual Region of Interest
Xupeng Chen
Zhixin Lai
Kangrui Ruan
Shichu Chen
Jiaxiang Liu
Zuozhu Liu
871
23
0
27 Oct 2024
Multiscale Matching Driven by Cross-Modal Similarity Consistency for Audio-Text Retrieval
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Qian Wang
Jia-Chen Gu
Zhen-Hua Ling
275
5
0
15 Mar 2024
CIC: A Framework for Culturally-Aware Image Captioning
Youngsik Yun
Jihie Kim
VLM
555
11
0
08 Feb 2024
Improving Zero-shot Visual Question Answering via Large Language Models with Reasoning Question Prompts
ACM Multimedia (ACM MM), 2023
Yunshi Lan
Xiang Li
Xin Liu
Yang Li
Wei Qin
Weining Qian
LRM
ReLM
497
41
0
15 Nov 2023
Exploring Question Decomposition for Zero-Shot VQA
Neural Information Processing Systems (NeurIPS), 2023
Zaid Khan
B. Vijaykumar
S. Schulter
Manmohan Chandraker
Yun Fu
ReLM
261
20
0
25 Oct 2023
Tackling VQA with Pretrained Foundation Models without Further Training
Alvin De Jun Tan
Bingquan Shen
MLLM
239
2
0
27 Sep 2023
Towards Addressing the Misalignment of Object Proposal Evaluation for Vision-Language Tasks via Semantic Grounding
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Joshua Forster Feinglass
Yezhou Yang
239
2
0
01 Sep 2023
Weakly Supervised Visual Question Answer Generation
Charani Alampalle
Shamanthak Hegde
Soumya Jahagirdar
Shankar Gangisetty
217
0
0
11 Jun 2023
Q: How to Specialize Large Vision-Language Models to Data-Scarce VQA Tasks? A: Self-Train on Unlabeled Images!
Computer Vision and Pattern Recognition (CVPR), 2023
Zaid Khan
B. Vijaykumar
S. Schulter
Xiang Yu
Y. Fu
Manmohan Chandraker
VLM
MLLM
340
26
0
06 Jun 2023
From Images to Textual Prompts: Zero-shot VQA with Frozen Large Language Models
Computer Vision and Pattern Recognition (CVPR), 2022
Jiaxian Guo
Junnan Li
Dongxu Li
A. M. H. Tiong
Boyang Albert Li
Dacheng Tao
Steven C. H. Hoi
VLM
MLLM
560
174
0
21 Dec 2022
Plug-and-Play VQA: Zero-shot VQA by Conjoining Large Pretrained Models with Zero Training
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
A. M. H. Tiong
Junnan Li
Boyang Albert Li
Silvio Savarese
Guosheng Lin
MLLM
321
140
0
17 Oct 2022
MaXM: Towards Multilingual Visual Question Answering
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Soravit Changpinyo
Linting Xue
Michal Yarom
Ashish V. Thapliyal
Idan Szpektor
J. Amelot
Xi Chen
Radu Soricut
318
8
0
12 Sep 2022
Learning to Answer Visual Questions from Web Videos
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Antoine Yang
Antoine Miech
Josef Sivic
Ivan Laptev
Cordelia Schmid
ViT
431
42
0
10 May 2022
All You May Need for VQA are Image Captions
North American Chapter of the Association for Computational Linguistics (NAACL), 2022
Soravit Changpinyo
Doron Kukliansky
Idan Szpektor
Xi Chen
Nan Ding
Radu Soricut
300
85
0
04 May 2022
Improving Biomedical Information Retrieval with Neural Retrievers
AAAI Conference on Artificial Intelligence (AAAI), 2022
Man Luo
Arindam Mitra
Tejas Gokhale
Chitta Baral
314
42
0
19 Jan 2022
Zero-shot and Few-shot Learning with Knowledge Graphs: A Comprehensive Survey
Proceedings of the IEEE (Proc. IEEE), 2021
Jiaoyan Chen
Yuxia Geng
Zhuo Chen
Jeff Z. Pan
Yuan He
Wen Zhang
Ian Horrocks
Hua-zeng Chen
711
77
0
18 Dec 2021
Language bias in Visual Question Answering: A Survey and Taxonomy
Desen Yuan
267
18
0
16 Nov 2021
Unsupervised Natural Language Inference Using PHL Triplet Generation
Neeraj Varshney
Pratyay Banerjee
Tejas Gokhale
Chitta Baral
403
10
0
16 Oct 2021
Weakly-Supervised Visual-Retriever-Reader for Knowledge-based Question Answering
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Man Luo
Yankai Zeng
Pratyay Banerjee
Chitta Baral
RALM
337
94
0
09 Sep 2021
Weakly Supervised Relative Spatial Reasoning for Visual Question Answering
Pratyay Banerjee
Tejas Gokhale
Yezhou Yang
Chitta Baral
LRM
194
19
0
04 Sep 2021