All You May Need for VQA are Image Captions

North American Chapter of the Association for Computational Linguistics (NAACL), 2022
4 May 2022
Soravit Changpinyo
Doron Kukliansky
Idan Szpektor
Xi Chen
Nan Ding
Radu Soricut

Papers citing "All You May Need for VQA are Image Captions"

6 / 56 papers shown
Modal-specific Pseudo Query Generation for Video Corpus Moment Retrieval
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Minjoon Jung
Seongho Choi
Joo-Kyung Kim
Jin-Hwa Kim
Byoung-Tak Zhang
178
11
0
23 Oct 2022
Plug-and-Play VQA: Zero-shot VQA by Conjoining Large Pretrained Models with Zero Training
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
A. M. H. Tiong
Junnan Li
Boyang Albert Li
Silvio Savarese
Guosheng Lin
MLLM
217
126
0
17 Oct 2022
SQA3D: Situated Question Answering in 3D Scenes
International Conference on Learning Representations (ICLR), 2022
Xiaojian Ma
Silong Yong
Zilong Zheng
Qing Li
Yitao Liang
Song-Chun Zhu
Siyuan Huang
LM&Ro
382
234
0
14 Oct 2022
PaLI: A Jointly-Scaled Multilingual Language-Image Model
International Conference on Learning Representations (ICLR), 2022
Xi Chen
Tianlin Li
Soravit Changpinyo
A. Piergiovanni
Piotr Padlewski
...
Andreas Steiner
A. Angelova
Xiaohua Zhai
N. Houlsby
Radu Soricut
MLLM, VLM
614
890
0
14 Sep 2022
MaXM: Towards Multilingual Visual Question Answering
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Soravit Changpinyo
Linting Xue
Michal Yarom
Ashish V. Thapliyal
Idan Szpektor
J. Amelot
Xi Chen
Radu Soricut
205
8
0
12 Sep 2022
PACTran: PAC-Bayesian Metrics for Estimating the Transferability of Pretrained Models to Classification Tasks
European Conference on Computer Vision (ECCV), 2022
Nan Ding
Xi Chen
Tomer Levinboim
Soravit Changpinyo
Radu Soricut
177
34
0
10 Mar 2022