ResearchTrend.AI
Revisiting Visual Question Answering Baselines

27 June 2016
Allan Jabri
Armand Joulin
Laurens van der Maaten
    OOD
arXiv: 1606.08390

Papers citing "Revisiting Visual Question Answering Baselines"

18 / 18 papers shown
Representation Learning Preserving Ignorability and Covariate Matching for Treatment Effects
Praharsh Nanavati
Ranjitha Prasad
Karthikeyan Shanmugam
OOD
CML
66
0
0
29 Apr 2025
Building Trustworthy Multimodal AI: A Review of Fairness, Transparency, and Ethics in Vision-Language Tasks
Mohammad Saleh
Azadeh Tabatabaei
52
0
0
14 Apr 2025
Attention Mechanism based Cognition-level Scene Understanding
Xuejiao Tang
Tai Le Quy
LRM
23
0
0
17 Apr 2022
Going Beneath the Surface: Evaluating Image Captioning for Grammaticality, Truthfulness and Diversity
Huiyuan Xie
Tom Sherborne
A. Kuhnle
Ann A. Copestake
DiffM
17
9
0
19 Dec 2019
TAB-VCR: Tags and Attributes based Visual Commonsense Reasoning Baselines
Jingxiang Lin
Unnat Jain
A. Schwing
LRM
ReLM
26
9
0
31 Oct 2019
Invariant Risk Minimization
Martín Arjovsky
Léon Bottou
Ishaan Gulrajani
David Lopez-Paz
OOD
22
2,152
0
05 Jul 2019
A Simple Baseline for Audio-Visual Scene-Aware Dialog
Idan Schwartz
A. Schwing
Tamir Hazan
19
69
0
11 Apr 2019
Reasoning Visual Dialogs with Structural and Partial Observations
Zilong Zheng
Wenguan Wang
Siyuan Qi
Song-Chun Zhu
30
117
0
11 Apr 2019
Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding
Kexin Yi
Jiajun Wu
Chuang Gan
Antonio Torralba
Pushmeet Kohli
J. Tenenbaum
NAI
32
595
0
04 Oct 2018
Context-Aware Visual Policy Network for Sequence-Level Image Captioning
Daqing Liu
Zhengjun Zha
Hanwang Zhang
Yongdong Zhang
Feng Wu
CLIP
26
103
0
16 Aug 2018
Shuffle-Then-Assemble: Learning Object-Agnostic Visual Relationship Features
Xu Yang
Hanwang Zhang
Jianfei Cai
42
74
0
01 Aug 2018
Annotation Artifacts in Natural Language Inference Data
Suchin Gururangan
Swabha Swayamdipta
Omer Levy
Roy Schwartz
Samuel R. Bowman
Noah A. Smith
33
1,157
0
06 Mar 2018
Tell-and-Answer: Towards Explainable Visual Question Answering using Attributes and Captions
Qing Li
Jianlong Fu
D. Yu
Tao Mei
Jiebo Luo
FAtt
XAI
CoGe
46
60
0
27 Jan 2018
Learning Two-Branch Neural Networks for Image-Text Matching Tasks
Liwei Wang
Yin Li
Jing-ling Huang
Svetlana Lazebnik
VLM
27
494
0
11 Apr 2017
Visual Translation Embedding Network for Visual Relation Detection
Hanwang Zhang
Zawlin Kyaw
Shih-Fu Chang
Tat-Seng Chua
ViT
140
560
0
27 Feb 2017
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering
Yash Goyal
Tejas Khot
D. Summers-Stay
Dhruv Batra
Devi Parikh
CoGe
99
3,116
0
02 Dec 2016
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
Akira Fukui
Dong Huk Park
Daylen Yang
Anna Rohrbach
Trevor Darrell
Marcus Rohrbach
149
1,465
0
06 Jun 2016
Learning Models for Actions and Person-Object Interactions with Transfer to Question Answering
Arun Mallya
Svetlana Lazebnik
28
119
0
16 Apr 2016