Taking a HINT: Leveraging Explanations to Make Vision and Language Models More Grounded

IEEE International Conference on Computer Vision (ICCV), 2019
11 February 2019
Ramprasaath R. Selvaraju
Stefan Lee
Yilin Shen
Hongxia Jin
Shalini Ghosh
Larry Heck
Dhruv Batra
Devi Parikh
FAtt, VLM

Papers citing "Taking a HINT: Leveraging Explanations to Make Vision and Language Models More Grounded"

49 / 149 papers shown
AdaVQA: Overcoming Language Priors with Adapted Margin Cosine Loss
International Joint Conference on Artificial Intelligence (IJCAI), 2021
Yangyang Guo
Liqiang Nie
Zhiyong Cheng
Feng Ji
Ji Zhang
Marco Bertini
137
41
0
05 May 2021
Improved and efficient inter-vehicle distance estimation using road gradients of both ego and target vehicles
International Conference on Autonomic and Autonomous Systems (ICAAS), 2021
Robik Shrestha
Jinkyu Lee
Kushal Kafle
S. Hwang
Il Yong Chun
144
1
0
01 Apr 2021
Towards Interpreting and Mitigating Shortcut Learning Behavior of NLU Models
North American Chapter of the Association for Computational Linguistics (NAACL), 2021
Mengnan Du
Varun Manjunatha
R. Jain
Ruchi Deshpande
Franck Dernoncourt
Jiuxiang Gu
Tong Sun
Helen Zhou
271
116
0
11 Mar 2021
Detecting Spurious Correlations with Sanity Tests for Artificial Intelligence Guided Radiology Systems
Frontiers in Digital Health (FDH), 2021
U. Mahmood
Robik Shrestha
D. Bates
L. Mannelli
G. Corrias
Y. Erdi
Christopher Kanan
171
19
0
04 Mar 2021
EnD: Entangling and Disentangling deep representations for bias correction
Computer Vision and Pattern Recognition (CVPR), 2021
Enzo Tartaglione
C. Barbano
Marco Grangetto
283
137
0
02 Mar 2021
When Can Models Learn From Explanations? A Formal Framework for Understanding the Roles of Explanation Data
Peter Hase
Joey Tianyi Zhou
XAI
397
91
0
03 Feb 2021
Answer Questions with Right Image Regions: A Visual Attention Regularization Approach
Zichen Liu
Yangyang Guo
Jianhua Yin
Xuemeng Song
Weifeng Liu
Liqiang Nie
165
34
0
03 Feb 2021
Object-Centric Diagnosis of Visual Reasoning
Jianwei Yang
Jiayuan Mao
Jiajun Wu
Devi Parikh
David D. Cox
J. Tenenbaum
Chuang Gan
OCL
182
17
0
21 Dec 2020
Learning content and context with language bias for Visual Question Answering
IEEE International Conference on Multimedia and Expo (ICME), 2020
Chao Yang
Su Feng
Dongsheng Li
Huawei Shen
Guoqing Wang
Bin Jiang
148
24
0
21 Dec 2020
Overcoming Language Priors with Self-supervised Learning for Visual Question Answering
International Joint Conference on Artificial Intelligence (IJCAI), 2020
Xi Zhu
Zhendong Mao
Chunxiao Liu
Peng Zhang
Bin Wang
Yongdong Zhang
SSL
163
132
0
17 Dec 2020
A Closer Look at the Robustness of Vision-and-Language Pre-trained Models
Linjie Li
Zhe Gan
Jingjing Liu
VLM
241
50
0
15 Dec 2020
Debiased-CAM to mitigate image perturbations with faithful visual explanations of machine learning
International Conference on Human Factors in Computing Systems (CHI), 2020
Wencan Zhang
Mariella Dimiccoli
Brian Y. Lim
FAtt
339
19
0
10 Dec 2020
CASTing Your Model: Learning to Localize Improves Self-Supervised Representations
Ramprasaath R. Selvaraju
Karan Desai
Justin Johnson
Nikhil Naik
SSL
157
84
0
08 Dec 2020
ProtoPShare: Prototype Sharing for Interpretable Image Classification and Similarity Discovery
Knowledge Discovery and Data Mining (KDD), 2020
Dawid Rymarczyk
Lukasz Struski
Jacek Tabor
Bartosz Zieliński
203
135
0
29 Nov 2020
Right for the Right Concept: Revising Neuro-Symbolic Concepts by Interacting with their Explanations
Computer Vision and Pattern Recognition (CVPR), 2020
Wolfgang Stammer
P. Schramowski
Kristian Kersting
FAtt
492
126
0
25 Nov 2020
mForms : Multimodal Form-Filling with Question Answering
International Conference on Language Resources and Evaluation (LREC), 2020
Larry Heck
S. Heck
Anirudh S. Sundar
330
7
0
24 Nov 2020
Loss re-scaling VQA: Revisiting the Language Prior Problem from a Class-imbalance View
IEEE Transactions on Image Processing (TIP), 2020
Yangyang Guo
Liqiang Nie
Zhiyong Cheng
Q. Tian
Min Zhang
327
79
0
30 Oct 2020
Removing Bias in Multi-modal Classifiers: Regularization by Maximizing Functional Entropies
Neural Information Processing Systems (NeurIPS), 2020
Itai Gat
Idan Schwartz
Alex Schwing
Tamir Hazan
238
98
0
21 Oct 2020
SOrT-ing VQA Models : Contrastive Gradient Learning for Improved Consistency
North American Chapter of the Association for Computational Linguistics (NAACL), 2020
Sameer Dharur
Purva Tendulkar
Dhruv Batra
Devi Parikh
Ramprasaath R. Selvaraju
139
2
0
20 Oct 2020
Room-Across-Room: Multilingual Vision-and-Language Navigation with Dense Spatiotemporal Grounding
Alexander Ku
Peter Anderson
Roma Patel
Eugene Ie
Jason Baldridge
217
408
0
15 Oct 2020
Remembering for the Right Reasons: Explanations Reduce Catastrophic Forgetting
International Conference on Learning Representations (ICLR), 2020
Sayna Ebrahimi
Suzanne Petryk
Akash Gokul
William Gan
Joseph E. Gonzalez
Marcus Rohrbach
Trevor Darrell
CLL
218
51
0
04 Oct 2020
Trustworthy Convolutional Neural Networks: A Gradient Penalized-based Approach
Nicholas F Halliwell
Freddy Lecue
FAtt
203
9
0
29 Sep 2020
AiR: Attention with Reasoning Capability
European Conference on Computer Vision (ECCV), 2020
Shi Chen
Ming Jiang
Jinhui Yang
Qi Zhao
LRM
146
44
0
28 Jul 2020
Comprehensive Image Captioning via Scene Graph Decomposition
European Conference on Computer Vision (ECCV), 2020
Yiwu Zhong
Liwei Wang
Jianshu Chen
Dong Yu
Yin Li
223
137
0
23 Jul 2020
Reducing Language Biases in Visual Question Answering with Visually-Grounded Question Encoder
European Conference on Computer Vision (ECCV), 2020
K. Gouthaman
Anurag Mittal
348
88
0
13 Jul 2020
Improving VQA and its Explanations by Comparing Competing Explanations
Jialin Wu
Liyan Chen
Raymond J. Mooney
FAtt, AAML
206
18
0
28 Jun 2020
Overcoming Statistical Shortcuts for Open-ended Visual Counting
Corentin Dancette
Rémi Cadène
Xinlei Chen
Matthieu Cord
199
3
0
17 Jun 2020
Estimating semantic structure for the VQA answer space
Corentin Kervadec
G. Antipov
M. Baccouche
Christian Wolf
171
5
0
10 Jun 2020
Roses Are Red, Violets Are Blue... but Should VQA Expect Them To?
Corentin Kervadec
G. Antipov
M. Baccouche
Christian Wolf
OOD
260
99
0
09 Jun 2020
Counterfactual VQA: A Cause-Effect Look at Language Bias
Yulei Niu
Kaihua Tang
Hanwang Zhang
Zhiwu Lu
Xiansheng Hua
Ji-Rong Wen
CML
490
476
0
08 Jun 2020
Hierarchical Class-Based Curriculum Loss
Palash Goyal
Shalini Ghosh
116
9
0
05 Jun 2020
On the Value of Out-of-Distribution Testing: An Example of Goodhart's Law
Damien Teney
Kushal Kafle
Robik Shrestha
Ehsan Abbasnejad
Christopher Kanan
Anton Van Den Hengel
OODD, OOD
226
153
0
19 May 2020
Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision
European Conference on Computer Vision (ECCV), 2020
Damien Teney
Ehsan Abbasnejad
Anton Van Den Hengel
OOD, SSL, CML
203
125
0
20 Apr 2020
Visual Grounding Methods for VQA are Working for the Wrong Reasons!
Robik Shrestha
Kushal Kafle
Christopher Kanan
CML
266
34
0
12 Apr 2020
Egoshots, an ego-vision life-logging dataset and semantic fidelity metric to evaluate diversity in image captioning models
International Conference on Learning Representations (ICLR), 2020
Pranav Agarwal
Alejandro Betancourt
V. Panagiotou
Natalia Díaz Rodríguez
EGVM
213
11
0
26 Mar 2020
Counterfactual Samples Synthesizing for Robust Visual Question Answering
Computer Vision and Pattern Recognition (CVPR), 2020
Long Chen
Xin Yan
Jun Xiao
Hanwang Zhang
Shiliang Pu
Yueting Zhuang
OOD, AAML
351
318
0
14 Mar 2020
Explainable Deep Classification Models for Domain Generalization
Andrea Zunino
Sarah Adel Bargal
Riccardo Volpi
M. Sameki
Jianming Zhang
Stan Sclaroff
Vittorio Murino
Kate Saenko
FAtt
168
45
0
13 Mar 2020
Cross-modal Learning for Multi-modal Video Categorization
Palash Goyal
Saurabh Sahu
Shalini Ghosh
Chul Lee
250
10
0
07 Mar 2020
Exploiting Temporal Coherence for Multi-modal Video Categorization
Palash Goyal
Saurabh Sahu
Shalini Ghosh
Chul Lee
130
1
0
07 Feb 2020
SQuINTing at VQA Models: Introspecting VQA Models with Sub-Questions
Ramprasaath R. Selvaraju
Purva Tendulkar
Devi Parikh
Eric Horvitz
Marco Tulio Ribeiro
Besmira Nushi
Ece Kamar
LRM
152
14
0
20 Jan 2020
Making deep neural networks right for the right scientific reasons by interacting with their explanations
Nature Machine Intelligence (NMI), 2020
P. Schramowski
Wolfgang Stammer
Stefano Teso
Anna Brugger
Xiaoting Shao
Hans-Georg Luigs
Anne-Katrin Mahlein
Kristian Kersting
564
240
0
15 Jan 2020
Explain and Improve: LRP-Inference Fine-Tuning for Image Captioning Models
Information Fusion (Inf. Fusion), 2020
Jiamei Sun
Sebastian Lapuschkin
Wojciech Samek
Alexander Binder
FAtt
571
36
0
04 Jan 2020
Connecting Vision and Language with Localized Narratives
European Conference on Computer Vision (ECCV), 2019
Jordi Pont-Tuset
J. Uijlings
Soravit Changpinyo
Radu Soricut
V. Ferrari
ObjD
470
285
0
06 Dec 2019
Bilinear Graph Networks for Visual Question Answering
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2019
Dalu Guo
Chang Xu
Dacheng Tao
GNN
173
67
0
23 Jul 2019
Learning to Generate Grounded Visual Captions without Localization Supervision
Chih-Yao Ma
Yannis Kalantidis
Ghassan AlRegib
Peter Vajda
Marcus Rohrbach
Z. Kira
SSL
369
10
0
01 Jun 2019
Self-Critical Reasoning for Robust Visual Question Answering
Neural Information Processing Systems (NeurIPS), 2019
Jialin Wu
Raymond J. Mooney
OOD, NAI
221
170
0
24 May 2019
VQA with no questions-answers training
Computer Vision and Pattern Recognition (CVPR), 2018
B. Vatashsky
S. Ullman
208
13
0
20 Nov 2018
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering
Yash Goyal
Tejas Khot
D. Summers-Stay
Dhruv Batra
Devi Parikh
CoGe
1.1K
3,780
0
02 Dec 2016
Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization
International Journal of Computer Vision (IJCV), 2016
Ramprasaath R. Selvaraju
Michael Cogswell
Abhishek Das
Ramakrishna Vedantam
Devi Parikh
Dhruv Batra
FAtt
896
24,018
0
07 Oct 2016