Human Attention in Visual Question Answering: Do Humans and Deep Networks Look at the Same Regions?

11 June 2016
Abhishek Das, Harsh Agrawal, C. L. Zitnick, Devi Parikh, Dhruv Batra

Papers citing "Human Attention in Visual Question Answering: Do Humans and Deep Networks Look at the Same Regions?"

50 / 230 papers shown

What Makes for a Good Saliency Map? Comparing Strategies for Evaluating Saliency Maps in Explainable AI (XAI)
Felix Kares, Timo Speith, Hanwei Zhang, Markus Langer
FAtt, XAI
23 Apr 2025

Where do Large Vision-Language Models Look at when Answering Questions?
X. Xing, Chia-Wen Kuo, Li Fuxin, Yulei Niu, Fan Chen, Ming Li, Ying Wu, Longyin Wen, Sijie Zhu
LRM
18 Mar 2025

A Review of Multimodal Explainable Artificial Intelligence: Past, Present and Future
Shilin Sun, Wenbin An, Feng Tian, Fang Nan, Qidong Liu, J. Liu, N. Shah, Ping Chen
18 Dec 2024

A Comprehensive Survey on Visual Question Answering Datasets and Algorithms
Raihan Kabir, Naznin Haque, Md. Saiful Islam, Marium-E. Jannat
CoGe
17 Nov 2024

Eliminating the Language Bias for Visual Question Answering with fine-grained Causal Intervention
Ying Liu, Ge Bai, Chenji Lu, Shilong Li, Zhang Zhang, Ruifang Liu, Wenbin Guo
14 Oct 2024

VISTA: A Visual and Textual Attention Dataset for Interpreting Multimodal Models
Harshit, Tolga Tasdizen
CoGe, VLM
06 Oct 2024

On the Role of Visual Grounding in VQA
Daniel Reich, Tanja Schultz
26 Jun 2024

How Video Meetings Change Your Expression
Sumit Sarin, Utkarsh Mall, Purva Tendulkar, Carl Vondrick
CVBM
03 Jun 2024

Faithful Attention Explainer: Verbalizing Decisions Based on Discriminative Features
Yao Rong, David Scheerer, Enkelejda Kasneci
16 May 2024

Learning from Observer Gaze: Zero-Shot Attention Prediction Oriented by Human-Object Interaction Recognition
Yuchen Zhou, Linkai Liu, Chao Gou
16 May 2024

Demonstration of MaskSearch: Efficiently Querying Image Masks for Machine Learning Workflows
Lindsey Linxi Wei, Chung Yik Edward Yeung, Hongjian Yu, Jingchuan Zhou, Dong He, Magdalena Balazinska
OOD
09 Apr 2024

A Gaze-grounded Visual Question Answering Dataset for Clarifying Ambiguous Japanese Questions
Shun Inadumi, Seiya Kawano, Akishige Yuguchi, Yasutomo Kawanishi, Koichiro Yoshino
26 Mar 2024

ViSaRL: Visual Reinforcement Learning Guided by Human Saliency
Anthony Liang, Jesse Thomason, Erdem Biyik
16 Mar 2024

FocusCLIP: Multimodal Subject-Level Guidance for Zero-Shot Transfer in Human-Centric Tasks
Muhammad Gul Zain Ali Khan, Muhammad Ferjad Naeem, F. Tombari, Luc Van Gool, Didier Stricker, Muhammad Zeshan Afzal
VLM, CLIP
11 Mar 2024

Evaluating Webcam-based Gaze Data as an Alternative for Human Rationale Annotations
Stephanie Brandl, Oliver Eberle, Tiago F. R. Ribeiro, Anders Søgaard, Nora Hollenstein
29 Feb 2024

Trends, Applications, and Challenges in Human Attention Modelling
Giuseppe Cartella, Marcella Cornia, Vittorio Cuculo, Alessandro D’Amelio, Dario Zanca, Giuseppe Boccignone, Rita Cucchiara
28 Feb 2024

Convincing Rationales for Visual Question Answering Reasoning
Kun Li, G. Vosselman, Michael Ying Yang
06 Feb 2024

Describing Images Fast and Slow: Quantifying and Predicting the Variation in Human Signals during Visuo-Linguistic Processes
Ece Takmaz, Sandro Pezzelle, Raquel Fernández
02 Feb 2024

Uncovering the Full Potential of Visual Grounding Methods in VQA
Daniel Reich, Tanja Schultz
15 Jan 2024

Voila-A: Aligning Vision-Language Models with User's Gaze Attention
Kun Yan, Lei Ji, Zeyu Wang, Yuntao Wang, Nan Duan, Shuai Ma
22 Dec 2023

Interpretability is in the eye of the beholder: Human versus artificial classification of image segments generated by humans versus XAI
Romy Müller, Marius Thoss, Julian Ullrich, Steffen Seitz, Carsten Knoll
21 Nov 2023

PHD: Pixel-Based Language Modeling of Historical Documents
Nadav Borenstein, Phillip Rust, Desmond Elliott, Isabelle Augenstein
22 Oct 2023

A Joint Study of Phrase Grounding and Task Performance in Vision and Language Models
Noriyuki Kojima, Hadar Averbuch-Elor, Yoav Artzi
06 Sep 2023

VQA Therapy: Exploring Answer Differences by Visually Grounding Answers
Chongyan Chen, Samreen Anjum, Danna Gurari
21 Aug 2023

Do humans and Convolutional Neural Networks attend to similar areas during scene classification: Effects of task and image type
Romy Müller, Marcel Duerschmidt, Julian Ullrich, Carsten Knoll, Sascha Weber, Steffen Seitz
HAI
25 Jul 2023

Robust Visual Question Answering: Datasets, Methods, and Future Challenges
Jie Ma, Pinghui Wang, Dechen Kong, Zewei Wang, Jun Liu, Hongbin Pei, Junzhou Zhao
OOD
21 Jul 2023

Multimodal Explainable Artificial Intelligence: A Comprehensive Review of Methodological Advances and Future Research Directions
N. Rodis, Christos Sardianos, Panagiotis I. Radoglou-Grammatikis, Panagiotis G. Sarigiannidis, Iraklis Varlamis, Georgios Th. Papadopoulos
09 Jun 2023

Unveiling Cross Modality Bias in Visual Question Answering: A Causal View with Possible Worlds VQA
A. Vosoughi, Shijian Deng, Songyang Zhang, Yapeng Tian, Chenliang Xu, Jiebo Luo
CML
31 May 2023

HaVQA: A Dataset for Visual Question Answering and Multimodal Research in Hausa Language
Shantipriya Parida, Idris Abdulmumin, Shamsuddeen Hassan Muhammad, Aneesh Bose, Guneet Singh Kohli, I. Ahmad, Ketan Kotwal, S. Sarkar, Ondrej Bojar, Habeebah Adamu Kakudi
28 May 2023

Measuring Faithful and Plausible Visual Grounding in VQA
Daniel Reich, F. Putze, Tanja Schultz
24 May 2023

Dynamic Clue Bottlenecks: Towards Interpretable-by-Design Visual Question Answering
Xingyu Fu, Ben Zhou, Sihao Chen, Mark Yatskar, Dan Roth
LRM
24 May 2023

Visual Question Answering: A Survey on Techniques and Common Trends in Recent Literature
Ana Claudia Akemi Matsuki de Faria, Felype de Castro Bastos, Jose Victor Nogueira Alves da Silva, Vitor Lopes Fabris, Valeska Uchôa, Décio Gonçalves de Aguiar Neto, C. F. G. Santos
18 May 2023

MaskSearch: Querying Image Masks at Scale
Dong He, Jieyu Zhang, Maureen Daum, Alexander Ratner, Magdalena Balazinska
VLM
03 May 2023

Top-Down Visual Attention from Analysis by Synthesis
Baifeng Shi, Trevor Darrell, Xin Eric Wang
23 Mar 2023

Towards Trustable Skin Cancer Diagnosis via Rewriting Model's Decision
Siyuan Yan, Zhen Yu, Xuelin Zhang, Dwarikanath Mahapatra, Shekhar S. Chandra, Monika Janda, Peter Soyer, Z. Ge
02 Mar 2023

On The Coherence of Quantitative Evaluation of Visual Explanations
Benjamin Vandersmissen, José Oramas
XAI, FAtt
14 Feb 2023

Learning to Agree on Vision Attention for Visual Commonsense Reasoning
Zhenyang Li, Yangyang Guo, Ke-Jyun Wang, Fan Liu, Liqiang Nie, Mohan S. Kankanhalli
04 Feb 2023

Going Beyond XAI: A Systematic Survey for Explanation-Guided Learning
Yuyang Gao, Siyi Gu, Junji Jiang, S. Hong, Dazhou Yu, Liang Zhao
07 Dec 2022

Attribution-based XAI Methods in Computer Vision: A Review
Kumar Abhishek, Deeksha Kamath
27 Nov 2022

Simulating Human Gaze with Neural Visual Attention
Leo Schwinn, Doina Precup, Bjoern M. Eskofier, Dario Zanca
22 Nov 2022

Prophet Attention: Predicting Attention with Future Attention for Image Captioning
Fenglin Liu, Xuancheng Ren, Xian Wu, Wei Fan, Yuexian Zou, Xu Sun
19 Oct 2022

Vision+X: A Survey on Multimodal Learning in the Light of Data
Ye Zhu, Yuehua Wu, N. Sebe, Yan Yan
05 Oct 2022

Visual Perturbation-aware Collaborative Learning for Overcoming the Language Prior Problem
Yudong Han, Liqiang Nie, Jianhua Yin, Jianlong Wu, Yan Yan
24 Jul 2022

Is the U-Net Directional-Relationship Aware?
M. Riva, Pietro Gori, Florian Yger, Isabelle Bloch
06 Jul 2022

Enabling Harmonious Human-Machine Interaction with Visual-Context Augmented Dialogue System: A Review
Hao Wang, Bin Guo, Y. Zeng, Yasan Ding, Chen Qiu, Ying Zhang, Li Yao, Zhiwen Yu
02 Jul 2022

RES: A Robust Framework for Guiding Visual Explanation
Yuyang Gao, Tong Sun, Guangji Bai, Siyi Gu, S. Hong, Liang Zhao
FAtt, AAML, XAI
27 Jun 2022

VisFIS: Visual Feature Importance Supervision with Right-for-the-Right-Reason Objectives
Zhuofan Ying, Peter Hase, Mohit Bansal
LRM
22 Jun 2022

Guiding Visual Question Answering with Attention Priors
T. Le, Vuong Le, Sunil R. Gupta, Svetha Venkatesh, T. Tran
25 May 2022

Do Transformer Models Show Similar Attention Patterns to Task-Specific Human Gaze?
Stephanie Brandl, Oliver Eberle, Jonas Pilot, Anders Søgaard
25 Apr 2022

Attention in Reasoning: Dataset, Analysis, and Modeling
Shi Chen, Ming Jiang, Jinhui Yang, Qi Zhao
LRM
20 Apr 2022