ResearchTrend.AI
Grounding Answers for Visual Questions Asked by Visually Impaired People

4 February 2022
Chongyan Chen
Samreen Anjum
Danna Gurari

Papers citing "Grounding Answers for Visual Questions Asked by Visually Impaired People"

37 / 37 papers shown
Title
RefChartQA: Grounding Visual Answer on Chart Images through Instruction Tuning
Alexander Vogel
Omar Moured
Yufan Chen
Jiaming Zhang
Rainer Stiefelhagen
35
0
0
29 Mar 2025
Survey of Adversarial Robustness in Multimodal Large Language Models
Chengze Jiang
Zhuangzhuang Wang
Minjing Dong
Jie Gui
AAML
58
0
0
18 Mar 2025
Accounting for Focus Ambiguity in Visual Questions
Chongyan Chen
Yu-Yun Tseng
Zhuoheng Li
Anush Venkatesh
Danna Gurari
31
0
0
04 Jan 2025
Towards Visual Grounding: A Survey
Linhui Xiao
Xiaoshan Yang
X. Lan
Yaowei Wang
Changsheng Xu
ObjD
46
3
0
31 Dec 2024
Right this way: Can VLMs Guide Us to See More to Answer Questions?
Li Liu
Diji Yang
Sijia Zhong
Kalyana Suma Sree Tholeti
Lei Ding
Yi Zhang
Leilani H. Gilpin
31
2
0
01 Nov 2024
SimpsonsVQA: Enhancing Inquiry-Based Learning with a Tailored Dataset
Ngoc Dung Huynh
Mohamed Reda Bouadjenek
Sunil Aryal
Imran Razzak
Hakim Hacid
21
0
0
30 Oct 2024
MC-Bench: A Benchmark for Multi-Context Visual Grounding in the Era of MLLMs
Yunqiu Xu
Linchao Zhu
Yi Yang
23
3
0
16 Oct 2024
NaVIP: An Image-Centric Indoor Navigation Solution for Visually Impaired People
Jun Yu
Yifan Zhang
Badrinadh Aila
V. Namboodiri
28
1
0
08 Oct 2024
Scene-Text Grounding for Text-Based Video Question Answering
Sheng Zhou
Junbin Xiao
Xun Yang
Peipei Song
Dan Guo
Angela Yao
Meng Wang
Tat-Seng Chua
57
1
0
22 Sep 2024
BIV-Priv-Seg: Locating Private Content in Images Taken by People With Visual Impairments
Yu-Yun Tseng
Tanusree Sharma
Lotus Zhang
Abigale Stangl
Leah Findlater
Yang Wang
Danna Gurari
61
0
0
25 Jul 2024
A Survey of Attacks on Large Vision-Language Models: Resources, Advances, and Future Trends
Daizong Liu
Mingyu Yang
Xiaoye Qu
Pan Zhou
Yu Cheng
Wei Hu
ELM
AAML
30
24
0
10 Jul 2024
On the Role of Visual Grounding in VQA
Daniel Reich
Tanja Schultz
18
1
0
26 Jun 2024
OmniCount: Multi-label Object Counting with Semantic-Geometric Priors
Anindya Mondal
Sauradip Nag
Xiatian Zhu
Anjan Dutta
31
3
0
08 Mar 2024
GROUNDHOG: Grounding Large Language Models to Holistic Segmentation
Yichi Zhang
Ziqiao Ma
Xiaofeng Gao
Suhaila Shakiah
Qiaozi Gao
Joyce Chai
MLLM
VLM
30
38
0
26 Feb 2024
Convincing Rationales for Visual Question Answering Reasoning
Kun Li
G. Vosselman
Michael Ying Yang
34
1
0
06 Feb 2024
From Image to Language: A Critical Analysis of Visual Question Answering (VQA) Approaches, Challenges, and Opportunities
Md Farhan Ishmam
Md Sakib Hossain Shovon
M. F. Mridha
Nilanjan Dey
35
35
0
01 Nov 2023
Toloka Visual Question Answering Benchmark
Mert Pilanci
Nikita Pavlichenko
Sergey Koshelev
Daniil Likhobaba
Alisa Smirnova
16
4
0
28 Sep 2023
Sentence Attention Blocks for Answer Grounding
Seyedalireza Khoshsirat
Chandra Kambhamettu
29
7
0
20 Sep 2023
Interpretable Visual Question Answering via Reasoning Supervision
Maria Parelli
Dimitrios Mallis
Markos Diomataris
Vassilis Pitsikalis
LRM
22
2
0
07 Sep 2023
A Survey on Interpretable Cross-modal Reasoning
Dizhan Xue
Shengsheng Qian
Zuyi Zhou
Changsheng Xu
LRM
24
4
0
05 Sep 2023
Can I Trust Your Answer? Visually Grounded Video Question Answering
Junbin Xiao
Angela Yao
Yicong Li
Tat-Seng Chua
25
46
0
04 Sep 2023
VQA Therapy: Exploring Answer Differences by Visually Grounding Answers
Chongyan Chen
Samreen Anjum
Danna Gurari
15
10
0
21 Aug 2023
An Outlook into the Future of Egocentric Vision
Chiara Plizzari
Gabriele Goletto
Antonino Furnari
Siddhant Bansal
Francesco Ragusa
G. Farinella
Dima Damen
Tatiana Tommasi
EgoV
30
37
0
14 Aug 2023
LOIS: Looking Out of Instance Semantics for Visual Question Answering
Siyu Zhang
Ye Chen
Yaoru Sun
Fang Wang
Haibo Shi
Haoran Wang
20
4
0
26 Jul 2023
Dealing with Semantic Underspecification in Multimodal NLP
Sandro Pezzelle
14
9
0
08 Jun 2023
Adaptive loose optimization for robust question answering
Jie Ma
Pinghui Wang
Ze-you Wang
Dechen Kong
Min Hu
Tingxu Han
Jun Liu
OOD
27
4
0
06 May 2023
Quality-agnostic Image Captioning to Safely Assist People with Vision Impairment
Lu Yu
Malvina Nikandrou
Jiali Jin
Verena Rieser
32
5
0
28 Apr 2023
Logical Implications for Visual Question Answering Consistency
Sergio Tascon-Morales
Pablo Márquez-Neila
Raphael Sznitman
13
9
0
16 Mar 2023
Toward Unsupervised Realistic Visual Question Answering
Yuwei Zhang
Chih-Hui Ho
Nuno Vasconcelos
CoGe
14
2
0
09 Mar 2023
Knowledge-Based Counterfactual Queries for Visual Question Answering
Theodoti Stoikou
Maria Lymperaiou
Giorgos Stamou
AAML
13
1
0
05 Mar 2023
Salient Object Detection for Images Taken by People With Vision Impairments
Jarek Reynolds
Chandra Kanth Nagesh
Danna Gurari
22
10
0
12 Jan 2023
Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding
Kenton Lee
Mandar Joshi
Iulia Turc
Hexiang Hu
Fangyu Liu
Julian Martin Eisenschlos
Urvashi Khandelwal
Peter Shaw
Ming-Wei Chang
Kristina Toutanova
CLIP
VLM
148
259
0
07 Oct 2022
EPIC-KITCHENS VISOR Benchmark: VIdeo Segmentations and Object Relations
Ahmad Darkhalil
Dandan Shan
Bin Zhu
Jian Ma
Amlan Kar
Richard E. L. Higgins
Sanja Fidler
David Fouhey
Dima Damen
VOS
37
98
0
26 Sep 2022
VizWiz-FewShot: Locating Objects in Images Taken by People With Visual Impairments
Yu-Yun Tseng
Alexander Bell
Danna Gurari
14
8
0
24 Jul 2022
Tell Me the Evidence? Dual Visual-Linguistic Interaction for Answer Grounding
Junwen Pan
Guanlin Chen
Yi Liu
Jiexiang Wang
Chengqi Bian
Pengfei Zhu
Zhicheng Zhang
8
2
0
21 Jun 2022
Unified-IO: A Unified Model for Vision, Language, and Multi-Modal Tasks
Jiasen Lu
Christopher Clark
Rowan Zellers
Roozbeh Mottaghi
Aniruddha Kembhavi
ObjD
VLM
MLLM
31
391
0
17 Jun 2022
Assessing Image Quality Issues for Real-World Problems
Tai-Yin Chiu
Yinan Zhao
Danna Gurari
49
54
0
27 Mar 2020