arXiv:1708.04686
VQS: Linking Segmentations to Questions and Answers for Supervised Attention in VQA and Question-Focused Semantic Segmentation
15 August 2017
Chuang Gan
Yandong Li
Haoxiang Li
Chen Sun
Boqing Gong
Papers citing "VQS: Linking Segmentations to Questions and Answers for Supervised Attention in VQA and Question-Focused Semantic Segmentation"
24 / 24 papers shown
Scene-Text Grounding for Text-Based Video Question Answering
Sheng Zhou
Junbin Xiao
Xun Yang
Peipei Song
Dan Guo
Angela Yao
Meng Wang
Tat-Seng Chua
128
1
0
22 Sep 2024
STAR: A Benchmark for Situated Reasoning in Real-World Videos
Bo Wu
Shoubin Yu
Zhenfang Chen
Joshua B Tenenbaum
Chuang Gan
33
176
0
15 May 2024
TextPSG: Panoptic Scene Graph Generation from Textual Descriptions
Chengyang Zhao
Yikang Shen
Zhenfang Chen
Mingyu Ding
Chuang Gan
46
15
0
10 Oct 2023
A Joint Study of Phrase Grounding and Task Performance in Vision and Language Models
Noriyuki Kojima
Hadar Averbuch-Elor
Yoav Artzi
21
2
0
06 Sep 2023
VisAlign: Dataset for Measuring the Degree of Alignment between AI and Humans in Visual Perception
Jiyoung Lee
Seung Wook Kim
Seunghyun Won
Joonseok Lee
Marzyeh Ghassemi
James Thorne
Jaeseok Choi
O.-Kil Kwon
E. Choi
22
1
0
03 Aug 2023
Referring Image Matting
Jizhizi Li
Jing Zhang
Dacheng Tao
ObjD
VLM
18
22
0
10 Jun 2022
Grounding Answers for Visual Questions Asked by Visually Impaired People
Chongyan Chen
Samreen Anjum
Danna Gurari
23
50
0
04 Feb 2022
PTR: A Benchmark for Part-based Conceptual, Relational, and Physical Reasoning
Yining Hong
Li Yi
J. Tenenbaum
Antonio Torralba
Chuang Gan
9
39
0
09 Dec 2021
Dynamic Visual Reasoning by Learning Differentiable Physics Models from Video and Language
Mingyu Ding
Zhenfang Chen
Tao Du
Ping Luo
J. Tenenbaum
Chuang Gan
VGen
PINN
OCL
24
74
0
28 Oct 2021
Multimodal Integration of Human-Like Attention in Visual Question Answering
Ekta Sood
Fabian Kögel
Philippe Muller
Dominike Thomas
Mihai Bâce
Andreas Bulling
33
16
0
27 Sep 2021
VQA-MHUG: A Gaze Dataset to Study Multimodal Neural Attention in Visual Question Answering
Ekta Sood
Fabian Kögel
Florian Strohm
Prajit Dhar
Andreas Bulling
29
19
0
27 Sep 2021
Discovering the Unknown Knowns: Turning Implicit Knowledge in the Dataset into Explicit Training Examples for Visual Question Answering
Jihyung Kil
Cheng Zhang
D. Xuan
Wei-Lun Chao
56
20
0
13 Sep 2021
Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing
Pengfei Liu
Weizhe Yuan
Jinlan Fu
Zhengbao Jiang
Hiroaki Hayashi
Graham Neubig
VLM
SyDa
23
3,828
0
28 Jul 2021
Adventurer's Treasure Hunt: A Transparent System for Visually Grounded Compositional Visual Question Answering based on Scene Graphs
Daniel Reich
F. Putze
Tanja Schultz
22
2
0
28 Jun 2021
Grounding Physical Concepts of Objects and Events Through Dynamic Visual Reasoning
Zhenfang Chen
Jiayuan Mao
Jiajun Wu
Kwan-Yee Kenneth Wong
J. Tenenbaum
Chuang Gan
VGen
31
92
0
30 Mar 2021
On the Potential of Lexico-logical Alignments for Semantic Parsing to SQL Queries
Tianze Shi
Chen Zhao
Jordan L. Boyd-Graber
Hal Daumé
Lillian Lee
16
78
0
21 Oct 2020
Compact Trilinear Interaction for Visual Question Answering
Tuong Khanh Long Do
Thanh-Toan Do
Huy Tran
Erman Tjiputra
Quang-Dieu Tran
28
59
0
26 Sep 2019
A Fast and Accurate One-Stage Approach to Visual Grounding
Zhengyuan Yang
Boqing Gong
Liwei Wang
Wenbing Huang
Dong Yu
Jiebo Luo
ObjD
12
360
0
18 Aug 2019
Near-Optimal Glimpse Sequences for Improved Hard Attention Neural Network Training
William Harvey
Michael Teng
Frank D. Wood
18
4
0
13 Jun 2019
PVSS: A Progressive Vehicle Search System for Video Surveillance Networks
Xinchen Liu
Wu Liu
Huadong Ma
Shuangqun Li
14
8
0
10 Jan 2019
Semantic Aware Attention Based Deep Object Co-segmentation
Hong Chen
Yifei Huang
Hideki Nakayama
SSeg
11
73
0
16 Oct 2018
Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding
Kexin Yi
Jiajun Wu
Chuang Gan
Antonio Torralba
Pushmeet Kohli
J. Tenenbaum
NAI
32
595
0
04 Oct 2018
Faithful Multimodal Explanation for Visual Question Answering
Jialin Wu
Raymond J. Mooney
11
90
0
08 Sep 2018
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
Akira Fukui
Dong Huk Park
Daylen Yang
Anna Rohrbach
Trevor Darrell
Marcus Rohrbach
149
1,465
0
06 Jun 2016