1808.02632
Cited By
Question-Guided Hybrid Convolution for Visual Question Answering
8 August 2018
Peng Gao
Pan Lu
Hongsheng Li
Shuang Li
Yikang Li
S. Hoi
Xiaogang Wang
Papers citing
"Question-Guided Hybrid Convolution for Visual Question Answering"
16 / 16 papers shown
Title
Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering
Pan Lu
Swaroop Mishra
Tony Xia
Liang Qiu
Kai-Wei Chang
Song-Chun Zhu
Oyvind Tafjord
Peter Clark
A. Kalyan
ELM
ReLM
LRM
209
1,105
0
20 Sep 2022
DM²S²: Deep Multi-Modal Sequence Sets with Hierarchical Modality Attention
Shunsuke Kitada
Yuki Iwazaki
Riku Togashi
Hitoshi Iyatomi
13
1
0
07 Sep 2022
Recent, rapid advancement in visual question answering architecture: a review
V. Kodali
Daniel Berleant
27
9
0
02 Mar 2022
Bilateral Cross-Modality Graph Matching Attention for Feature Fusion in Visual Question Answering
Jianjian Cao
Xiameng Qin
Sanyuan Zhao
Jianbing Shen
23
20
0
14 Dec 2021
IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning
Pan Lu
Liang Qiu
Jiaqi Chen
Tony Xia
Yizhou Zhao
Wei Zhang
Zhou Yu
Xiaodan Liang
Song-Chun Zhu
AIMat
28
183
0
25 Oct 2021
Dynamic Graph Representation Learning for Video Dialog via Multi-Modal Shuffled Transformers
Shijie Geng
Peng Gao
Moitreya Chatterjee
Chiori Hori
Jonathan Le Roux
Yongfeng Zhang
Hongsheng Li
A. Cherian
19
11
0
08 Jul 2020
Character Matters: Video Story Understanding with Character-Aware Relations
Shijie Geng
Ji Zhang
Zuohui Fu
Peng Gao
Hang Zhang
Gerard de Melo
18
11
0
09 May 2020
Learning Depth-Guided Convolutions for Monocular 3D Object Detection
Mingyu Ding
Yuqi Huo
Hongwei Yi
Zhe Wang
Jianping Shi
Zhiwu Lu
Ping Luo
3DPC
17
311
0
10 Dec 2019
DualVD: An Adaptive Dual Encoding Model for Deep Visual Understanding in Visual Dialogue
X. Jiang
J. Yu
Zengchang Qin
Yingying Zhuang
Xingxing Zhang
Yue Hu
Qi Wu
15
70
0
17 Nov 2019
Cross Attention Network for Few-shot Classification
Rui Hou
Hong Chang
Bingpeng Ma
Shiguang Shan
Xilin Chen
202
629
0
17 Oct 2019
Improving Referring Expression Grounding with Cross-modal Attention-guided Erasing
Xihui Liu
Zihao W. Wang
Jing Shao
Xiaogang Wang
Hongsheng Li
ObjD
8
180
0
03 Mar 2019
Dynamic Fusion with Intra- and Inter- Modality Attention Flow for Visual Question Answering
Peng Gao
Zhengkai Jiang
Haoxuan You
Pan Lu
Steven C. H. Hoi
Xiaogang Wang
Hongsheng Li
AIMat
16
362
0
13 Dec 2018
PVNet: A Joint Convolutional Network of Point Cloud and Multi-View for 3D Shape Recognition
Haoxuan You
Yifan Feng
R. Ji
Yue Gao
3DPC
34
169
0
23 Aug 2018
R-VQA: Learning Visual Relation Facts with Semantic Attention for Visual Question Answering
Pan Lu
Lei Ji
Wei Zhang
Nan Duan
M. Zhou
Jianyong Wang
CoGe
9
79
0
24 May 2018
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
Akira Fukui
Dong Huk Park
Daylen Yang
Anna Rohrbach
Trevor Darrell
Marcus Rohrbach
144
1,464
0
06 Jun 2016
Learning Deep Representations of Fine-grained Visual Descriptions
Scott E. Reed
Zeynep Akata
Bernt Schiele
Honglak Lee
OCL
VLM
163
840
0
17 May 2016