arXiv:1808.02632
Question-Guided Hybrid Convolution for Visual Question Answering
8 August 2018
Shiyang Feng
Pan Lu
Jiaming Song
Shuang Li
Yikang Li
Guosheng Lin
Xiaogang Wang
Papers citing
"Question-Guided Hybrid Convolution for Visual Question Answering"
33 / 33 papers shown
Survey of Natural Language Processing for Education: Taxonomy, Systematic Review, and Future Trends
IEEE Transactions on Knowledge and Data Engineering (TKDE), 2024
Yunshi Lan
Xinyuan Li
Hanyue Du
Xuesong Lu
Ming Gao
Weining Qian
Aoying Zhou
371
12
0
15 Jan 2024
From Image to Language: A Critical Analysis of Visual Question Answering (VQA) Approaches, Challenges, and Opportunities
Information Fusion (Inf. Fusion), 2023
Md Farhan Ishmam
Md Sakib Hossain Shovon
M. F. Mridha
Nilanjan Dey
360
68
0
01 Nov 2023
Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering
Neural Information Processing Systems (NeurIPS), 2022
Pan Lu
Swaroop Mishra
Tony Xia
Liang Qiu
Kai-Wei Chang
Song-Chun Zhu
Oyvind Tafjord
Peter Clark
Ashwin Kalyan
ELM
ReLM
LRM
514
1,808
0
20 Sep 2022
DM^2S^2: Deep Multi-Modal Sequence Sets with Hierarchical Modality Attention
IEEE Access (IEEE Access), 2022
Shunsuke Kitada
Yuki Iwazaki
Riku Togashi
Hitoshi Iyatomi
253
2
0
07 Sep 2022
Recent, rapid advancement in visual question answering architecture: a review
IEEE International Conference on Electro/Information Technology (EIT), 2022
V. Kodali
Daniel Berleant
254
9
0
02 Mar 2022
Bilateral Cross-Modality Graph Matching Attention for Feature Fusion in Visual Question Answering
Jianjian Cao
Xiameng Qin
Sanyuan Zhao
Jianbing Shen
160
27
0
14 Dec 2021
IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning
Pan Lu
Liang Qiu
Jiaqi Chen
Tony Xia
Yizhou Zhao
Wei Zhang
Zhou Yu
Xiaodan Liang
Song-Chun Zhu
AIMat
344
254
0
25 Oct 2021
Towards Language-guided Visual Recognition via Dynamic Convolutions
Gen Luo
Weihao Ye
Xiaoshuai Sun
Yongjian Wu
Yue Gao
Rongrong Ji
ObjD
200
25
0
17 Oct 2021
Fast Convergence of DETR with Spatially Modulated Co-Attention
IEEE International Conference on Computer Vision (ICCV), 2021
Shiyang Feng
Minghang Zheng
Xiaogang Wang
Jifeng Dai
Jiaming Song
ViT
213
360
0
05 Aug 2021
Dynamic Neural Networks: A Survey
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Yizeng Han
Gao Huang
Shiji Song
Le Yang
Honghui Wang
Yulin Wang
3DH
AI4TS
AI4CE
371
786
0
09 Feb 2021
New Ideas and Trends in Deep Multimodal Content Understanding: A Review
Neurocomputing (Neurocomputing), 2020
Wei Chen
Weiping Wang
Tianpeng Liu
M. Lew
VLM
297
35
0
16 Oct 2020
Contrastive Visual-Linguistic Pretraining
Lei Shi
Kai Shuang
Shijie Geng
Peng Su
Zhengkai Jiang
Shiyang Feng
Zuohui Fu
Gerard de Melo
Sen Su
VLM
SSL
CLIP
144
29
0
26 Jul 2020
Dynamic Graph Representation Learning for Video Dialog via Multi-Modal Shuffled Transformers
Shijie Geng
Shiyang Feng
Moitreya Chatterjee
Chiori Hori
Jonathan Le Roux
Zelong Li
Jiaming Song
A. Cherian
202
11
0
08 Jul 2020
Location Sensitive Image Retrieval and Tagging
Raul Gomez
J. Gibert
Lluís Gómez
Dimosthenis Karatzas
211
4
0
07 Jul 2020
Extreme Low-Light Imaging with Multi-granulation Cooperative Networks
Keqi Wang
Shiyang Feng
Guosheng Lin
Qian Guo
Y. Qian
127
4
0
16 May 2020
Character Matters: Video Story Understanding with Character-Aware Relations
Shijie Geng
Ji Zhang
Zuohui Fu
Shiyang Feng
Hang Zhang
Gerard de Melo
172
11
0
09 May 2020
Modulating Bottom-Up and Top-Down Visual Processing via Language-Conditional Filters
İlker Kesen
Ozan Arkan Can
Erkut Erdem
Aykut Erdem
Deniz Yuret
VLM
145
2
0
28 Mar 2020
Multi-Layer Content Interaction Through Quaternion Product For Visual Question Answering
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
Lei Shi
Shijie Geng
Kai Shuang
Chiori Hori
Songxiang Liu
Shiyang Feng
Sen Su
229
12
0
03 Jan 2020
Learning Depth-Guided Convolutions for Monocular 3D Object Detection
Mingyu Ding
Yuqi Huo
Hongwei Yi
Zhe Wang
Jianping Shi
Zhiwu Lu
Ping Luo
3DPC
211
351
0
10 Dec 2019
DualVD: An Adaptive Dual Encoding Model for Deep Visual Understanding in Visual Dialogue
AAAI Conference on Artificial Intelligence (AAAI), 2019
X. Jiang
Jiahao Yu
Zengchang Qin
Yingying Zhuang
Xingxing Zhang
Yue Hu
Qi Wu
159
70
0
17 Nov 2019
Cross Attention Network for Few-shot Classification
Neural Information Processing Systems (NeurIPS), 2019
Rui Hou
Hong Chang
Bingpeng Ma
Shiguang Shan
Xilin Chen
406
736
0
17 Oct 2019
Exploring Hate Speech Detection in Multimodal Publications
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2019
Raul Gomez
J. Gibert
Lluís Gómez
Dimosthenis Karatzas
128
264
0
09 Oct 2019
Multi-modality Latent Interaction Network for Visual Question Answering
IEEE International Conference on Computer Vision (ICCV), 2019
Shiyang Feng
Haoxuan You
Zhanpeng Zhang
Xiaogang Wang
Jiaming Song
139
85
0
10 Aug 2019
Language-Conditioned Graph Networks for Relational Reasoning
IEEE International Conference on Computer Vision (ICCV), 2019
Ronghang Hu
Anna Rohrbach
Trevor Darrell
Kate Saenko
170
182
0
10 May 2019
Question Guided Modular Routing Networks for Visual Question Answering
Yanze Wu
Qiang Sun
Jianqi Ma
Bin Li
Yanwei Fu
Yao Peng
Xiangyang Xue
201
2
0
17 Apr 2019
Improving Referring Expression Grounding with Cross-modal Attention-guided Erasing
Computer Vision and Pattern Recognition (CVPR), 2019
Xihui Liu
Zihao Wang
Jing Shao
Xiaogang Wang
Jiaming Song
ObjD
252
211
0
03 Mar 2019
FishNet: A Versatile Backbone for Image, Region, and Pixel Level Prediction
Shuyang Sun
Jiangmiao Pang
Jianping Shi
Shuai Yi
Wanli Ouyang
195
102
0
11 Jan 2019
A^2-Net: Molecular Structure Estimation from Cryo-EM Density Volumes
Kui Xu
Zhe Wang
Jianping Shi
Jiaming Song
Q. Zhang
3DV
148
48
0
03 Jan 2019
Dynamic Fusion with Intra- and Inter- Modality Attention Flow for Visual Question Answering
Shiyang Feng
Zhengkai Jiang
Haoxuan You
Pan Lu
Steven C. H. Hoi
Xiaogang Wang
Jiaming Song
AIMat
408
393
0
13 Dec 2018
PVRNet: Point-View Relation Neural Network for 3D Shape Recognition
Haoxuan You
Yifan Feng
Xibin Zhao
C. Zou
Rongrong Ji
Yue Gao
3DPC
118
72
0
02 Dec 2018
VQA with no questions-answers training
Computer Vision and Pattern Recognition (CVPR), 2018
B. Vatashsky
S. Ullman
200
13
0
20 Nov 2018
PVNet: A Joint Convolutional Network of Point Cloud and Multi-View for 3D Shape Recognition
Haoxuan You
Yifan Feng
Rongrong Ji
Yue Gao
3DPC
226
182
0
23 Aug 2018
R-VQA: Learning Visual Relation Facts with Semantic Attention for Visual Question Answering
Pan Lu
Lei Ji
Wei Zhang
Nan Duan
M. Zhou
Jianyong Wang
CoGe
104
81
0
24 May 2018