Question-Guided Hybrid Convolution for Visual Question Answering

8 August 2018
Shiyang Feng
Pan Lu
Jiaming Song
Shuang Li
Yikang Li
Guosheng Lin
Xiaogang Wang
arXiv: 1808.02632

Papers citing "Question-Guided Hybrid Convolution for Visual Question Answering"

33 / 33 papers shown
Survey of Natural Language Processing for Education: Taxonomy, Systematic Review, and Future Trends
IEEE Transactions on Knowledge and Data Engineering (TKDE), 2024
Yunshi Lan
Xinyuan Li
Hanyue Du
Xuesong Lu
Ming Gao
Weining Qian
Aoying Zhou
371
12
0
15 Jan 2024
From Image to Language: A Critical Analysis of Visual Question Answering (VQA) Approaches, Challenges, and Opportunities
Information Fusion (Inf. Fusion), 2023
Md Farhan Ishmam
Md Sakib Hossain Shovon
M. F. Mridha
Nilanjan Dey
360
68
0
01 Nov 2023
Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering
Neural Information Processing Systems (NeurIPS), 2022
Pan Lu
Swaroop Mishra
Tony Xia
Liang Qiu
Kai-Wei Chang
Song-Chun Zhu
Oyvind Tafjord
Peter Clark
Ashwin Kalyan
ELM, ReLM, LRM
514
1,808
0
20 Sep 2022
DM$^2$S$^2$: Deep Multi-Modal Sequence Sets with Hierarchical Modality Attention
IEEE Access (IEEE Access), 2022
Shunsuke Kitada
Yuki Iwazaki
Riku Togashi
Hitoshi Iyatomi
253
2
0
07 Sep 2022
Recent, rapid advancement in visual question answering architecture: a review
IEEE International Conference on Electro/Information Technology (EIT), 2022
V. Kodali
Daniel Berleant
254
9
0
02 Mar 2022
Bilateral Cross-Modality Graph Matching Attention for Feature Fusion in Visual Question Answering
Jianjian Cao
Xiameng Qin
Sanyuan Zhao
Jianbing Shen
160
27
0
14 Dec 2021
IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning
Pan Lu
Liang Qiu
Jiaqi Chen
Tony Xia
Yizhou Zhao
Wei Zhang
Zhou Yu
Xiaodan Liang
Song-Chun Zhu
AIMat
344
254
0
25 Oct 2021
Towards Language-guided Visual Recognition via Dynamic Convolutions
Gen Luo
Weihao Ye
Xiaoshuai Sun
Yongjian Wu
Yue Gao
Rongrong Ji
ObjD
200
25
0
17 Oct 2021
Fast Convergence of DETR with Spatially Modulated Co-Attention
IEEE International Conference on Computer Vision (ICCV), 2021
Shiyang Feng
Minghang Zheng
Xiaogang Wang
Jifeng Dai
Jiaming Song
ViT
213
360
0
05 Aug 2021
Dynamic Neural Networks: A Survey
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Yizeng Han
Gao Huang
Shiji Song
Le Yang
Honghui Wang
Yulin Wang
3DH, AI4TS, AI4CE
371
786
0
09 Feb 2021
New Ideas and Trends in Deep Multimodal Content Understanding: A Review
Neurocomputing (Neurocomputing), 2020
Wei Chen
Weiping Wang
Tianpeng Liu
M. Lew
VLM
297
35
0
16 Oct 2020
Contrastive Visual-Linguistic Pretraining
Lei Shi
Kai Shuang
Shijie Geng
Peng Su
Zhengkai Jiang
Shiyang Feng
Zuohui Fu
Gerard de Melo
Sen Su
VLM, SSL, CLIP
144
29
0
26 Jul 2020
Dynamic Graph Representation Learning for Video Dialog via Multi-Modal Shuffled Transformers
Shijie Geng
Shiyang Feng
Moitreya Chatterjee
Chiori Hori
Jonathan Le Roux
Zelong Li
Jiaming Song
A. Cherian
202
11
0
08 Jul 2020
Location Sensitive Image Retrieval and Tagging
Raul Gomez
J. Gibert
Lluís Gómez
Dimosthenis Karatzas
211
4
0
07 Jul 2020
Extreme Low-Light Imaging with Multi-granulation Cooperative Networks
Keqi Wang
Shiyang Feng
Guosheng Lin
Qian Guo
Y. Qian
127
4
0
16 May 2020
Character Matters: Video Story Understanding with Character-Aware Relations
Shijie Geng
Ji Zhang
Zuohui Fu
Shiyang Feng
Hang Zhang
Gerard de Melo
172
11
0
09 May 2020
Modulating Bottom-Up and Top-Down Visual Processing via Language-Conditional Filters
İlker Kesen
Ozan Arkan Can
Erkut Erdem
Aykut Erdem
Deniz Yuret
VLM
145
2
0
28 Mar 2020
Multi-Layer Content Interaction Through Quaternion Product For Visual Question Answering
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
Lei Shi
Shijie Geng
Kai Shuang
Chiori Hori
Songxiang Liu
Shiyang Feng
Sen Su
229
12
0
03 Jan 2020
Learning Depth-Guided Convolutions for Monocular 3D Object Detection
Mingyu Ding
Yuqi Huo
Hongwei Yi
Zhe Wang
Jianping Shi
Zhiwu Lu
Ping Luo
3DPC
211
351
0
10 Dec 2019
DualVD: An Adaptive Dual Encoding Model for Deep Visual Understanding in Visual Dialogue
AAAI Conference on Artificial Intelligence (AAAI), 2019
X. Jiang
Jiahao Yu
Zengchang Qin
Yingying Zhuang
Xingxing Zhang
Yue Hu
Qi Wu
159
70
0
17 Nov 2019
Cross Attention Network for Few-shot Classification
Neural Information Processing Systems (NeurIPS), 2019
Rui Hou
Hong Chang
Bingpeng Ma
Shiguang Shan
Xilin Chen
406
736
0
17 Oct 2019
Exploring Hate Speech Detection in Multimodal Publications
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2019
Raul Gomez
J. Gibert
Lluís Gómez
Dimosthenis Karatzas
128
264
0
09 Oct 2019
Multi-modality Latent Interaction Network for Visual Question Answering
IEEE International Conference on Computer Vision (ICCV), 2019
Shiyang Feng
Haoxuan You
Zhanpeng Zhang
Xiaogang Wang
Jiaming Song
139
85
0
10 Aug 2019
Language-Conditioned Graph Networks for Relational Reasoning
IEEE International Conference on Computer Vision (ICCV), 2019
Ronghang Hu
Anna Rohrbach
Trevor Darrell
Kate Saenko
170
182
0
10 May 2019
Question Guided Modular Routing Networks for Visual Question Answering
Yanze Wu
Qiang Sun
Jianqi Ma
Bin Li
Yanwei Fu
Yao Peng
Xiangyang Xue
201
2
0
17 Apr 2019
Improving Referring Expression Grounding with Cross-modal Attention-guided Erasing
Computer Vision and Pattern Recognition (CVPR), 2019
Xihui Liu
Zihao Wang
Jing Shao
Xiaogang Wang
Jiaming Song
ObjD
252
211
0
03 Mar 2019
FishNet: A Versatile Backbone for Image, Region, and Pixel Level Prediction
Shuyang Sun
Jiangmiao Pang
Jianping Shi
Shuai Yi
Wanli Ouyang
195
102
0
11 Jan 2019
A^2-Net: Molecular Structure Estimation from Cryo-EM Density Volumes
Kui Xu
Zhe Wang
Jianping Shi
Jiaming Song
Q. Zhang
3DV
148
48
0
03 Jan 2019
Dynamic Fusion with Intra- and Inter- Modality Attention Flow for Visual Question Answering
Shiyang Feng
Zhengkai Jiang
Haoxuan You
Pan Lu
Steven C. H. Hoi
Xiaogang Wang
Jiaming Song
AIMat
408
393
0
13 Dec 2018
PVRNet: Point-View Relation Neural Network for 3D Shape Recognition
Haoxuan You
Yifan Feng
Xibin Zhao
C. Zou
Rongrong Ji
Yue Gao
3DPC
118
72
0
02 Dec 2018
VQA with no questions-answers training
Computer Vision and Pattern Recognition (CVPR), 2018
B. Vatashsky
S. Ullman
200
13
0
20 Nov 2018
PVNet: A Joint Convolutional Network of Point Cloud and Multi-View for 3D Shape Recognition
Haoxuan You
Yifan Feng
Rongrong Ji
Yue Gao
3DPC
226
182
0
23 Aug 2018
R-VQA: Learning Visual Relation Facts with Semantic Attention for Visual Question Answering
Pan Lu
Lei Ji
Wei Zhang
Nan Duan
M. Zhou
Jianyong Wang
CoGe
104
81
0
24 May 2018