arXiv:1708.01471

Multi-modal Factorized Bilinear Pooling with Co-Attention Learning for Visual Question Answering

4 August 2017
Zhou Yu
Jun-chen Yu
Jianping Fan
Dacheng Tao

Papers citing "Multi-modal Factorized Bilinear Pooling with Co-Attention Learning for Visual Question Answering"

50 / 214 papers shown
Fine-grained Image Classification and Retrieval by Combining Visual and Locally Pooled Textual Features
Andrés Mafla
S. Dey
Ali Furkan Biten
Lluís Gómez
Dimosthenis Karatzas
8
26
0
14 Jan 2020
In Defense of Grid Features for Visual Question Answering
Huaizu Jiang
Ishan Misra
Marcus Rohrbach
Erik Learned-Miller
Xinlei Chen
OOD
ObjD
21
318
0
10 Jan 2020
Low Rank Factorization for Compact Multi-Head Self-Attention
Sneha Mehta
Huzefa Rangwala
Naren Ramakrishnan
25
5
0
26 Nov 2019
Efficient Attention Mechanism for Visual Dialog that can Handle All the Interactions between Multiple Inputs
Van-Quang Nguyen
Masanori Suganuma
Takayuki Okatani
16
7
0
26 Nov 2019
Two Causal Principles for Improving Visual Dialog
Jiaxin Qi
Yulei Niu
Jianqiang Huang
Hanwang Zhang
CML
16
146
0
24 Nov 2019
Unsupervised Keyword Extraction for Full-sentence VQA
Kohei Uehara
Tatsuya Harada
14
1
0
23 Nov 2019
Keep it Consistent: Topic-Aware Storytelling from an Image Stream via Iterative Multi-agent Communication
Ruize Wang
Zhongyu Wei
Ying Cheng
Piji Li
Haijun Shan
Ji Zhang
Qi Zhang
Xuanjing Huang
VGen
DiffM
15
13
0
11 Nov 2019
Open-Ended Visual Question Answering by Multi-Modal Domain Adaptation
Yiming Xu
Lin Chen
Zhongwei Cheng
Lixin Duan
Jiebo Luo
OOD
24
24
0
11 Nov 2019
Multimodal Intelligence: Representation Learning, Information Fusion, and Applications
Chao Zhang
Zichao Yang
Xiaodong He
Li Deng
HAI
AI4TS
27
320
0
10 Nov 2019
Two-Headed Monster And Crossed Co-Attention Networks
Yaoyiran Li
Jing Jiang
19
0
0
10 Nov 2019
Low-Rank HOCA: Efficient High-Order Cross-Modal Attention for Video Captioning
Tao Jin
Siyu Huang
Yingming Li
Zhongfei Zhang
12
20
0
01 Nov 2019
TAB-VCR: Tags and Attributes based Visual Commonsense Reasoning Baselines
Jingxiang Lin
Unnat Jain
A. Schwing
LRM
ReLM
26
9
0
31 Oct 2019
Multi-modal Deep Analysis for Multimedia
Wenwu Zhu
Xin Eric Wang
Hongzhi Li
19
38
0
11 Oct 2019
Meta Module Network for Compositional Visual Reasoning
Wenhu Chen
Zhe Gan
Linjie Li
Yu Cheng
W. Wang
Jingjing Liu
LRM
17
68
0
08 Oct 2019
Compact Trilinear Interaction for Visual Question Answering
Tuong Khanh Long Do
Thanh-Toan Do
Huy Tran
Erman Tjiputra
Quang-Dieu Tran
28
59
0
26 Sep 2019
Explainable High-order Visual Question Reasoning: A New Benchmark and Knowledge-routed Network
Qingxing Cao
Bailin Li
Xiaodan Liang
Liang Lin
25
13
0
23 Sep 2019
Inverse Visual Question Answering with Multi-Level Attentions
Yaser Alwatter
Yuhong Guo
BDL
19
1
0
17 Sep 2019
Phrase Grounding by Soft-Label Chain Conditional Random Field
Jiacheng Liu
J. Hockenmaier
10
10
0
01 Sep 2019
Attention-based Fusion for Outfit Recommendation
Katrien Laenen
Marie-Francine Moens
CVBM
12
7
0
28 Aug 2019
Mobile Video Action Recognition
Yuqi Huo
Xiaoli Xu
Yao Lu
Yulei Niu
Zhiwu Lu
Ji-Rong Wen
17
14
0
27 Aug 2019
Zero-Shot Grounding of Objects from Natural Language Queries
Arka Sadhu
Kan Chen
Ram Nevatia
ObjD
28
156
0
20 Aug 2019
Mixed High-Order Attention Network for Person Re-Identification
Binghui Chen
Weihong Deng
Jiani Hu
CVBM
9
353
0
16 Aug 2019
Multimodal Unified Attention Networks for Vision-and-Language Interactions
Zhou Yu
Yuhao Cui
Jun Yu
Dacheng Tao
Q. Tian
19
38
0
12 Aug 2019
Bilinear Graph Networks for Visual Question Answering
Dalu Guo
Chang Xu
Dacheng Tao
GNN
27
50
0
23 Jul 2019
The Resale Price Prediction of Secondhand Jewelry Items Using a Multi-modal Deep Model with Iterative Co-Attention
Yusuke Yamaura
Nobuya Kanemaki
Y. Tsuboshita
10
3
0
01 Jul 2019
Deep Modular Co-Attention Networks for Visual Question Answering
Zhou Yu
Jun Yu
Yuhao Cui
Dacheng Tao
Q. Tian
13
796
0
25 Jun 2019
Audio-Visual Kinship Verification
Xiaoting Wu
Eric Granger
Xiaoyi Feng
CVBM
14
3
0
24 Jun 2019
Improving Visual Question Answering by Referring to Generated Paragraph Captions
Hyounghun Kim
Mohit Bansal
CoGe
11
20
0
14 Jun 2019
Relationship-Embedded Representation Learning for Grounding Referring Expressions
Sibei Yang
Guanbin Li
Yizhou Yu
ObjD
25
52
0
11 Jun 2019
ActivityNet-QA: A Dataset for Understanding Complex Web Videos via Question Answering
Zhou Yu
D. Xu
Jun-chen Yu
Ting Yu
Zhou Zhao
Yueting Zhuang
Dacheng Tao
8
434
0
06 Jun 2019
Frontal Low-rank Random Tensors for Fine-grained Action Segmentation
Yan Zhang
Krikamol Muandet
Qianli Ma
Heiko Neumann
Siyu Tang
26
3
0
03 Jun 2019
Multimodal Transformer with Multi-View Visual Representation for Image Captioning
Jun-chen Yu
Jing Li
Zhou Yu
Qingming Huang
ViT
11
374
0
20 May 2019
Quantifying and Alleviating the Language Prior Problem in Visual Question Answering
Yangyang Guo
Zhiyong Cheng
Liqiang Nie
Y. Liu
Yinglong Wang
Mohan S. Kankanhalli
14
36
0
13 May 2019
HAR-Net: Joint Learning of Hybrid Attention for Single-stage Object Detection
Yali Li
Shengjin Wang
22
32
0
25 Apr 2019
Progressive Attention Memory Network for Movie Story Question Answering
Junyeong Kim
Minuk Ma
Kyungsu Kim
Sungjin Kim
Chang-Dong Yoo
11
76
0
18 Apr 2019
Semantically Tied Paired Cycle Consistency for Zero-Shot Sketch-based Image Retrieval
Anjan Dutta
Zeynep Akata
33
143
0
08 Mar 2019
Image-Question-Answer Synergistic Network for Visual Dialog
Dalu Guo
Chang Xu
Dacheng Tao
6
74
0
26 Feb 2019
MUREL: Multimodal Relational Reasoning for Visual Question Answering
Rémi Cadène
H. Ben-younes
Matthieu Cord
Nicolas Thome
LRM
19
271
0
25 Feb 2019
Multi-step Reasoning via Recurrent Dual Attention for Visual Dialog
Zhe Gan
Yu Cheng
Ahmed El Kholy
Linjie Li
Jingjing Liu
Jianfeng Gao
11
104
0
01 Feb 2019
BLOCK: Bilinear Superdiagonal Fusion for Visual Question Answering and Visual Relationship Detection
H. Ben-younes
Rémi Cadène
Nicolas Thome
Matthieu Cord
14
218
0
31 Jan 2019
Deep Fusion: An Attention Guided Factorized Bilinear Pooling for Audio-video Emotion Recognition
Yuanyuan Zhang
Zirui Wang
Jun Du
11
31
0
15 Jan 2019
Local Temporal Bilinear Pooling for Fine-grained Action Parsing
Yan Zhang
Siyu Tang
Krikamol Muandet
Christian Jarvers
Heiko Neumann
13
21
0
05 Dec 2018
Generating Easy-to-Understand Referring Expressions for Target Identifications
Mikihiro Tanaka
Takayuki Itamochi
Kenichi Narioka
Ikuro Sato
Yoshitaka Ushiku
Tatsuya Harada
8
1
0
29 Nov 2018
Visual Question Answering as Reading Comprehension
Hui Li
Peng Wang
Chunhua Shen
A. Hengel
9
40
0
29 Nov 2018
VQA with no questions-answers training
B. Vatashsky
S. Ullman
33
12
0
20 Nov 2018
EA-LSTM: Evolutionary Attention-based LSTM for Time Series Prediction
Youru Li
Zhenfeng Zhu
Deqiang Kong
Jinhyuk Lee
Yao Zhao
AI4TS
23
354
0
09 Nov 2018
Textbook Question Answering with Multi-modal Context Graph Understanding and Self-supervised Open-set Comprehension
Daesik Kim
Seonhoon Kim
Nojun Kwak
9
2
0
01 Nov 2018
Understand, Compose and Respond - Answering Visual Questions by a Composition of Abstract Procedures
B. Vatashsky
S. Ullman
CoGe
18
1
0
25 Oct 2018
Towards Good Practices for Multi-modal Fusion in Large-scale Video Classification
Jinlai Liu
Zehuan Yuan
Changhu Wang
16
9
0
16 Sep 2018
Interpretable Visual Question Answering by Reasoning on Dependency Trees
Qingxing Cao
Bailin Li
Xiaodan Liang
Liang Lin
25
55
0
06 Sep 2018