ResearchTrend.AI
Beyond Bilinear: Generalized Multimodal Factorized High-order Pooling for Visual Question Answering

10 August 2017
Zhou Yu
Jun Yu
Chenchao Xiang
Jianping Fan
Dacheng Tao

Papers citing "Beyond Bilinear: Generalized Multimodal Factorized High-order Pooling for Visual Question Answering"

25 / 25 papers shown
Hadamard product in deep learning: Introduction, Advances and Challenges
Grigorios G. Chrysos
Yongtao Wu
Razvan Pascanu
Philip Torr
V. Cevher
AAML
96
0
0
17 Apr 2025
Generalizable Prompt Learning of CLIP: A Brief Overview
Fangming Cui
Yonggang Zhang
Xuan Wang
Xule Wang
Liang Xiao
VPVLM
VLM
117
0
0
03 Mar 2025
EAML: Ensemble Self-Attention-based Mutual Learning Network for Document Image Classification
Souhail Bakkali
Zuheng Ming
Mickael Coustaty
Marçal Rusiñol
8
6
0
11 May 2023
AutoFraudNet: A Multimodal Network to Detect Fraud in the Auto Insurance Industry
Azin Asgarian
Rohit Saha
Daniel Jakubovitz
Julia Peyre
21
2
0
15 Jan 2023
Fine-grained Visual-Text Prompt-Driven Self-Training for Open-Vocabulary Object Detection
Yanxin Long
Jianhua Han
Runhu Huang
Xu Hang
Yi Zhu
Chunjing Xu
Xiaodan Liang
VLM
ObjD
22
18
0
02 Nov 2022
Locate before Answering: Answer Guided Question Localization for Video Question Answering
Tianwen Qian
Ran Cui
Jingjing Chen
Pai Peng
Xiao-Wei Guo
Yu-Gang Jiang
12
17
0
05 Oct 2022
MuKEA: Multimodal Knowledge Extraction and Accumulation for Knowledge-based Visual Question Answering
Yang Ding
Jing Yu
Bangchang Liu
Yue Hu
Mingxin Cui
Qi Wu
11
62
0
17 Mar 2022
Recent, rapid advancement in visual question answering architecture: a review
V. Kodali
Daniel Berleant
27
9
0
02 Mar 2022
Bilateral Cross-Modality Graph Matching Attention for Feature Fusion in Visual Question Answering
Jianjian Cao
Xiameng Qin
Sanyuan Zhao
Jianbing Shen
23
20
0
14 Dec 2021
How to find a good image-text embedding for remote sensing visual question answering?
Christel Chappuis
Sylvain Lobry
B. Kellenberger
Bertrand Le Saux
D. Tuia
30
20
0
24 Sep 2021
Discovering the Unknown Knowns: Turning Implicit Knowledge in the Dataset into Explicit Training Examples for Visual Question Answering
Jihyung Kil
Cheng Zhang
D. Xuan
Wei-Lun Chao
56
20
0
13 Sep 2021
ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration
Yuhao Cui
Zhou Yu
Chunqi Wang
Zhongzhou Zhao
Ji Zhang
Meng Wang
Jun Yu
VLM
19
52
0
16 Aug 2021
Biomedical Question Answering: A Survey of Approaches and Challenges
Qiao Jin
Zheng Yuan
Guangzhi Xiong
Qian Yu
Huaiyuan Ying
Chuanqi Tan
Mosha Chen
Songfang Huang
Xiaozhong Liu
Sheng Yu
21
95
0
10 Feb 2021
An Improved Attention for Visual Question Answering
Tanzila Rahman
Shih-Han Chou
Leonid Sigal
Giuseppe Carenini
13
42
0
04 Nov 2020
Accuracy vs. Complexity: A Trade-off in Visual Question Answering Models
M. Farazi
Salman H. Khan
Nick Barnes
21
17
0
20 Jan 2020
Modulated Self-attention Convolutional Network for VQA
Jean-Benoit Delbrouck
Antoine Maiorca
Nathan Hubens
Stéphane Dupont
13
1
0
08 Oct 2019
DNN-based cross-lingual voice conversion using Bottleneck Features
M. K. Reddy
K. S. Rao
18
4
0
09 Sep 2019
LXMERT: Learning Cross-Modality Encoder Representations from Transformers
Hao Tan
Mohit Bansal
VLM
MLLM
52
2,444
0
20 Aug 2019
Zero-Shot Grounding of Objects from Natural Language Queries
Arka Sadhu
Kan Chen
Ram Nevatia
ObjD
20
156
0
20 Aug 2019
Attentional Feature-Pair Relation Networks for Accurate Face Recognition
Bong-Nam Kang
Yonghyun Kim
Bongjin Jun
Daijin Kim
CVBM
9
37
0
17 Aug 2019
LoRMIkA: Local rule-based model interpretability with k-optimal associations
Dilini Sewwandi Rajapaksha
Christoph Bergmeir
Wray L. Buntine
19
30
0
11 Aug 2019
An Empirical Study on Leveraging Scene Graphs for Visual Question Answering
Cheng Zhang
Wei-Lun Chao
D. Xuan
21
50
0
28 Jul 2019
Frontal Low-rank Random Tensors for Fine-grained Action Segmentation
Yan Zhang
Krikamol Muandet
Qianli Ma
Heiko Neumann
Siyu Tang
18
3
0
03 Jun 2019
Rethinking Diversified and Discriminative Proposal Generation for Visual Grounding
Zhou Yu
Jun Yu
Chenchao Xiang
Zhou Zhao
Q. Tian
Dacheng Tao
ObjD
13
138
0
09 May 2018
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
Akira Fukui
Dong Huk Park
Daylen Yang
Anna Rohrbach
Trevor Darrell
Marcus Rohrbach
144
1,464
0
06 Jun 2016