MUTAN: Multimodal Tucker Fusion for Visual Question Answering


18 May 2017
H. Ben-younes
Rémi Cadène
Matthieu Cord
Nicolas Thome

Papers citing "MUTAN: Multimodal Tucker Fusion for Visual Question Answering"

50 / 272 papers shown
Semantic Equivalent Adversarial Data Augmentation for Visual Question Answering
Ruixue Tang
Chao Ma
W. Zhang
Qi Wu
Xiaokang Yang
OOD
21
48
0
19 Jul 2020
Reducing Language Biases in Visual Question Answering with Visually-Grounded Question Encoder
K. Gouthaman
Anurag Mittal
42
78
0
13 Jul 2020
Improving VQA and its Explanations by Comparing Competing Explanations
Jialin Wu
Liyan Chen
Raymond J. Mooney
FAtt
AAML
33
17
0
28 Jun 2020
Overcoming Statistical Shortcuts for Open-ended Visual Counting
Corentin Dancette
Rémi Cadène
Xinlei Chen
Matthieu Cord
8
3
0
17 Jun 2020
Generalising Recursive Neural Models by Tensor Decomposition
Daniele Castellana
D. Bacciu
12
4
0
17 Jun 2020
Mucko: Multi-Layer Cross-Modal Knowledge Reasoning for Fact-based Visual Question Answering
Zihao Zhu
J. Yu
Yujing Wang
Yajing Sun
Yue Hu
Qi Wu
17
125
0
16 Jun 2020
Counterfactual VQA: A Cause-Effect Look at Language Bias
Yulei Niu
Kaihua Tang
Hanwang Zhang
Zhiwu Lu
Xiansheng Hua
Ji-Rong Wen
CML
36
394
0
08 Jun 2020
M2P2: Multimodal Persuasion Prediction using Adaptive Fusion
Chongyang Bai
Haipeng Chen
Srijan Kumar
J. Leskovec
V. S. Subrahmanian
12
10
0
03 Jun 2020
A Better Use of Audio-Visual Cues: Dense Video Captioning with Bi-modal Transformer
Vladimir E. Iashin
Esa Rahtu
14
126
0
17 May 2020
A novel multimodal approach for hybrid brain-computer interface
Zhe Sun
Zihao Huang
F. Duan
Yu Liu
13
40
0
25 Apr 2020
Deep Multimodal Neural Architecture Search
Zhou Yu
Yuhao Cui
Jun-chen Yu
Meng Wang
Dacheng Tao
Qi Tian
11
98
0
25 Apr 2020
MoVie: Revisiting Modulated Convolutions for Visual Counting and Beyond
Duy-Kien Nguyen
Vedanuj Goswami
Xinlei Chen
25
23
0
24 Apr 2020
Rephrasing visual questions by specifying the entropy of the answer distribution
K. Terao
Toru Tamaki
B. Raytchev
K. Kaneda
S. Satoh
OOD
19
2
0
10 Apr 2020
Query-controllable Video Summarization
Jia-Hong Huang
M. Worring
6
46
0
07 Apr 2020
Pixel-BERT: Aligning Image Pixels with Text by Deep Multi-Modal Transformers
Zhicheng Huang
Zhaoyang Zeng
Bei Liu
Dongmei Fu
Jianlong Fu
ViT
30
434
0
02 Apr 2020
Multi-Modal Graph Neural Network for Joint Reasoning on Vision and Scene Text
Difei Gao
Ke Li
Ruiping Wang
Shiguang Shan
Xilin Chen
12
111
0
31 Mar 2020
GPS-Net: Graph Property Sensing Network for Scene Graph Generation
Xin Lin
Changxing Ding
Jinquan Zeng
Dacheng Tao
31
277
0
29 Mar 2020
CurlingNet: Compositional Learning between Images and Text for Fashion IQ Data
Youngjae Yu
Seunghwan Lee
Yuncheol Choi
Gunhee Kim
CoGe
11
37
0
27 Mar 2020
Linguistically Driven Graph Capsule Network for Visual Question Reasoning
Qingxing Cao
Xiaodan Liang
Keze Wang
Liang Lin
GNN
13
3
0
23 Mar 2020
RSVQA: Visual Question Answering for Remote Sensing Data
Sylvain Lobry
Diego Marcos
J. Murray
D. Tuia
64
205
0
16 Mar 2020
A Question-Centric Model for Visual Question Answering in Medical Imaging
Minh H. Vu
Tommy Löfstedt
T. Nyholm
Raphael Sznitman
MedIm
6
59
0
02 Mar 2020
Tensor Decompositions in Deep Learning
D. Bacciu
Danilo P. Mandic
25
14
0
26 Feb 2020
CQ-VQA: Visual Question Answering on Categorized Questions
Aakansha Mishra
A. Anand
Prithwijit Guha
25
6
0
17 Feb 2020
Sparse and Structured Visual Attention
Pedro Henrique Martins
S. Becker
Zita Marinho
Michael Arens
27
8
0
13 Feb 2020
Component Analysis for Visual Question Answering Architectures
Camila Kolling
Jonatas Wehrmann
Rodrigo C. Barros
CoGe
13
2
0
12 Feb 2020
H-OWAN: Multi-distorted Image Restoration with Tensor 1x1 Convolution
Zihao Huang
Chao Li
Feng Duan
Qibin Zhao
14
5
0
29 Jan 2020
Accuracy vs. Complexity: A Trade-off in Visual Question Answering Models
M. Farazi
Salman H. Khan
Nick Barnes
21
17
0
20 Jan 2020
Modality-Balanced Models for Visual Dialogue
Hyounghun Kim
Hao Tan
Mohit Bansal
20
27
0
17 Jan 2020
Fine-grained Image Classification and Retrieval by Combining Visual and Locally Pooled Textual Features
Andrés Mafla
S. Dey
Ali Furkan Biten
Lluís Gómez
Dimosthenis Karatzas
6
26
0
14 Jan 2020
Visual Question Answering on 360° Images
Shih-Han Chou
Wei-Lun Chao
Wei-Sheng Lai
Min Sun
Ming-Hsuan Yang
14
20
0
10 Jan 2020
Weak Supervision helps Emergence of Word-Object Alignment and improves Vision-Language Tasks
Corentin Kervadec
G. Antipov
M. Baccouche
Christian Wolf
19
14
0
06 Dec 2019
Deep Bayesian Active Learning for Multiple Correct Outputs
Khaled Jedoui
Ranjay Krishna
Michael S. Bernstein
Li Fei-Fei
BDL
OOD
UQCV
14
14
0
02 Dec 2019
Assessing the Robustness of Visual Question Answering Models
Jia-Hong Huang
Modar Alfadly
Bernard Ghanem
M. Worring
AAML
OOD
15
23
0
30 Nov 2019
MMTM: Multimodal Transfer Module for CNN Fusion
Hamid Reza Vaezi Joze
Amirreza Shaban
Michael L. Iuzzolino
K. Koishida
18
276
0
20 Nov 2019
Explanation vs Attention: A Two-Player Game to Obtain Attention for VQA
Badri N. Patro
Anupriy
Vinay P. Namboodiri
AAML
FAtt
34
26
0
19 Nov 2019
Iterative Answer Prediction with Pointer-Augmented Multimodal Transformers for TextVQA
Ronghang Hu
Amanpreet Singh
Trevor Darrell
Marcus Rohrbach
18
195
0
14 Nov 2019
Open-Ended Visual Question Answering by Multi-Modal Domain Adaptation
Yiming Xu
Lin Chen
Zhongwei Cheng
Lixin Duan
Jiebo Luo
OOD
24
24
0
11 Nov 2019
Multimodal Intelligence: Representation Learning, Information Fusion, and Applications
Chao Zhang
Zichao Yang
Xiaodong He
Li Deng
HAI
AI4TS
27
320
0
10 Nov 2019
TAB-VCR: Tags and Attributes based Visual Commonsense Reasoning Baselines
Jingxiang Lin
Unnat Jain
A. Schwing
LRM
ReLM
26
9
0
31 Oct 2019
Heterogeneous Graph Learning for Visual Commonsense Reasoning
Weijiang Yu
Jingwen Zhou
Weihao Yu
Xiaodan Liang
Nong Xiao
LRM
17
46
0
25 Oct 2019
KnowIT VQA: Answering Knowledge-Based Questions about Videos
Noa Garcia
Mayu Otani
Chenhui Chu
Yuta Nakashima
9
76
0
23 Oct 2019
Enforcing Reasoning in Visual Commonsense Reasoning
Hammad A. Ayyubi
Md. Mehrab Tanjim
D. Kriegman
ReLM
OOD
14
2
0
21 Oct 2019
Multi-modal Deep Analysis for Multimedia
Wenwu Zhu
Xin Eric Wang
Hongzhi Li
19
38
0
11 Oct 2019
REMIND Your Neural Network to Prevent Catastrophic Forgetting
Tyler L. Hayes
Kushal Kafle
Robik Shrestha
Manoj Acharya
Christopher Kanan
CLL
29
294
0
06 Oct 2019
LoGAN: Latent Graph Co-Attention Network for Weakly-Supervised Video Moment Retrieval
Reuben Tan
Huijuan Xu
Kate Saenko
Bryan A. Plummer
15
67
0
27 Sep 2019
Compact Trilinear Interaction for Visual Question Answering
Tuong Khanh Long Do
Thanh-Toan Do
Huy Tran
Erman Tjiputra
Quang-Dieu Tran
28
59
0
26 Sep 2019
Explainable High-order Visual Question Reasoning: A New Benchmark and Knowledge-routed Network
Qingxing Cao
Bailin Li
Xiaodan Liang
Liang Lin
25
13
0
23 Sep 2019
Watch, Listen and Tell: Multi-modal Weakly Supervised Dense Event Captioning
Tanzila Rahman
Bicheng Xu
Leonid Sigal
19
77
0
22 Sep 2019
Inverse Visual Question Answering with Multi-Level Attentions
Yaser Alwatter
Yuhong Guo
BDL
14
1
0
17 Sep 2019
Probabilistic framework for solving Visual Dialog
Badri N. Patro
Anupriy
Vinay P. Namboodiri
BDL
22
13
0
11 Sep 2019