Multimodal Machine Learning: A Survey and Taxonomy

26 May 2017
T. Baltrušaitis
Chaitanya Ahuja
Louis-Philippe Morency
arXiv:1705.09406

Papers citing "Multimodal Machine Learning: A Survey and Taxonomy"

50 / 941 papers shown
General Greedy De-bias Learning
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Xinzhe Han
Shuhui Wang
Chi Su
Qingming Huang
Qi Tian
480
18
0
20 Dec 2021
Dual-Key Multimodal Backdoors for Visual Question Answering
Matthew Walmer
Karan Sikka
Indranil Sur
Abhinav Shrivastava
Susmit Jha
AAML
161
46
0
14 Dec 2021
Multi-Modal Perception Attention Network with Self-Supervised Learning for Audio-Visual Speaker Tracking
Yidi Li
Hong Liu
Hao Tang
241
23
0
14 Dec 2021
Data Collection and Quality Challenges in Deep Learning: A Data-Centric AI Perspective
Steven Euijong Whang
Yuji Roh
Hwanjun Song
Jae-Gil Lee
452
463
0
13 Dec 2021
A graph representation based on fluid diffusion model for data analysis: theoretical aspects and enhanced community detection
Andrea Marinoni
Christian Jutten
Mark Girolami
361
2
0
07 Dec 2021
Contrastive Cycle Adversarial Autoencoders for Single-cell Multi-omics Alignment and Integration
bioRxiv (bioRxiv), 2021
Xuesong Wang
Zhihang Hu
Tingyang Yu
Ruijie Wang
Yumeng Wei
Juan Shu
Jianzhu Ma
Yu Li
226
4
0
05 Dec 2021
Active Sensing for Search and Tracking: A Review
Luca Varotto
Angelo Cenedese
Andrea Cavallaro
162
14
0
04 Dec 2021
Channel Exchanging Networks for Multimodal and Multitask Dense Image Prediction
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Yikai Wang
Gang Hua
Wenbing Huang
Fengxiang He
Dacheng Tao
287
46
0
04 Dec 2021
Shapes of Emotions: Multimodal Emotion Recognition in Conversations via Emotion Shifts
Harsh Agarwal
Keshav Bansal
Abhinav Joshi
Ashutosh Modi
185
24
0
03 Dec 2021
Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval
Fan Hu
Aozhu Chen
Ziyu Wang
Fangming Zhou
Jianfeng Dong
Xirong Li
218
45
0
03 Dec 2021
ContIG: Self-supervised Multimodal Contrastive Learning for Medical Imaging with Genetics
Computer Vision and Pattern Recognition (CVPR), 2021
Aiham Taleb
Matthias Kirchler
Remo Monti
Christoph Lippert
SSL, MedIm
615
69
0
26 Nov 2021
Geometric Multimodal Deep Learning with Multi-Scaled Graph Wavelet Convolutional Network
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2021
M. Behmanesh
Peyman Adibi
M. Ehsani
Jocelyn Chanussot
213
24
0
26 Nov 2021
Sparse Fusion for Multimodal Transformers
Yi Ding
Alex Rich
Mason Wang
Noah Stier
M. Turk
P. Sen
Tobias Höllerer
ViT
169
9
0
23 Nov 2021
GN-Transformer: Fusing Sequence and Graph Representation for Improved Code Summarization
Junyan Cheng
Iordanis Fostiropoulos
Barry W. Boehm
123
10
0
17 Nov 2021
TorchGeo: Deep Learning With Geospatial Data
Adam J. Stewart
Caleb Robinson
Isaac Corley
Anthony Ortiz
J. L. Ferres
Arindam Banerjee
3DPC
375
105
0
17 Nov 2021
Trustworthy Multimodal Regression with Mixture of Normal-inverse Gamma Distributions
Neural Information Processing Systems (NeurIPS), 2021
Huan Ma
Zongbo Han
Changqing Zhang
Huazhu Fu
Qiufeng Wang
Q. Hu
EDL, UQCV
258
57
0
11 Nov 2021
A framework for comprehensible multi-modal detection of cyber threats
J. Kohout
Cenek Skarda
Kyrylo Shcherbin
Martin Kopp
J. Brabec
107
1
0
10 Nov 2021
Social Fraud Detection Review: Methods, Challenges and Analysis
Saeedreza Shehnepoor
R. Togneri
Wei Liu
Bennamoun
AAML
220
4
0
10 Nov 2021
Cross Attentional Audio-Visual Fusion for Dimensional Emotion Recognition
IEEE International Conference on Automatic Face & Gesture Recognition (FG), 2021
R Gnana Praveen
Mohammadhadi Shateri
P. Cardinal
CVBM
208
52
0
09 Nov 2021
How does a Pre-Trained Transformer Integrate Contextual Keywords? Application to Humanitarian Computing
Valentin Barrière
Guillaume Jacquet
87
1
0
07 Nov 2021
ML-PersRef: A Machine Learning-based Personalized Multimodal Fusion Approach for Referencing Outside Objects From a Moving Vehicle
International Conference on Multimodal Interaction (ICMI), 2021
Amr Gomaa
Guillermo Reyes
Michael Feld
92
11
0
03 Nov 2021
A Comparative Study of Speaker Role Identification in Air Traffic Communication Using Deep Learning Approaches
Dongyue Guo
Jianwei Zhang
Bo Yang
Yi Lin
284
14
0
03 Nov 2021
A Survey on Epistemic (Model) Uncertainty in Supervised Learning: Recent Advances and Applications
Xinlei Zhou
Han Liu
Farhad Pourpanah
T. Zeng
Xizhao Wang
UQCV, UD
321
73
0
03 Nov 2021
Latent Structure Mining with Contrastive Modality Fusion for Multimedia Recommendation
IEEE Transactions on Knowledge and Data Engineering (TKDE), 2021
Jinghao Zhang
Yanqiao Zhu
Qiang Liu
Mengqi Zhang
Shu Wu
Liang Wang
278
79
0
01 Nov 2021
Revisit Multimodal Meta-Learning through the Lens of Multi-Task Learning
Milad Abdollahzadeh
Touba Malekzadeh
Ngai-Man Cheung
153
36
0
27 Oct 2021
Exploiting Cross-Modal Prediction and Relation Consistency for Semi-Supervised Image Captioning
IEEE Transactions on Cybernetics (IEEE Trans. Cybern.), 2021
Yang Yang
Haoran Wei
Hengshu Zhu
Dianhai Yu
Hui Xiong
Jian Yang
SSL
107
43
0
22 Oct 2021
Deep multi-modal aggregation network for MR image reconstruction with auxiliary modality
Chun-Mei Feng
Huazhu Fu
Tianfei Zhou
Yong Xu
Ling Shao
David Zhang
225
10
0
15 Oct 2021
From Multimodal to Unimodal Attention in Transformers using Knowledge Distillation
Dhruv Agarwal
Tanay Agrawal
Laura M. Ferrari
François Bremond
141
5
0
15 Oct 2021
StreaMulT: Streaming Multimodal Transformer for Heterogeneous and Arbitrary Long Sequential Data
Victor Pellegrain
Myriam Tami
M. Batteux
Céline Hudelot
AI4TS
197
3
0
15 Oct 2021
DeepVecFont: Synthesizing High-quality Vector Fonts via Dual-modality Learning
Yizhi Wang
Zheng Lian
3DV
290
25
0
13 Oct 2021
Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm
International Conference on Learning Representations (ICLR), 2021
Yangguang Li
Feng Liang
Lichen Zhao
Yufeng Cui
Wanli Ouyang
Jing Shao
F. Yu
Junjie Yan
VLM, CLIP
424
540
0
11 Oct 2021
Embed Everything: A Method for Efficiently Co-Embedding Multi-Modal Spaces
Sarah Di
Robin Yu
Amol Kapoor
65
0
0
09 Oct 2021
On the Limitations of Multimodal VAEs
International Conference on Learning Representations (ICLR), 2021
Imant Daunhawer
Thomas M. Sutter
Kieran Chin-Cheong
Emanuele Palumbo
Julia E. Vogt
295
45
0
08 Oct 2021
3D-MOV: Audio-Visual LSTM Autoencoder for 3D Reconstruction of Multiple Objects from Video
Justin Wilson
Ming-Chia Lin
119
1
0
05 Oct 2021
Deep Neural Networks and Tabular Data: A Survey
V. Borisov
Tobias Leemann
Kathrin Seßler
Johannes Haug
Martin Pawelczyk
Gjergji Kasneci
LMTD
542
977
0
05 Oct 2021
Neural Dependency Coding inspired Multimodal Fusion
Shiv Shankar
256
3
0
28 Sep 2021
Multimodality in Meta-Learning: A Comprehensive Survey
Yao Ma
Shilin Zhao
Weixiao Wang
Yaoman Li
Irwin King
262
72
0
28 Sep 2021
UniMS: A Unified Framework for Multimodal Summarization with Knowledge Distillation
Zhengkun Zhang
Xiaojun Meng
Yasheng Wang
Xin Jiang
Qun Liu
Zhenglu Yang
174
57
0
13 Sep 2021
TEASEL: A Transformer-Based Speech-Prefixed Language Model
Mehdi Arjmand
M. Dousti
H. Moradi
144
23
0
12 Sep 2021
A Survey on Multi-modal Summarization
Anubhav Jangra
Sourajit Mukherjee
Adam Jatowt
S. Saha
M. Hasanuzzaman
206
79
0
11 Sep 2021
Multimodal Federated Learning on IoT Data
International Conference on Internet-of-Things Design and Implementation (IoTDI), 2021
Yuchen Zhao
Payam Barnaghi
Hamed Haddadi
146
100
0
10 Sep 2021
Retrieve, Caption, Generate: Visual Grounding for Enhancing Commonsense in Text Generation Models
AAAI Conference on Artificial Intelligence (AAAI), 2021
Steven Y. Feng
Kevin Lu
Zhuofu Tao
Malihe Alikhani
Teruko Mitamura
Eduard H. Hovy
Varun Gangal
LRM
226
14
0
08 Sep 2021
Hybrid Contrastive Learning of Tri-Modal Representation for Multimodal Sentiment Analysis
Sijie Mai
Ying Zeng
Shuangjia Zheng
Haifeng Hu
163
187
0
04 Sep 2021
Improving Multimodal fusion via Mutual Dependency Maximisation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Pierre Colombo
E. Chapuis
Matthieu Labeau
Chloé Clavel
296
33
0
31 Aug 2021
Vision-Language Navigation: A Survey and Taxonomy
Wansen Wu
Tao Chang
Xinmeng Li
LM&Ro
333
47
0
26 Aug 2021
Maximum Likelihood Estimation for Multimodal Learning with Missing Modality
Fei Ma
Xiangxiang Xu
Shao-Lun Huang
Lin Zhang
215
17
0
24 Aug 2021
Detection of Illicit Drug Trafficking Events on Instagram: A Deep Multimodal Multilabel Learning Approach
Chuanbo Hu
Minglei Yin
Bin Liu
Xin Li
Yanfang Ye
107
16
0
19 Aug 2021
Emotion Recognition from Multiple Modalities: Fundamentals and Methodologies
Sicheng Zhao
Guoli Jia
Jufeng Yang
Guiguang Ding
Kurt Keutzer
221
135
0
18 Aug 2021
Affect-Aware Deep Belief Network Representations for Multimodal Unsupervised Deception Detection
Leena Mathur
Maja J. Matarić
CVBM
137
8
0
17 Aug 2021
Interpretable Visual Understanding with Cognitive Attention Network
International Conference on Artificial Neural Networks (ICANN), 2021
Xuejiao Tang
Wenbin Zhang
Yi Yu
Kea Turner
Hanyu Wang
Mengyu Wang
Eirini Ntoutsi
281
19
0
06 Aug 2021