Multimodal Machine Learning: A Survey and Taxonomy

26 May 2017
T. Baltrušaitis
Chaitanya Ahuja
Louis-Philippe Morency

Papers citing "Multimodal Machine Learning: A Survey and Taxonomy"

50 / 941 papers shown
Pan-Cancer Integrative Histology-Genomic Analysis via Interpretable Multimodal Deep Learning
Journal of Pathology Informatics (J Pathol Inform), 2021
Richard J. Chen
Ming Y. Lu
Drew F. K. Williamson
Tiffany Y. Chen
Jana Lipkova
...
Maha Shady
Mane Williams
Bumjin Joo
Zahra Noor
Faisal Mahmood
139
16
0
04 Aug 2021
Exploiting BERT For Multimodal Target Sentiment Classification Through Input Space Translation
Zaid Khan
Y. Fu
174
178
0
03 Aug 2021
Desk Organization: Effect of Multimodal Inputs on Spatial Relational Learning
Ryan Rowe
Shivam Singhal
Daqing Yi
Tapomayukh Bhattacharjee
S. Srinivasa
138
3
0
03 Aug 2021
Multimodal Feature Fusion for Video Advertisements Tagging Via Stacking Ensemble
Qingsong Zhou
Hai Liang
Zhimin Lin
Kele Xu
171
6
0
02 Aug 2021
Multimodal Co-learning: Challenges, Applications with Datasets, Recent Advances and Future Directions
Information Fusion (Inf. Fusion), 2021
Anil Rahate
Rahee Walambe
S. Ramanna
K. Kotecha
402
176
0
29 Jul 2021
Squeeze-Excitation Convolutional Recurrent Neural Networks for Audio-Visual Scene Classification
Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE), 2021
Javier Naranjo-Alcazar
Sergi Perez-Castanos
Aarón López-García
P. Zuccarello
M. Cobos
F. Ferri
84
4
0
28 Jul 2021
Adversarial Stacked Auto-Encoders for Fair Representation Learning
Patrik Kenfack
Adil Khan
Rasheed Hussain
S. M. Ahsan Kazmi
FaML
129
4
0
27 Jul 2021
Imbalanced Big Data Oversampling: Taxonomy, Algorithms, Software, Guidelines and Future Directions
ACM Computing Surveys (CSUR), 2021
W. Sleeman
R. Kapoor
AI4TS
222
97
0
24 Jul 2021
Multimodal Representations Learning and Adversarial Hypergraph Fusion for Early Alzheimer's Disease Prediction
Chinese Conference on Pattern Recognition and Computer Vision (CPRCV), 2021
Qiankun Zuo
Baiying Lei
Yanyan Shen
Yong Liu
Z. Feng
Shuqiang Wang
134
55
0
21 Jul 2021
M2Lens: Visualizing and Explaining Multimodal Models for Sentiment Analysis
IEEE Transactions on Visualization and Computer Graphics (TVCG), 2021
Xingbo Wang
Jianben He
Zhihua Jin
Muqiao Yang
Yong Wang
Huamin Qu
237
93
0
17 Jul 2021
MultiBench: Multiscale Benchmarks for Multimodal Representation Learning
Paul Pu Liang
Yiwei Lyu
Xiang Fan
Zetian Wu
Yun Cheng
...
Peter Wu
Michelle A. Lee
Yuke Zhu
Ruslan Salakhutdinov
Louis-Philippe Morency
VLM
294
224
0
15 Jul 2021
FairyTailor: A Multimodal Generative Framework for Storytelling
Eden Bensaid
Mauro Martino
Benjamin Hoover
Hendrik Strobelt
LRM
177
24
0
13 Jul 2021
Cognitive Visual Commonsense Reasoning Using Dynamic Working Memory
Xuejiao Tang
Xin Huang
Wenbin Zhang
T. Child
Qiong Hu
Zhen Liu
Ji Zhang
LRM
201
20
0
04 Jul 2021
Multimodal Representation for Neural Code Search
Jian Gu
Zimin Chen
Martin Monperrus
162
51
0
02 Jul 2021
Case Relation Transformer: A Crossmodal Language Generation Model for Fetching Instructions
Motonari Kambara
K. Sugiura
ViT
150
6
0
02 Jul 2021
Towards Model-informed Precision Dosing with Expert-in-the-loop Machine Learning
IEEE International Conference on Information Reuse and Integration (IRI), 2021
Yihuang Kang
Y. Chiu
Ming-Yen Lin
F. Su
Sheng-Tai Huang
133
2
0
28 Jun 2021
Deep Learning for Technical Document Classification
IEEE Transactions on Engineering Management (IEEE Trans. Eng. Manage.), 2021
Shuo Jiang
Jie Hu
C. Magee
Jianxi Luo
252
61
0
27 Jun 2021
Core Challenges in Embodied Vision-Language Planning
Journal of Artificial Intelligence Research (JAIR), 2021
Jonathan M Francis
Nariaki Kitamura
Felix Labelle
Xiaopeng Lu
Ingrid Navarro
Jean Oh
LM&Ro
544
58
0
26 Jun 2021
Learning Language and Multimodal Privacy-Preserving Markers of Mood from Mobile Data
Paul Pu Liang
Terrance Liu
Anna Cai
Michal Muszynski
Ryo Ishii
Nicholas B. Allen
Randy P. Auerbach
David Brent
Ruslan Salakhutdinov
Louis-Philippe Morency
212
19
0
24 Jun 2021
DravidianMultiModality: A Dataset for Multi-modal Sentiment Analysis in Tamil and Malayalam
Bharathi Raja Chakravarthi
K. Jishnu Parameswaran P.
B. Premjith
Kritik Soman
Rahul Ponnusamy
Prasanna Kumar Kumaresan
K. Thamburaj
John P. Mccrae
88
24
0
09 Jun 2021
What Makes Multi-modal Learning Better than Single (Provably)
Neural Information Processing Systems (NeurIPS), 2021
Yu Huang
Chenzhuang Du
Zihui Xue
Xuanyao Chen
Hang Zhao
Longbo Huang
290
339
0
08 Jun 2021
Exploring modality-agnostic representations for music classification
Ho-Hsiang Wu
Magdalena Fuentes
J. P. Bello
248
4
0
02 Jun 2021
Rethinking the constraints of multimodal fusion: case study in Weakly-Supervised Audio-Visual Video Parsing
Jianning Wu
Zhuqing Jiang
S. Wen
Aidong Men
Haiying Wang
228
1
0
30 May 2021
Self-Supervised Multimodal Opinion Summarization
Annual Meeting of the Association for Computational Linguistics (ACL), 2021
Jinbae Im
Moonki Kim
Hoyeop Lee
Hyunsouk Cho
Sehee Chung
124
38
0
27 May 2021
Recent Advances and Trends in Multimodal Deep Learning: A Review
Jabeen Summaira
Xi Li
Amin Muhammad Shoib
Songyuan Li
Abdul Jabbar
HAI
340
71
0
24 May 2021
A Review on Explainability in Multimodal Deep Neural Nets
IEEE Access, 2021
Gargi Joshi
Rahee Walambe
K. Kotecha
402
171
0
17 May 2021
VSR: A Unified Framework for Document Layout Analysis combining Vision, Semantics and Relations
IEEE International Conference on Document Analysis and Recognition (ICDAR), 2021
Peng Zhang
Can Li
Liang Qiao
Zhanzhan Cheng
Shiliang Pu
Yi Niu
Leilei Gan
166
68
0
13 May 2021
Relation-aware Hierarchical Attention Framework for Video Question Answering
International Conference on Multimedia Retrieval (ICMR), 2021
Fangtao Li
Ting Bai
Chenyu Cao
Zihe Liu
C. Yan
Bin Wu
223
14
0
13 May 2021
Cross-Modal and Multimodal Data Analysis Based on Functional Mapping of Spectral Descriptors and Manifold Regularization
M. Behmanesh
Peyman Adibi
Jocelyn Chanussot
Sayyed Mohammad Saeed Ehsani
143
3
0
12 May 2021
Including Signed Languages in Natural Language Processing
Annual Meeting of the Association for Computational Linguistics (ACL), 2021
Kayo Yin
Amit Moryossef
J. Hochgesang
Yoav Goldberg
Malihe Alikhani
207
131
0
11 May 2021
Cross-Modal Generative Augmentation for Visual Question Answering
British Machine Vision Conference (BMVC), 2021
Zixu Wang
Yishu Miao
Lucia Specia
220
11
0
11 May 2021
Graph Inference Representation: Learning Graph Positional Embeddings with Anchor Path Encoding
Yuheng Lu
Jinpeng Chen
Chuxiong Sun
Jie Hu
GNN
104
2
0
09 May 2021
Blockchain Systems, Technologies and Applications: A Methodology Perspective
IEEE Communications Surveys and Tutorials (COMST), 2021
Bin Cao
Zixin Wang
Long Zhang
Daquan Feng
M. Peng
Lei Zhang
184
84
0
08 May 2021
Generalized Multimodal ELBO
International Conference on Learning Representations (ICLR), 2021
Thomas M. Sutter
Imant Daunhawer
Julia E. Vogt
298
120
0
06 May 2021
Watershed of Artificial Intelligence: Human Intelligence, Machine Intelligence, and Biological Intelligence
Weigang Li
L. Enamoto
Denise Leyi Li
G. P. R. Filho
VLM
117
0
0
27 Apr 2021
Multi-view Deep One-class Classification: A Systematic Exploration
Siqi Wang
Jiyuan Liu
Guang Yu
Xinwang Liu
Sihang Zhou
En Zhu
Yuexiang Yang
Jianping Yin
96
1
0
27 Apr 2021
Weakly-supervised Multi-task Learning for Multimodal Affect Recognition
Wenliang Dai
Samuel Cahyawijaya
Yejin Bang
Pascale Fung
CVBM
170
12
0
23 Apr 2021
Literature review on vulnerability detection using NLP technology
Jiajie Wu
384
16
0
23 Apr 2021
Uncertainty-Aware Boosted Ensembling in Multi-Modal Settings
IEEE International Joint Conference on Neural Network (IJCNN), 2021
U. Sarawgi
Rishab Khincha
W. Zulfikar
Satrajit S. Ghosh
Pattie Maes
UQCV
197
7
0
21 Apr 2021
Continual learning in cross-modal retrieval
Kai Wang
Luis Herranz
Joost van de Weijer
CLL
154
18
0
14 Apr 2021
Adversarial Sticker: A Stealthy Attack Method in the Physical World
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Xingxing Wei
Yingjie Guo
Jie Yu
AAML
272
163
0
14 Apr 2021
Integrating Information Theory and Adversarial Learning for Cross-modal Retrieval
Pattern Recognition (Pattern Recogn.), 2021
Wei Chen
Yu Liu
E. Bakker
M. Lew
GAN
124
30
0
11 Apr 2021
Software/Hardware Co-design for Multi-modal Multi-task Learning in Autonomous Systems
International Conference on Artificial Intelligence Circuits and Systems (ICAICS), 2021
Cong Hao
Deming Chen
257
23
0
08 Apr 2021
Synthesis of Compositional Animations from Textual Descriptions
IEEE International Conference on Computer Vision (ICCV), 2021
Anindita Ghosh
N. Cheema
Cennet Oguz
Christian Theobalt
P. Slusallek
581
214
0
26 Mar 2021
Audio Description from Image by Modal Translation Network
Neurocomputing, 2021
Hailong Ning
Xiangtao Zheng
Yuan Yuan
Xiaoqiang Lu
DiffM
127
18
0
18 Mar 2021
Multimodal End-to-End Sparse Model for Emotion Recognition
North American Chapter of the Association for Computational Linguistics (NAACL), 2021
Wenliang Dai
Samuel Cahyawijaya
Zihan Liu
Pascale Fung
CVBM
243
101
0
17 Mar 2021
Leveraging Recent Advances in Deep Learning for Audio-Visual Emotion Recognition
Pattern Recognition Letters (PR), 2021
Liam Schoneveld
Alice Othmani
Hazem Abdelkawy
254
203
0
16 Mar 2021
Reconsidering Representation Alignment for Multi-view Clustering
Computer Vision and Pattern Recognition (CVPR), 2021
Daniel J. Trosten
Sigurd Løkse
Robert Jenssen
Michael C. Kampffmeyer
197
180
0
13 Mar 2021
Orthogonalized Kernel Debiased Machine Learning for Multimodal Data Analysis
Xiaowu Dai
Lexin Li
226
16
0
12 Mar 2021
What is Multimodality?
Letitia Parcalabescu
Nils Trost
Anette Frank
230
0
0
10 Mar 2021