1705.09406
Multimodal Machine Learning: A Survey and Taxonomy
26 May 2017
T. Baltrušaitis
Chaitanya Ahuja
Louis-Philippe Morency
Papers citing
"Multimodal Machine Learning: A Survey and Taxonomy"
50 / 941 papers shown
What can knowledge graph alignment gain with Neuro-Symbolic learning approaches?
P. Cotovio
Ernesto Jiménez-Ruiz
Catia Pesquita
185
1
0
11 Oct 2023
IMITATE: Clinical Prior Guided Hierarchical Vision-Language Pre-training
IEEE Transactions on Medical Imaging (TMI), 2023
Che Liu
Sibo Cheng
Miaojing Shi
Anand Shah
Wenjia Bai
Rossella Arcucci
325
36
0
11 Oct 2023
What Makes for Robust Multi-Modal Models in the Face of Missing Modalities?
Siting Li
Chenzhuang Du
Yue Zhao
Yu Huang
Hang Zhao
202
6
0
10 Oct 2023
Robust Multimodal Learning with Missing Modalities via Parameter-Efficient Adaptation
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Md Kaykobad Reza
Ashley Prater-Bennette
M. Salman Asif
308
24
0
06 Oct 2023
Stand for Something or Fall for Everything: Predict Misinformation Spread with Stance-Aware Graph Neural Networks
International Conference on Interaction Sciences (ICIS), 2023
Zihan Chen
Jingyi Sun
Rong Liu
Feng Mai
166
2
0
04 Oct 2023
Modularity in Deep Learning: A Survey
Haozhe Sun
Isabelle Guyon
MoMe
313
5
0
02 Oct 2023
GRID: A Platform for General Robot Intelligence Development
Sai H. Vemprala
Shuhang Chen
Abhinav Shukla
Dinesh Narayanan
Ashish Kapoor
271
11
0
02 Oct 2023
GeRA: Label-Efficient Geometrically Regularized Alignment
Dustin Klebe
Tal Shnitzer
Mikhail Yurochkin
Leonid Karlinsky
Justin Solomon
315
2
0
01 Oct 2023
MuSe-GNN: Learning Unified Gene Representation From Multimodal Biological Graph Data
Neural Information Processing Systems (NeurIPS), 2023
Tianyu Liu
Yuge Wang
Rex Ying
Hongyu Zhao
285
31
0
29 Sep 2023
XVO: Generalized Visual Odometry via Cross-Modal Self-Training
IEEE International Conference on Computer Vision (ICCV), 2023
Tohida Rehman
Ronit Mandal
Jimuyang Zhang
Debarshi Kumar Sanyal
SSL
363
25
0
28 Sep 2023
Harnessing Diverse Data for Global Disaster Prediction: A Multimodal Framework
Gengyin Liu
Huaiyang Zhong
AI4CE
87
1
0
28 Sep 2023
SeMAnD: Self-Supervised Anomaly Detection in Multimodal Geospatial Datasets
Daria Reshetova
Swetava Ganguli
C. V. K. Iyer
Vipul Pandey
210
4
0
26 Sep 2023
Divide and Conquer in Video Anomaly Detection: A Comprehensive Review and New Approach
ACM Cloud and Autonomic Computing Conference (CAC), 2023
Jian Xiao
Tianyuan Liu
G. Ji
290
7
0
26 Sep 2023
MultiModN- Multimodal, Multi-Task, Interpretable Modular Networks
Neural Information Processing Systems (NeurIPS), 2023
Vinitra Swamy
Malika Satayeva
Jibril Frej
Thierry Bossy
Thijs Vogels
Martin Jaggi
Tanja Käser
Mary-Anne Hartley
238
20
0
25 Sep 2023
A Survey on Image-text Multimodal Models
Ruifeng Guo
Jingxuan Wei
Linzhuang Sun
Khai-Nguyen Nguyen
Guiyong Chang
Dawei Liu
Sibo Zhang
Zhengbing Yao
Mingjun Xu
Liping Bu
VLM
320
22
0
23 Sep 2023
Impact of architecture on robustness and interpretability of multispectral deep neural networks
Charles Godfrey
Elise Bishoff
Myles Mckay
E. Byler
218
1
0
21 Sep 2023
A Theory of Multimodal Learning
Neural Information Processing Systems (NeurIPS), 2023
Zhou Lu
221
30
0
21 Sep 2023
Synth-AC: Enhancing Audio Captioning with Synthetic Supervision
Feiyang Xiao
Qiaoxi Zhu
Jian Guan
Xubo Liu
Haohe Liu
Kejia Zhang
Wenwu Wang
168
2
0
18 Sep 2023
Bias and Fairness in Chatbots: An Overview
APSIPA Transactions on Signal and Information Processing (TASIP), 2023
Jintang Xue
Yun Cheng Wang
Chengwei Wei
Xiaofeng Liu
Jonghye Woo
C.-C. Jay Kuo
314
58
0
16 Sep 2023
VulnSense: Efficient Vulnerability Detection in Ethereum Smart Contracts by Multimodal Learning with Graph Neural Network and Language Model
Phan The Duy
Nghi Hoang Khoa
N. H. Quyen
Le Cong Trinh
V. Kiên
Trinh Minh Hoang
V. Pham
144
25
0
15 Sep 2023
One-stage Modality Distillation for Incomplete Multimodal Learning
Shicai Wei
Yang Luo
Chunbo Luo
211
1
0
15 Sep 2023
Prompting Segmentation with Sound Is Generalizable Audio-Visual Source Localizer
AAAI Conference on Artificial Intelligence (AAAI), 2023
Yaoting Wang
Weisong Liu
Guangyao Li
Jian Ding
Di Hu
Xi Li
VLM
304
38
0
13 Sep 2023
M(otion)-mode Based Prediction of Ejection Fraction using Echocardiograms
Ece Ozkan
Thomas M. Sutter
Yurong Hu
S. Balzer
Julia E. Vogt
166
0
0
07 Sep 2023
Enhancing Deep Learning Models through Tensorization: A Comprehensive Survey and Framework
Manal Helal
244
0
0
05 Sep 2023
Exchanging-based Multimodal Fusion with Transformer
Renyu Zhu
Chengcheng Han
Yong Qian
Qiushi Sun
Xiang Li
Ming Gao
Xuezhi Cao
Yunsen Xian
188
5
0
05 Sep 2023
A Survey on Interpretable Cross-modal Reasoning
Dizhan Xue
Shengsheng Qian
Zuyi Zhou
Changsheng Xu
LRM
400
5
0
05 Sep 2023
LoRA-like Calibration for Multimodal Deception Detection using ATSFace Data
BigData Congress [Services Society] (BSS), 2023
Shun-Wen Hsiao
Chengbin Sun
CVBM
96
1
0
04 Sep 2023
End-to-End Learning on Multimodal Knowledge Graphs
Xander Wilcke
Peter Bloem
Victor de Boer
R. V. Veer
168
9
0
03 Sep 2023
Towards Contrastive Learning in Music Video Domain
Karel Veldkamp
Mariya Hendriksen
Zoltán Szlávik
Alexander Keijser
SSL
210
3
0
01 Sep 2023
Spoken Language Intelligence of Large Language Models for Language Learning
Linkai Peng
Baorian Nuchged
Yingming Gao
ELM
285
5
0
28 Aug 2023
TriGait: Aligning and Fusing Skeleton and Silhouette Gait Data via a Tri-Branch Network
Yangwei Sun
Xu Feng
Liyan Ma
Long Hu
Mark Nixon
CVBM
274
9
0
25 Aug 2023
SkipcrossNets: Adaptive Skip-cross Fusion for Road Detection
Automotive Innovation (AUIN), 2023
Yan Gong
Xinyu Zhang
Hao Liu
Xinmin Jiang
Zhiwei Li
Xinchen Gao
Lei Lin
Dafeng Jin
Jun Li
Huaping Liu
3DPC
172
12
0
24 Aug 2023
ALIP: Adaptive Language-Image Pre-training with Synthetic Caption
IEEE International Conference on Computer Vision (ICCV), 2023
Kaicheng Yang
Jiankang Deng
Xiang An
Jiawei Li
Ziyong Feng
Jia Guo
Jing Yang
Tongliang Liu
VLM
CLIP
223
81
0
16 Aug 2023
Boosting Multi-modal Model Performance with Adaptive Gradient Modulation
IEEE International Conference on Computer Vision (ICCV), 2023
Hong Li
Xingyu Li
Pengbo Hu
Yinuo Lei
Chunxiao Li
Yi Zhou
244
65
0
15 Aug 2023
AIGC In China: Current Developments And Future Outlook
Xiangyu Li
Yuqing Fan
S. Cheng
184
13
0
14 Aug 2023
Deep convolutional neural networks for cyclic sensor data
P. Goodarzi
Y. Robin
A. Schütze
T. Schneider
118
0
0
14 Aug 2023
Multimodality and Attention Increase Alignment in Natural Language Prediction Between Humans and Computational Models
V. Kewenig
Andrew Lampinen
Samuel A. Nastase
Christopher Edwards
Quitterie Lacome D'Estalenx
Akilles Rechardt
Jeremy I. Skipper
G. Vigliocco
248
3
0
11 Aug 2023
Methods for Acquiring and Incorporating Knowledge into Stock Price Prediction: A Survey
Liping Wang
Jiawei Li
Lifan Zhao
Zhizhuo Kou
Xiaohan Wang
Xinyi Zhu
Hao Wang
Yanyan Shen
Lei Chen
AIFin
279
11
0
09 Aug 2023
Revisiting Disentanglement and Fusion on Modality and Context in Conversational Multimodal Emotion Recognition
ACM Multimedia (ACM MM), 2023
Bobo Li
Hao Fei
Lizi Liao
Yu Zhao
Chong Teng
Tat-Seng Chua
Donghong Ji
Fei Li
198
57
0
08 Aug 2023
Dual input neural networks for positional sound source localization
EURASIP Journal on Audio, Speech, and Music Processing (EURASIP J. Audio Speech Music Process), 2023
Eric Grinstein
Vincent W. Neo
Patrick A. Naylor
151
6
0
08 Aug 2023
Multimodal machine learning for materials science: composition-structure bimodal learning for experimentally measured properties
Sheng Gong
Shuo Wang
Taishan Zhu
Y. Shao-horn
Jeffrey C. Grossman
127
3
0
04 Aug 2023
Contrastive Conditional Latent Diffusion for Audio-visual Segmentation
IEEE Transactions on Image Processing (IEEE TIP), 2023
Yuxin Mao
Jing Zhang
Mochu Xiang
Yun-Qiu Lv
Dong Li
Yiran Zhong
Yuchao Dai
DiffM
382
41
0
31 Jul 2023
Synaptic Plasticity Models and Bio-Inspired Unsupervised Deep Learning: A Survey
Gabriele Lagani
Fabrizio Falchi
Claudio Gennaro
Giuseppe Amato
AAML
237
9
0
30 Jul 2023
General Purpose Artificial Intelligence Systems (GPAIS): Properties, Definition, Taxonomy, Societal Implications and Responsible Governance
I. Triguero
Daniel Molina
Javier Poyatos
Javier Del Ser
Francisco Herrera
AI4TS
AI4MH
427
6
0
26 Jul 2023
Re-mine, Learn and Reason: Exploring the Cross-modal Semantic Correlations for Language-guided HOI detection
IEEE International Conference on Computer Vision (ICCV), 2023
Yichao Cao
Qingfei Tang
Fengyuan Yang
Xiu Su
Shan You
Xiaobo Lu
Chang Xu
303
29
0
25 Jul 2023
Audio-Enhanced Text-to-Video Retrieval using Text-Conditioned Feature Alignment
IEEE International Conference on Computer Vision (ICCV), 2023
Sarah Ibrahimi
Xiaohang Sun
Pichao Wang
Amanmeet Garg
Ashutosh Sanan
Mohamed Omar
283
33
0
24 Jul 2023
Robust Visual Question Answering: Datasets, Methods, and Future Challenges
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Jie Ma
Pinghui Wang
Dechen Kong
Zewei Wang
Jun Liu
Hongbin Pei
Junzhou Zhao
OOD
333
45
0
21 Jul 2023
MMSD2.0: Towards a Reliable Multi-modal Sarcasm Detection System
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Libo Qin
Shijue Huang
Qiguang Chen
Chenran Cai
Yudi Zhang
Bin Liang
Wanxiang Che
Ruifeng Xu
172
65
0
14 Jul 2023
MaxCorrMGNN: A Multi-Graph Neural Network Framework for Generalized Multimodal Fusion of Medical Data for Outcome Prediction
N. S. D'Souza
Hongzhi Wang
Andrea Giovannini
A. Foncubierta-Rodríguez
Kristen L. Beck
Orest Boyko
Tanveer Syeda-Mahmood
107
6
0
13 Jul 2023
Learning Fine Pinch-Grasp Skills using Tactile Sensing from A Few Real-world Demonstrations
Xiaofeng Mao
Yucheng Xu
Ruoshi Wen
Mohammadreza Kasaei
Wanming Yu
Efi Psomopoulou
Nathan Lepora
Zhibin Li
SSL
226
1
0
10 Jul 2023