Multimodal Machine Learning: A Survey and Taxonomy

26 May 2017

T. Baltrušaitis

Chaitanya Ahuja

Louis-Philippe Morency

Papers citing "Multimodal Machine Learning: A Survey and Taxonomy"

50 / 941 papers shown

What can knowledge graph alignment gain with Neuro-Symbolic learning approaches?

P. Cotovio

Ernesto Jiménez-Ruiz

Catia Pesquita

185

11 Oct 2023

IMITATE: Clinical Prior Guided Hierarchical Vision-Language Pre-training

IEEE Transactions on Medical Imaging (TMI), 2023

325

11 Oct 2023

What Makes for Robust Multi-Modal Models in the Face of Missing Modalities?

Hang Zhao

202

10 Oct 2023

Robust Multimodal Learning with Missing Modalities via Parameter-Efficient Adaptation

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023

Md Kaykobad Reza

Ashley Prater-Bennette

M. Salman Asif

308

06 Oct 2023

Stand for Something or Fall for Everything: Predict Misinformation Spread with Stance-Aware Graph Neural Networks

International Conference on Interaction Sciences (ICIS), 2023

166

04 Oct 2023

Modularity in Deep Learning: A Survey

Haozhe Sun

Isabelle Guyon

MoMe

313

02 Oct 2023

GRID: A Platform for General Robot Intelligence Development

271

02 Oct 2023

GeRA: Label-Efficient Geometrically Regularized Alignment

315

01 Oct 2023

MuSe-GNN: Learning Unified Gene Representation From Multimodal Biological Graph Data

Neural Information Processing Systems (NeurIPS), 2023

285

29 Sep 2023

XVO: Generalized Visual Odometry via Cross-Modal Self-Training

IEEE International Conference on Computer Vision (ICCV), 2023

Tohida Rehman

Ronit Mandal

Jimuyang Zhang

Debarshi Kumar Sanyal

SSL

363

28 Sep 2023

Harnessing Diverse Data for Global Disaster Prediction: A Multimodal Framework

Gengyin Liu

Huaiyang Zhong

AI4CE

28 Sep 2023

SeMAnD: Self-Supervised Anomaly Detection in Multimodal Geospatial Datasets

210

26 Sep 2023

Divide and Conquer in Video Anomaly Detection: A Comprehensive Review and New Approach

ACM Cloud and Autonomic Computing Conference (CAC), 2023

Jian Xiao

Tianyuan Liu

G. Ji

290

26 Sep 2023

MultiModN- Multimodal, Multi-Task, Interpretable Modular Networks

Neural Information Processing Systems (NeurIPS), 2023

238

25 Sep 2023

A Survey on Image-text Multimodal Models

Ruifeng Guo

Jingxuan Wei

Linzhuang Sun

Khai-Nguyen Nguyen

Guiyong Chang

Dawei Liu

Sibo Zhang

Zhengbing Yao

Mingjun Xu

Liping Bu

VLM

320

23 Sep 2023

Impact of architecture on robustness and interpretability of multispectral deep neural networks

218

21 Sep 2023

A Theory of Multimodal Learning

Neural Information Processing Systems (NeurIPS), 2023

Zhou Lu

221

21 Sep 2023

Synth-AC: Enhancing Audio Captioning with Synthetic Supervision

168

18 Sep 2023

Bias and Fairness in Chatbots: An Overview

APSIPA Transactions on Signal and Information Processing (TASIP), 2023

314

16 Sep 2023

VulnSense: Efficient Vulnerability Detection in Ethereum Smart Contracts by Multimodal Learning with Graph Neural Network and Language Model

144

15 Sep 2023

One-stage Modality Distillation for Incomplete Multimodal Learning

Shicai Wei

Yang Luo

Chunbo Luo

211

15 Sep 2023

Prompting Segmentation with Sound Is Generalizable Audio-Visual Source Localizer

AAAI Conference on Artificial Intelligence (AAAI), 2023

Xi Li

304

13 Sep 2023

M(otion)-mode Based Prediction of Ejection Fraction using Echocardiograms

166

07 Sep 2023

Enhancing Deep Learning Models through Tensorization: A Comprehensive Survey and Framework

Manal Helal

244

05 Sep 2023

Exchanging-based Multimodal Fusion with Transformer

Xiang Li

188

05 Sep 2023

A Survey on Interpretable Cross-modal Reasoning

400

05 Sep 2023

LoRA-like Calibration for Multimodal Deception Detection using ATSFace Data

BigData Congress [Services Society] (BSS), 2023

Shun-Wen Hsiao

Chengbin Sun

CVBM

04 Sep 2023

End-to-End Learning on Multimodal Knowledge Graphs

168

03 Sep 2023

Towards Contrastive Learning in Music Video Domain

210

01 Sep 2023

Spoken Language Intelligence of Large Language Models for Language Learning

285

28 Aug 2023

TriGait: Aligning and Fusing Skeleton and Silhouette Gait Data via a Tri-Branch Network

274

25 Aug 2023

SkipcrossNets: Adaptive Skip-cross Fusion for Road Detection

Automotive Innovation (AUIN), 2023

Jun Li

172

24 Aug 2023

ALIP: Adaptive Language-Image Pre-training with Synthetic Caption

IEEE International Conference on Computer Vision (ICCV), 2023

Xiang An

223

16 Aug 2023

Boosting Multi-modal Model Performance with Adaptive Gradient Modulation

IEEE International Conference on Computer Vision (ICCV), 2023

244

15 Aug 2023

AIGC In China: Current Developments And Future Outlook

Xiangyu Li

Yuqing Fan

S. Cheng

184

14 Aug 2023

Deep convolutional neural networks for cyclic sensor data

118

14 Aug 2023

Multimodality and Attention Increase Alignment in Natural Language Prediction Between Humans and Computational Models

Quitterie Lacome DEstalenx

Akilles Rechardt

Jeremy I. Skipper

G. Vigliocco

248

11 Aug 2023

Methods for Acquiring and Incorporating Knowledge into Stock Price Prediction: A Survey

279

09 Aug 2023

Revisiting Disentanglement and Fusion on Modality and Context in Conversational Multimodal Emotion Recognition

ACM Multimedia (ACM MM), 2023

Hao Fei

198

08 Aug 2023

Dual input neural networks for positional sound source localization

EURASIP Journal on Audio, Speech, and Music Processing (EURASIP J. Audio Speech Music Process), 2023

Eric Grinstein

Vincent W. Neo

Patrick A. Naylor

151

08 Aug 2023

Multimodal machine learning for materials science: composition-structure bimodal learning for experimentally measured properties

127

04 Aug 2023

Contrastive Conditional Latent Diffusion for Audio-visual Segmentation

IEEE Transactions on Image Processing (IEEE TIP), 2023

382

31 Jul 2023

Synaptic Plasticity Models and Bio-Inspired Unsupervised Deep Learning: A Survey

237

30 Jul 2023

General Purpose Artificial Intelligence Systems (GPAIS): Properties, Definition, Taxonomy, Societal Implications and Responsible Governance

427

26 Jul 2023

Re-mine, Learn and Reason: Exploring the Cross-modal Semantic Correlations for Language-guided HOI detection

IEEE International Conference on Computer Vision (ICCV), 2023

303

25 Jul 2023

Audio-Enhanced Text-to-Video Retrieval using Text-Conditioned Feature Alignment

IEEE International Conference on Computer Vision (ICCV), 2023

283

24 Jul 2023

Robust Visual Question Answering: Datasets, Methods, and Future Challenges

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023

Pinghui Wang

Jun Liu

333

21 Jul 2023

MMSD2.0: Towards a Reliable Multi-modal Sarcasm Detection System

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Ruifeng Xu

172

14 Jul 2023

MaxCorrMGNN: A Multi-Graph Neural Network Framework for Generalized Multimodal Fusion of Medical Data for Outcome Prediction

N. S. D'Souza

Hongzhi Wang

Andrea Giovannini

A. Foncubierta-Rodríguez

Kristen L. Beck

Orest Boyko

Tanveer Syeda-Mahmood

107

13 Jul 2023

Learning Fine Pinch-Grasp Skills using Tactile Sensing from A Few Real-world Demonstrations

226

10 Jul 2023