Multimodal Machine Learning: A Survey and Taxonomy

26 May 2017

T. Baltrušaitis

Chaitanya Ahuja

Louis-Philippe Morency

Papers citing "Multimodal Machine Learning: A Survey and Taxonomy"

50 / 941 papers shown

Improving Speech Emotion Recognition with Mutual Information Regularized Generative Model Chung-Soo Ahn R. Rana Sunil Sivadas Carlos Busso Jagath Rajapakse 133 0 0 24 Dec 2025
FDRMFL: Multi-modal Federated Feature Extraction Model Based on Information Maximization and Contrastive Learning Haozhe Wu 28 0 0 30 Nov 2025
RecruitView: A Multimodal Dataset for Predicting Personality and Interview Performance for Human Resources Applications Amit Kumar Gupta Farhan Sheth Hammad Shaikh Dheeraj Kumar Angkul Puniya Deepak Panwar Sandeep Chaurasia Priya Mathur 69 0 0 29 Nov 2025
Fusion or Confusion? Assessing the impact of visible-thermal image fusion for automated wildlife detection Camille Dionne-Pierre Samuel Foucher Jérôme Théau Jérôme Lemaître Patrick Charbonneau Maxime Brousseau Mathieu Varin 24 0 0 27 Nov 2025
Advanced Data Collection Techniques in Cloud Security: A Multi-Modal Deep Learning Autoencoder Approach Aamiruddin Syed Mohammed Ilyas Ahmad 28 0 0 26 Nov 2025
GazeProphetV2: Head-Movement-Based Gaze Prediction Enabling Efficient Foveated Rendering on Mobile VR Farhaan Ebadulla Chiraag Mudlpaur Shreya Chaurasia Gaurav BV 64 0 0 25 Nov 2025
Crash-Consistent Checkpointing for AI Training on macOS/APFS Juha Jeon 48 0 0 23 Nov 2025
Social-Media Based Personas Challenge: Hybrid Prediction of Common and Rare User Actions on Bluesky Benjamin White Anastasia Shimorina 84 0 0 21 Nov 2025
Affective Multimodal Agents with Proactive Knowledge Grounding for Emotionally Aligned Marketing Dialogue Lin Yu Xiaofei Han Yifei Kang Chiung-Yi Tseng Danyang Zhang Ziqian Bi Zhimo Han 32 0 0 21 Nov 2025
Multimodal Wireless Foundation Models Ahmed Aboulfotouh Hatem Abou-Zeid 136 0 0 19 Nov 2025
Cross-Modal Consistency-Guided Active Learning for Affective BCI Systems Hyo-Jeong Jang Hye-Bin Shin Kang Yin 117 0 0 19 Nov 2025
Uncertainty-Resilient Multimodal Learning via Consistency-Guided Cross-Modal Transfer Hyo-Jeong Jang 77 0 0 18 Nov 2025
Dual-Pathway Fusion of EHRs and Knowledge Graphs for Predicting Unseen Drug-Drug Interactions Franklin Lee Tengfei Ma 128 0 0 10 Nov 2025
Countering Multi-modal Representation Collapse through Rank-targeted Fusion Seulgi Kim Kiran Kokilepersaud Mohit Prabhushankar Ghassan AlRegib 108 0 0 09 Nov 2025
MULTIBENCH++: A Unified and Comprehensive Multimodal Fusion Benchmarking Across Specialized Domains Leyan Xue Zongbo Han Kecheng Xue Xiaohong Liu Guangyu Wang C. Zhang 120 0 0 09 Nov 2025
QuAnTS: Question Answering on Time Series Felix Divo Maurice Kraus Anh Q. Nguyen Hao Xue Imran Razzak Flora D. Salim Kristian Kersting Devendra Singh Dhami 88 0 0 07 Nov 2025
Caption Injection for Optimization in Generative Search Engine Xiaolu Chen Yong Liao DiffM 116 0 0 06 Nov 2025
Emotion Recognition in Multi-Speaker Conversations through Speaker Identification, Knowledge Distillation, and Hierarchical Fusion Xiao Li Kotaro Funakoshi Manabu Okumura 76 0 0 05 Nov 2025
Integrating Visual and X-Ray Machine Learning Features in the Study of Paintings by Goya Hassan Ugail Ismail Lujain Jaleel 48 0 0 02 Nov 2025
Balanced Multimodal Learning via Mutual Information Rongrong Xie Guido Sanguinetti 100 0 0 02 Nov 2025
Interaction as Intelligence Part II: Asynchronous Human-Agent Rollout for Long-Horizon Task Training Dayuan Fu Yunze Wu Xiaojie Cai Lyumanshan Ye Shijie Xia ... Junfei Wang Qishuo Hua Pengrui Lu Yang Xiao Pengfei Liu 192 0 0 31 Oct 2025
Context Engineering 2.0: The Context of Context Engineering Qishuo Hua Lyumanshan Ye Dayuan Fu Yang Xiao Xiaojie Cai Yunze Wu Jifan Lin Junfei Wang Pengfei Liu 361 2 0 30 Oct 2025
Multimodal Negative Learning Baoquan Gong X. Gao Q. Hu Qinghua Hu Bing Cao 100 1 0 23 Oct 2025
FrogDeepSDM: Improving Frog Counting and Occurrence Prediction Using Multimodal Data and Pseudo-Absence Imputation C. Padubidri Pranesh Velmurugan Andreas Lanitis A. Kamilaris 81 0 0 22 Oct 2025
Multi-modal Co-learning for Earth Observation: Enhancing single-modality models via modality collaboration Francisco Mena Dino Ienco C. Dantas R. Interdonato Andreas Dengel 107 1 0 22 Oct 2025
Rebellious Student: A Complementary Learning Framework for Background Feature Enhancement in Hyperspectral Anomaly Detection WenPing Jin Yuyang Tang Li Zhu Fei-Yu Guo 131 0 0 21 Oct 2025
Towards a Generalizable Fusion Architecture for Multimodal Object Detection Jad Berjawi Yoann Dupas Christophe Cérin 85 0 0 20 Oct 2025
MILES: Modality-Informed Learning Rate Scheduler for Balancing Multimodal Learning Alejandro Guerra-Manzanares Farah E. Shamout 116 0 0 20 Oct 2025
Quantifying Multimodal Imbalance: A GMM-Guided Adaptive Loss for Audio-Visual Learning Zhaocheng Liu Zhiwen Yu Xiaoqing Liu 176 0 0 20 Oct 2025
Contrastive Dimension Reduction: A Systematic Review Sam Hawke Eric Zhang Jiawen Chen Didong Li 148 1 0 13 Oct 2025
MCE: Towards a General Framework for Handling Missing Modalities under Imbalanced Missing Rates Pattern Recognition (Pattern Recogn.), 2025 Binyu Zhao Wei Zhang Zhaonian Zou 141 0 0 12 Oct 2025
A Multimodal Approach to SME Credit Scoring Integrating Transaction and Ownership Networks Sahab Zandi Kamesh Korangi Juan C. Moreno-Paredes María Óskarsdóttir Christophe Mues Cristián Bravo 120 0 0 10 Oct 2025
Lyapunov-Stable Adaptive Control for Multimodal Concept Drift Tianyu Bell Pan Mengdi Zhu Alexa Jordyn Cole Ronald Wilson D. Woodard 120 0 0 09 Oct 2025
Towards Neurocognitive-Inspired Intelligence: From AI's Structural Mimicry to Human-Like Functional Cognition Noorbakhsh Amiri Golilarz Hassan S. Al Khatib Shahram Rahimi 121 0 0 09 Oct 2025
MLLM4TS: Leveraging Vision and Multimodal Language Models for General Time-Series Analysis Qinghua Liu Sam Heshmati Zheda Mai Zubin Abraham John Paparrizos Liu Ren AI4TS 131 1 0 08 Oct 2025
Expressive and Scalable Quantum Fusion for Multimodal Learning T. Nguyen Trong Nghia Hoang Phi Le Nguyen Hai L. Vu Truong Cong Thang 130 0 0 08 Oct 2025
Partial Information Decomposition via Normalizing Flows in Latent Gaussian Distributions Wenyuan Zhao Adithya Balachandran Chao Tian Paul Pu Liang 158 0 0 06 Oct 2025
InstructPLM-mu: 1-Hour Fine-Tuning of ESM2 Beats ESM3 in Protein Mutation Predictions Junde Xu Yapin Shi Lijun Lang Taoyong Cui Z. Zhang Guangyong Chen Jiezhong Qiu Pheng-Ann Heng 163 0 0 03 Oct 2025
SoK: Measuring What Matters for Closed-Loop Security Agents Mudita Khurana Raunak Jain ELM 80 0 0 02 Oct 2025
Beyond Simple Fusion: Adaptive Gated Fusion for Robust Multimodal Sentiment Analysis Han Wu Yanming Sun Yunhe Yang Derek F. Wong 137 0 0 02 Oct 2025
Massively Multimodal Foundation Models: A Framework for Capturing Dependencies with Specialized Mixture-of-Experts Xing Han Hsing-Huan Chung Joydeep Ghosh Paul Liang Suchi Saria MoE 223 0 0 30 Sep 2025
MAESTRO: Adaptive Sparse Attention and Robust Learning for Multimodal Dynamic Time Series Payal Mohapatra Yueyuan Sui Akash Pandey Stephen Xia Qi Zhu AI4TS 78 1 0 29 Sep 2025
InfMasking: Unleashing Synergistic Information by Contrastive Multimodal Interactions Liangjian Wen Qun Dai Jianzhuang Liu Jiangtao Zheng Yong Dai Dongkai Wang Zhao Kang Jun Wang Z. Xu Jiang Duan 226 0 0 28 Sep 2025
PHASE: Physics-Integrated, Heterogeneity-Aware Surrogates for Scientific Simulations Dawei Gao Dali Wang Zhuowei Gu Qinglei Cao Xiao Wang Peter Thornton Dan Ricciuto Yunhe Feng AI4CE 104 0 0 27 Sep 2025
Multi-modal Bayesian Neural Network Surrogates with Conjugate Last-Layer Estimation Ian Taylor Juliane Mueller Julie Bessac 76 0 0 26 Sep 2025
Towards Multimodal Active Learning: Efficient Learning with Limited Paired Data Jiancheng Zhang Yinglun Zhu 180 1 0 25 Sep 2025
Shaping Initial State Prevents Modality Competition in Multi-modal Fusion: A Two-stage Scheduling Framework via Fast Partial Information Decomposition Jiaqi Tang Yinsong Xu Yang Liu Qingchao Chen 127 0 0 25 Sep 2025
IndiSeek learns information-guided disentangled representations Yu Gui Cong Ma Zongming Ma DRL 407 0 0 25 Sep 2025
RAD: Towards Trustworthy Retrieval-Augmented Multi-modal Clinical Diagnosis Haolin Li Tianjie Dai Zhe Chen Siyuan Du Jiangchao Yao Ya Zhang Yanfeng Wang 120 0 0 24 Sep 2025
Single-Branch Network Architectures to Close the Modality Gap in Multimodal Recommendation Christian Ganhor Marta Moscati Anna Hausberger Shah Nawaz Markus Schedl HAI OffRL 120 0 0 23 Sep 2025