Foundations and Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions

ACM Computing Surveys (ACM CSUR), 2022
7 September 2022
Paul Pu Liang
Amir Zadeh
Louis-Philippe Morency

Papers citing "Foundations and Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions"

50 / 56 papers shown
DashFusion: Dual-stream Alignment with Hierarchical Bottleneck Fusion for Multimodal Sentiment Analysis
IEEE Transactions on Neural Networks and Learning Systems (IEEE TNNLS), 2025
Yuhua Wen
Qifei Li
Yingying Zhou
Yingming Gao
Zhengqi Wen
Jianhua Tao
Ya Li
150
3
0
05 Dec 2025
Exploring Fusion Strategies for Multimodal Vision-Language Systems
Regan Willis
Jason Bakos
115
0
0
26 Nov 2025
Advanced Data Collection Techniques in Cloud Security: A Multi-Modal Deep Learning Autoencoder Approach
Aamiruddin Syed
Mohammed Ilyas Ahmad
90
0
0
26 Nov 2025
Real-Time Inference for Distributed Multimodal Systems under Communication Delay Uncertainty
Victor Croisfelt
João Henrique Inacio de Souza
Shashi Raj Pandey
B. Soret
P. Popovski
219
0
0
20 Nov 2025
When One Modality Sabotages the Others: A Diagnostic Lens on Multimodal Reasoning
Chenyu Zhang
Minsol Kim
Shohreh Ghorbani
Jingyao Wu
Rosalind Picard
Patricia Maes
Paul Pu Liang
168
3
0
04 Nov 2025
FairGRPO: Fair Reinforcement Learning for Equitable Clinical Reasoning
Shiqi Dai
Wei Dai
Jiaee Cheong
Paul Liang
FaML
OffRL
261
0
0
22 Oct 2025
Improving Speech Emotion Recognition with Mutual Information Regularized Generative Model
Chung-Soo Ahn
R. Rana
Sunil Sivadas
Carlos Busso
Jagath Rajapakse
218
0
0
11 Oct 2025
Human Behavior Atlas: Benchmarking Unified Psychological and Social Behavior Understanding
Keane Ong
Wei Dai
Carol Li
Dewei Feng
Hengzhi Li
...
Jiaee Cheong
Rui Mao
G. Mengaldo
Erik Cambria
Paul Pu Liang
215
4
0
06 Oct 2025
Massively Multimodal Foundation Models: A Framework for Capturing Interactions with Specialized Mixture-of-Experts
Xing Han
Hsing-Huan Chung
Joydeep Ghosh
Paul Liang
Suchi Saria
MoE
321
0
0
30 Sep 2025
IndiSeek learns information-guided disentangled representations
Yu Gui
Cong Ma
Zongming Ma
DRL
547
0
0
25 Sep 2025
Towards Multimodal Active Learning: Efficient Learning with Limited Paired Data
Jiancheng Zhang
Yinglun Zhu
239
1
0
25 Sep 2025
M3ET: Efficient Vision-Language Learning for Robotics based on Multimodal Mamba-Enhanced Transformer
Yanxin Zhang
Liang He
Zeyi Kang
Zuheng Ming
Kaixing Zhao
Mamba
177
0
0
22 Sep 2025
Arabic Multimodal Machine Learning: Datasets, Applications, Approaches, and Challenges
Abdelhamid Haouhat
Slimane Bellaouar
A. Nehar
H. Cherroun
Ahmed Abdelali
244
1
0
17 Aug 2025
Multimodal Remote Inference
Keyuan Zhang
Yin Sun
Bo Ji
OffRL
132
1
0
11 Aug 2025
MLLM-based Speech Recognition: When and How is Multimodality Beneficial?
Yiwen Guan
V. Trinh
Vivek Voleti
Jacob Whitehill
286
3
0
25 Jul 2025
IsoNet: Causal Analysis of Multimodal Transformers for Neuromuscular Gesture Classification
Eion Tyacke
Kunal Gupta
Jay Patel
Rui Li
167
0
0
20 Jun 2025
A Survey on Large Language Models for Mathematical Reasoning
Peng-Yuan Wang
Tian-Shuo Liu
Chenyang Wang
Yi-Di Wang
Shu Yan
...
Xu-Hui Liu
Xin-Wei Chen
Jia-Cheng Xu
Ziniu Li
Yang Yu
LRM
377
34
0
10 Jun 2025
PuzzleWorld: A Benchmark for Multimodal, Open-Ended Reasoning in Puzzlehunts
Hengzhi Li
Brendon Jiang
Alexander Naehu
Regan Song
Justin Zhang
...
Steven-Shine Chen
Adithya Balachandran
Wei Dai
Rebecca Chang
Paul Pu Liang
ReLM
LRM
347
2
0
06 Jun 2025
MINT: Multimodal Instruction Tuning with Multimodal Interaction Grouping
Xiaojun Shan
Qi Cao
Xing Han
Haofei Yu
Paul Liang
389
3
0
02 Jun 2025
ICYM2I: The illusion of multimodal informativeness under missingness
Young Sang Choi
Vincent Jeanselme
Pierre Elias
Shalmali Joshi
419
1
0
22 May 2025
Understanding the Capabilities of Molecular Graph Neural Networks in Materials Science Through Multimodal Learning and Physical Context Encoding
Can Polat
Hasan Kurban
Erchin Serpedin
Mustafa Kurban
AI4CE
326
1
0
17 May 2025
Improving Coverage in Combined Prediction Sets with Weighted p-values
Gina Wong
Drew Prinster
Suchi Saria
Rama Chellappa
Anqi Liu
351
0
0
17 May 2025
Robust Understanding of Human-Robot Social Interactions through Multimodal Distillation
Tongfei Bian
Mathieu Chollet
T. Guha
383
1
0
06 May 2025
POET: Supporting Prompting Creativity and Personalization with Automated Expansion of Text-to-Image Generation
ACM Symposium on User Interface Software and Technology (UIST), 2025
Evans Xu Han
Alice Qian Zhang
Haiyi Zhu
Paul Pu Liang
Jane Hsieh
493
6
0
18 Apr 2025
Engineering Artificial Intelligence: Framework, Challenges, and Future Direction
Jay Lee
Hanqi Su
Dai-Yan Ji
Takanobu Minami
AI4CE
524
5
0
03 Apr 2025
Translating Multimodal AI into Real-World Inspection: TEMAI Evaluation Framework and Pathways for Implementation
Hui Yuan
Jinzhi Deng
Haibing Ma
Chi Zhang
Dan Xiao
185
2
0
31 Mar 2025
Multimodal Machine Learning for Real Estate Appraisal: A Comprehensive Survey
Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), 2025
Chenya Huang
Zhidong Li
Fang Chen
Bin Liang
237
0
0
28 Mar 2025
Multi-modal Time Series Analysis: A Tutorial and Survey
Yushan Jiang
Kanghui Ning
Zijie Pan
Xuyang Shen
Jingchao Ni
Wenchao Yu
Anderson Schneider
Haifeng Chen
Yuriy Nevmyvaka
Dongjin Song
AI4TS
987
38
0
17 Mar 2025
DecAlign: Hierarchical Cross-Modal Alignment for Decoupled Multimodal Representation Learning
Chengxuan Qian
Shuo Xing
Shawn Li
Yue Zhao
Zhengzhong Tu
450
16
0
14 Mar 2025
Transforming Traditional Neural Networks into Neuromorphic Quantum-Cognitive Models: A Tutorial with Applications
Milan Maksimovic
Ivan S. Maksymov
359
0
0
10 Mar 2025
CLIMB: Data Foundations for Large Scale Multimodal Clinical Foundation Models
Wei Dai
Peilin Chen
Malinda Lu
Daniel Li
Haowen Wei
Hejie Cui
Paul Pu Liang
LM&MA
411
12
0
09 Mar 2025
DynCIM: Dynamic Curriculum for Imbalanced Multimodal Learning
Chengxuan Qian
Kai Han
Jing Wang
Chongwen Lyu
Rui Qian
Zhenlong Yuan
Zhe Liu
Zhe-Yu Liu
494
18
0
09 Mar 2025
TabulaTime: A Novel Multimodal Deep Learning Framework for Advancing Acute Coronary Syndrome Prediction through Environmental and Clinical Data Integration
Xin Zhang
Liangxiu Han
Stephen White
Saad Hassan
Philip A Kalra
James Ritchie
Carl Diver
Jennie Shorley
434
1
0
24 Feb 2025
MimeQA: Towards Socially-Intelligent Nonverbal Foundation Models
Hengzhi Li
Megan Tjandrasuwita
Yi R. Fung
Armando Solar-Lezama
Paul Pu Liang
601
8
0
23 Feb 2025
Understanding the Emergence of Multimodal Representation Alignment
Megan Tjandrasuwita
Chanakya Ekbote
Liu Ziyin
Paul Pu Liang
431
17
0
22 Feb 2025
Modality Interactive Mixture-of-Experts for Fake News Detection
The Web Conference (WWW), 2025
Yifan Liu
Y. Liu
Hui Yuan
Ruichen Yao
Yang Zhang
Dong Wang
MoE
427
20
0
21 Jan 2025
MM-Path: Multi-modal, Multi-granularity Path Representation Learning -- Extended Version
Knowledge Discovery and Data Mining (KDD), 2024
Ronghui Xu
Hanyin Cheng
Chenjuan Guo
Hongfan Gao
Jiaxi Hu
Sean Bin Yang
Bin Yang
748
16
0
03 Jan 2025
Designing a Robust Radiology Report Generation System
Sonit Singh
MedIm
310
1
0
02 Nov 2024
Progressive Compositionality in Text-to-Image Generative Models
International Conference on Learning Representations (ICLR), 2024
Xu Han
Linghao Jin
Xiaofeng Liu
Paul Pu Liang
CoGe
541
10
0
22 Oct 2024
The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio
Sicong Leng
Yun Xing
Zesen Cheng
Yang Zhou
Hang Zhang
Xin Li
Deli Zhao
Shijian Lu
Chunyan Miao
Lidong Bing
402
34
0
16 Oct 2024
Anchors Aweigh! Sail for Optimal Unified Multi-Modal Representations
Minoh Jeong
Min Namgung
Luan Tuyen Chau
Yao-Yi Chiang
Alfred Hero
531
3
0
02 Oct 2024
Rethinking the Power of Timestamps for Robust Time Series Forecasting: A Global-Local Fusion Perspective
Neural Information Processing Systems (NeurIPS), 2024
Chengsen Wang
Qi Qi
Jingyu Wang
Haifeng Sun
Zirui Zhuang
Jinming Wu
Jianxin Liao
AI4TS
228
32
0
27 Sep 2024
Fusion in Context: A Multimodal Approach to Affective State Recognition
Youssef Mohamed
Séverin Lemaignan
Arzu Guneysu
Patric Jensfelt
Christian Smith
353
2
0
18 Sep 2024
Segment Anything with Multiple Modalities
Aoran Xiao
Weihao Xuan
Heli Qi
Yun Xing
Xiangwei Zhu
Shijian Lu
VLM
362
14
0
17 Aug 2024
End-to-end Semantic-centric Video-based Multimodal Affective Computing
Ronghao Lin
Ying Zeng
Sijie Mai
Haifeng Hu
VGen
358
4
0
14 Aug 2024
IoT-LM: Large Multisensory Language Models for the Internet of Things
Shentong Mo
Russ Salakhutdinov
Louis-Philippe Morency
Paul Pu Liang
MLLM
222
23
0
13 Jul 2024
HEMM: Holistic Evaluation of Multimodal Foundation Models
Paul Pu Liang
Akshay Goindani
Talha Chafekar
Leena Mathur
Haofei Yu
Ruslan Salakhutdinov
Louis-Philippe Morency
438
30
0
03 Jul 2024
RiskLabs: Predicting Financial Risk Using Large Language Model based on Multimodal and Multi-Sources Data
Yun Feng
Zhi Chen
Prashant Kumar
Qingyun Pei
Yangyang Yu
Haohang Li
Fabrizio Dimino
Lorenzo Ausiello
K. P. Subbalakshmi
Papa Momar Ndiaye
234
16
0
11 Apr 2024
Global Contrastive Training for Multimodal Electronic Health Records with Language Supervision
Yingbo Ma
Suraj Kolla
Zhenhong Hu
Dhruv Kaliraman
Victoria Nolan
...
Jeremy A. Balch
Tyler J. Loftus
Parisa Rashidi
A. Bihorac
B. Shickel
AI4TS
252
6
0
10 Apr 2024
Cohort-Individual Cooperative Learning for Multimodal Cancer Survival Analysis
IEEE Transactions on Medical Imaging (IEEE TMI), 2024
Huajun Zhou
Fengtao Zhou
Hao Chen
243
25
0
03 Apr 2024