2209.03430
Cited By
v1
v2 (latest)
Foundations and Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions
ACM Computing Surveys (ACM CSUR), 2022
7 September 2022
Paul Pu Liang
Amir Zadeh
Louis-Philippe Morency
Papers citing
"Foundations and Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions"
50 / 56 papers shown
DashFusion: Dual-stream Alignment with Hierarchical Bottleneck Fusion for Multimodal Sentiment Analysis
IEEE Transactions on Neural Networks and Learning Systems (IEEE TNNLS), 2025
Yuhua Wen
Qifei Li
Yingying Zhou
Yingming Gao
Zhengqi Wen
Jianhua Tao
Ya Li
150
3
0
05 Dec 2025
Exploring Fusion Strategies for Multimodal Vision-Language Systems
Regan Willis
Jason Bakos
115
0
0
26 Nov 2025
Advanced Data Collection Techniques in Cloud Security: A Multi-Modal Deep Learning Autoencoder Approach
Aamiruddin Syed
Mohammed Ilyas Ahmad
90
0
0
26 Nov 2025
Real-Time Inference for Distributed Multimodal Systems under Communication Delay Uncertainty
Victor Croisfelt
João Henrique Inacio de Souza
Shashi Raj Pandey
B. Soret
P. Popovski
219
0
0
20 Nov 2025
When One Modality Sabotages the Others: A Diagnostic Lens on Multimodal Reasoning
Chenyu Zhang
Minsol Kim
Shohreh Ghorbani
Jingyao Wu
Rosalind Picard
Patricia Maes
Paul Pu Liang
168
3
0
04 Nov 2025
FairGRPO: Fair Reinforcement Learning for Equitable Clinical Reasoning
Shiqi Dai
Wei Dai
Jiaee Cheong
Paul Liang
FaML
OffRL
261
0
0
22 Oct 2025
Improving Speech Emotion Recognition with Mutual Information Regularized Generative Model
Chung-Soo Ahn
R. Rana
Sunil Sivadas
Carlos Busso
Jagath Rajapakse
218
0
0
11 Oct 2025
Human Behavior Atlas: Benchmarking Unified Psychological and Social Behavior Understanding
Keane Ong
Wei Dai
Carol Li
Dewei Feng
Hengzhi Li
...
Jiaee Cheong
Rui Mao
G. Mengaldo
Erik Cambria
Paul Pu Liang
215
4
0
06 Oct 2025
Massively Multimodal Foundation Models: A Framework for Capturing Interactions with Specialized Mixture-of-Experts
Xing Han
Hsing-Huan Chung
Joydeep Ghosh
Paul Liang
Suchi Saria
MoE
321
0
0
30 Sep 2025
IndiSeek learns information-guided disentangled representations
Yu Gui
Cong Ma
Zongming Ma
DRL
547
0
0
25 Sep 2025
Towards Multimodal Active Learning: Efficient Learning with Limited Paired Data
Jiancheng Zhang
Yinglun Zhu
239
1
0
25 Sep 2025
M3ET: Efficient Vision-Language Learning for Robotics based on Multimodal Mamba-Enhanced Transformer
Yanxin Zhang
Liang He
Zeyi Kang
Zuheng Ming
Kaixing Zhao
Mamba
177
0
0
22 Sep 2025
Arabic Multimodal Machine Learning: Datasets, Applications, Approaches, and Challenges
Abdelhamid Haouhat
Slimane Bellaouar
A. Nehar
H. Cherroun
Ahmed Abdelali
244
1
0
17 Aug 2025
Multimodal Remote Inference
Keyuan Zhang
Yin Sun
Bo Ji
OffRL
132
1
0
11 Aug 2025
MLLM-based Speech Recognition: When and How is Multimodality Beneficial?
Yiwen Guan
V. Trinh
Vivek Voleti
Jacob Whitehill
286
3
0
25 Jul 2025
IsoNet: Causal Analysis of Multimodal Transformers for Neuromuscular Gesture Classification
Eion Tyacke
Kunal Gupta
Jay Patel
Rui Li
167
0
0
20 Jun 2025
A Survey on Large Language Models for Mathematical Reasoning
Peng-Yuan Wang
Tian-Shuo Liu
Chenyang Wang
Yi-Di Wang
Shu Yan
...
Xu-Hui Liu
Xin-Wei Chen
Jia-Cheng Xu
Ziniu Li
Yang Yu
LRM
377
34
0
10 Jun 2025
PuzzleWorld: A Benchmark for Multimodal, Open-Ended Reasoning in Puzzlehunts
Hengzhi Li
Brendon Jiang
Alexander Naehu
Regan Song
Justin Zhang
...
Steven-Shine Chen
Adithya Balachandran
Wei Dai
Rebecca Chang
Paul Pu Liang
ReLM
LRM
347
2
0
06 Jun 2025
MINT: Multimodal Instruction Tuning with Multimodal Interaction Grouping
Xiaojun Shan
Qi Cao
Xing Han
Haofei Yu
Paul Liang
389
3
0
02 Jun 2025
ICYM2I: The illusion of multimodal informativeness under missingness
Young Sang Choi
Vincent Jeanselme
Pierre Elias
Shalmali Joshi
419
1
0
22 May 2025
Understanding the Capabilities of Molecular Graph Neural Networks in Materials Science Through Multimodal Learning and Physical Context Encoding
Can Polat
Hasan Kurban
Erchin Serpedin
Mustafa Kurban
AI4CE
326
1
0
17 May 2025
Improving Coverage in Combined Prediction Sets with Weighted p-values
Gina Wong
Drew Prinster
Suchi Saria
Rama Chellappa
Anqi Liu
351
0
0
17 May 2025
Robust Understanding of Human-Robot Social Interactions through Multimodal Distillation
Tongfei Bian
Mathieu Chollet
T. Guha
383
1
0
06 May 2025
POET: Supporting Prompting Creativity and Personalization with Automated Expansion of Text-to-Image Generation
ACM Symposium on User Interface Software and Technology (UIST), 2025
Evans Xu Han
Alice Qian Zhang
Haiyi Zhu
Paul Pu Liang
Jane Hsieh
493
6
0
18 Apr 2025
Engineering Artificial Intelligence: Framework, Challenges, and Future Direction
Jay Lee
Hanqi Su
Dai-Yan Ji
Takanobu Minami
AI4CE
524
5
0
03 Apr 2025
Translating Multimodal AI into Real-World Inspection: TEMAI Evaluation Framework and Pathways for Implementation
Hui Yuan
Jinzhi Deng
Haibing Ma
Chi Zhang
Dan Xiao
185
2
0
31 Mar 2025
Multimodal Machine Learning for Real Estate Appraisal: A Comprehensive Survey
Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), 2025
Chenya Huang
Zhidong Li
Fang Chen
Bin Liang
237
0
0
28 Mar 2025
Multi-modal Time Series Analysis: A Tutorial and Survey
Yushan Jiang
Kanghui Ning
Zijie Pan
Xuyang Shen
Jingchao Ni
Wenchao Yu
Anderson Schneider
Haifeng Chen
Yuriy Nevmyvaka
Dongjin Song
AI4TS
987
38
0
17 Mar 2025
DecAlign: Hierarchical Cross-Modal Alignment for Decoupled Multimodal Representation Learning
Chengxuan Qian
Shuo Xing
Shawn Li
Yue Zhao
Zhengzhong Tu
450
16
0
14 Mar 2025
Transforming Traditional Neural Networks into Neuromorphic Quantum-Cognitive Models: A Tutorial with Applications
Milan Maksimovic
Ivan S. Maksymov
359
0
0
10 Mar 2025
CLIMB: Data Foundations for Large Scale Multimodal Clinical Foundation Models
Wei Dai
Peilin Chen
Malinda Lu
Daniel Li
Haowen Wei
Hejie Cui
Paul Pu Liang
LM&MA
411
12
0
09 Mar 2025
DynCIM: Dynamic Curriculum for Imbalanced Multimodal Learning
Chengxuan Qian
Kai Han
Jing Wang
Chongwen Lyu
Rui Qian
Zhenlong Yuan
Zhe Liu
Zhe-Yu Liu
494
18
0
09 Mar 2025
TabulaTime: A Novel Multimodal Deep Learning Framework for Advancing Acute Coronary Syndrome Prediction through Environmental and Clinical Data Integration
Xin Zhang
Liangxiu Han
Stephen White
Saad Hassan
Philip A Kalra
James Ritchie
Carl Diver
Jennie Shorley
434
1
0
24 Feb 2025
MimeQA: Towards Socially-Intelligent Nonverbal Foundation Models
Hengzhi Li
Megan Tjandrasuwita
Yi R. Fung
Armando Solar-Lezama
Paul Pu Liang
601
8
0
23 Feb 2025
Understanding the Emergence of Multimodal Representation Alignment
Megan Tjandrasuwita
Chanakya Ekbote
Liu Ziyin
Paul Pu Liang
431
17
0
22 Feb 2025
Modality Interactive Mixture-of-Experts for Fake News Detection
The Web Conference (WWW), 2025
Yifan Liu
Y. Liu
Hui Yuan
Ruichen Yao
Yang Zhang
Dong Wang
MoE
427
20
0
21 Jan 2025
MM-Path: Multi-modal, Multi-granularity Path Representation Learning -- Extended Version
Knowledge Discovery and Data Mining (KDD), 2024
Ronghui Xu
Hanyin Cheng
Chenjuan Guo
Hongfan Gao
Jiaxi Hu
Sean Bin Yang
Bin Yang
748
16
0
03 Jan 2025
Designing a Robust Radiology Report Generation System
Sonit Singh
MedIm
310
1
0
02 Nov 2024
Progressive Compositionality in Text-to-Image Generative Models
International Conference on Learning Representations (ICLR), 2024
Xu Han
Linghao Jin
Xiaofeng Liu
Paul Pu Liang
CoGe
541
10
0
22 Oct 2024
The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio
Sicong Leng
Yun Xing
Zesen Cheng
Yang Zhou
Hang Zhang
Xin Li
Deli Zhao
Shijian Lu
Chunyan Miao
Lidong Bing
402
34
0
16 Oct 2024
Anchors Aweigh! Sail for Optimal Unified Multi-Modal Representations
Minoh Jeong
Min Namgung
Luan Tuyen Chau
Yao-Yi Chiang
Alfred Hero
531
3
0
02 Oct 2024
Rethinking the Power of Timestamps for Robust Time Series Forecasting: A Global-Local Fusion Perspective
Neural Information Processing Systems (NeurIPS), 2024
Chengsen Wang
Qi Qi
Jingyu Wang
Haifeng Sun
Zirui Zhuang
Jinming Wu
Jianxin Liao
AI4TS
228
32
0
27 Sep 2024
Fusion in Context: A Multimodal Approach to Affective State Recognition
Youssef Mohamed
Séverin Lemaignan
Arzu Guneysu
Patric Jensfelt
Christian Smith
353
2
0
18 Sep 2024
Segment Anything with Multiple Modalities
Aoran Xiao
Weihao Xuan
Heli Qi
Yun Xing
Xiangwei Zhu
Shijian Lu
VLM
362
14
0
17 Aug 2024
End-to-end Semantic-centric Video-based Multimodal Affective Computing
Ronghao Lin
Ying Zeng
Sijie Mai
Haifeng Hu
VGen
358
4
0
14 Aug 2024
IoT-LM: Large Multisensory Language Models for the Internet of Things
Shentong Mo
Russ Salakhutdinov
Louis-Philippe Morency
Paul Pu Liang
MLLM
222
23
0
13 Jul 2024
HEMM: Holistic Evaluation of Multimodal Foundation Models
Paul Pu Liang
Akshay Goindani
Talha Chafekar
Leena Mathur
Haofei Yu
Ruslan Salakhutdinov
Louis-Philippe Morency
438
30
0
03 Jul 2024
RiskLabs: Predicting Financial Risk Using Large Language Model based on Multimodal and Multi-Sources Data
Yun Feng
Zhi Chen
Prashant Kumar
Qingyun Pei
Yangyang Yu
Haohang Li
Fabrizio Dimino
Lorenzo Ausiello
K. P. Subbalakshmi
Papa Momar Ndiaye
234
16
0
11 Apr 2024
Global Contrastive Training for Multimodal Electronic Health Records with Language Supervision
Yingbo Ma
Suraj Kolla
Zhenhong Hu
Dhruv Kaliraman
Victoria Nolan
...
Jeremy A. Balch
Tyler J. Loftus
Parisa Rashidi
A. Bihorac
B. Shickel
AI4TS
252
6
0
10 Apr 2024
Cohort-Individual Cooperative Learning for Multimodal Cancer Survival Analysis
IEEE Transactions on Medical Imaging (IEEE TMI), 2024
Huajun Zhou
Fengtao Zhou
Hao Chen
243
25
0
03 Apr 2024