1705.09406
Cited By
v1
v2 (latest)
Multimodal Machine Learning: A Survey and Taxonomy
26 May 2017
T. Baltrušaitis
Chaitanya Ahuja
Louis-Philippe Morency
Papers citing
"Multimodal Machine Learning: A Survey and Taxonomy"
50 / 941 papers shown
Understanding the Emergence of Multimodal Representation Alignment
Megan Tjandrasuwita
Chanakya Ekbote
Liu Ziyin
Paul Pu Liang
327
14
0
22 Feb 2025
CrossOver: 3D Scene Cross-Modal Alignment
Computer Vision and Pattern Recognition (CVPR), 2025
S. Sarkar
O. Mikšík
Marc Pollefeys
Daniel Barath
Iro Armeni
3DPC
396
6
0
20 Feb 2025
RealSyn: An Effective and Scalable Multimodal Interleaved Document Transformation Paradigm
Tiancheng Gu
Kaicheng Yang
Chaoyi Zhang
Yin Xie
Xiang An
Ziyong Feng
Dongnan Liu
Weidong Cai
Jiankang Deng
CLIP
VLM
495
5
0
18 Feb 2025
Surgical Scene Understanding in the Era of Foundation AI Models: A Comprehensive Review
Ufaq Khan
Umair Nawaz
A. Qayyum
Shazad Ashraf
Yutong Xie
Muhammad Haris Khan
Muhammad Bilal
Junaid Qadir
470
5
0
16 Feb 2025
Ask in Any Modality: A Comprehensive Survey on Multimodal Retrieval-Augmented Generation
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Mohammad Mahdi Abootorabi
Amirhosein Zobeiri
Mahdi Dehghani
Mohammadali Mohammadkhani
Bardia Mohammadi
Omid Ghahroodi
M. Baghshah
Ehsaneddin Asgari
RALM
695
29
0
12 Feb 2025
"I am bad": Interpreting Stealthy, Universal and Robust Audio Jailbreaks in Audio-Language Models
Isha Gupta
David Khachaturov
Robert D. Mullins
AAML
AuLLM
551
5
0
02 Feb 2025
Reducing Aleatoric and Epistemic Uncertainty through Multi-modal Data Acquisition
Arthur Hoarau
Benjamin Quost
Sébastien Destercke
Willem Waegeman
UQCV
UD
PER
348
3
0
30 Jan 2025
High-dimensional multimodal uncertainty estimation by manifold alignment: Application to 3D right ventricular strain computations
Maxime Di Folco
Gabriel Bernardino
Patrick Clarysse
Nicolas Duchateau
219
1
0
21 Jan 2025
Multi-Modality Collaborative Learning for Sentiment Analysis
Shanmin Wang
Chengguang Liu
Qingshan Liu
168
0
0
21 Jan 2025
Fake Advertisements Detection Using Automated Multimodal Learning: A Case Study for Vietnamese Real Estate Data
Duy Nguyen
Trung Quoc Nguyen
Cuong V Nguyen
299
2
0
18 Jan 2025
Dynamic Multimodal Fusion via Meta-Learning Towards Micro-Video Recommendation
Han Liu
Yinwei Wei
Fan Liu
Wenjie Wang
Liqiang Nie
Tat-Seng Chua
278
34
0
13 Jan 2025
Unity by Diversity: Improved Representation Learning in Multimodal VAEs
Neural Information Processing Systems (NeurIPS), 2024
Thomas M. Sutter
Yang Meng
Andrea Agostini
Daphné Chopard
Norbert Fortin
Julia E. Vogt
Bahbak Shahbaba
Stephan Mandt
SSL
507
8
0
08 Jan 2025
Beyond Text: Implementing Multimodal Large Language Model-Powered Multi-Agent Systems Using a No-Code Platform
Journal of Intelligence and Information Systems (JIIS), 2025
Cheonsu Jeong
443
11
0
01 Jan 2025
Enhancing Visual Representation for Text-based Person Searching
Knowledge-Based Systems (KBS), 2024
Wei Shen
Ming Fang
Yuxia Wang
Jiafeng Xiao
Diping Li
Ningyu Zhang
Ling Xu
Weinan Zhang
286
5
0
31 Dec 2024
Towards Visual Grounding: A Survey
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024
Linhui Xiao
Xiaoshan Yang
X. Lan
Yaowei Wang
Changsheng Xu
ObjD
955
31
0
28 Dec 2024
Unimodal and Multimodal Static Facial Expression Recognition for Virtual Reality Users with EmoHeVRDB
Thorben Ortmann
Qi Wang
Larissa Putzar
201
4
0
15 Dec 2024
ViSymRe: Vision-guided Multimodal Symbolic Regression
Da Li
Junping Yin
Jin Xu
Xinxin Li
Juan Zhang
337
1
0
15 Dec 2024
Grasp What You Want: Embodied Dexterous Grasping System Driven by Your Voice
Junliang Li
Kai Ye
Haolan Kang
Mingxuan Liang
Yuhang Wu
Zhenhua Liu
Huiping Zhuang
Rui Huang
Yongquan Chen
301
2
0
14 Dec 2024
A Review of Human Emotion Synthesis Based on Generative Technology
Fei Ma
Yongqian Li
Yifan Xie
Y. He
Yujiao Shi
...
Z. Liu
Wei Yao
Fuji Ren
Fei Richard Yu
Shiguang Ni
286
7
0
10 Dec 2024
On Moving Object Segmentation from Monocular Video with Transformers
Christian Homeyer
Christoph Schnörr
263
3
0
28 Nov 2024
VLM-HOI: Vision Language Models for Interpretable Human-Object Interaction Analysis
Donggoo Kang
Dasol Jeong
Hyunmin Lee
Sangwoo Park
Hasil Park
Sunkyu Kwon
Yeongjoon Kim
Joonki Paik
MLLM
VLM
336
1
0
27 Nov 2024
Multimodal Integration of Longitudinal Noninvasive Diagnostics for Survival Prediction in Immunotherapy Using Deep Learning
Melda Yeghaian
Zuhir Bodalal
Daan van den Broek
John B A G Haanen
Regina G H Beets-Tan
Stefano Trebeschi
Marcel A J van Gerven
308
2
0
27 Nov 2024
Incomplete Multi-view Multi-label Classification via a Dual-level Contrastive Learning Framework
Bingyan Nie
Wulin Xie
Jiang Long
Xiaohuan Lu
273
1
0
27 Nov 2024
On the ERM Principle in Meta-Learning
Yannay Alon
Steve Hanneke
Shay Moran
Uri Shalit
CLL
LRM
277
2
0
26 Nov 2024
FLEX-CLIP: Feature-Level GEneration Network Enhanced CLIP for X-shot Cross-modal Retrieval
Jingyou Xie
Jiayi Kuang
Zhenzhou Lin
Jiarui Ouyang
Zishuo Zhao
Ying Shen
VLM
CLIP
300
0
0
26 Nov 2024
A Survey of Recent Advances and Challenges in Deep Audio-Visual Correlation Learning
ACM Computing Surveys (ACM CSUR), 2024
Luis Vilaca
Yi Yu
Paula Vinan
472
3
0
24 Nov 2024
Multi-Task Adversarial Variational Autoencoder for Estimating Biological Brain Age with Multimodal Neuroimaging
Muhammad Usman
Azka Rehman
Abdullah Shahid
A. Rehman
Sung-Min Gho
Aleum Lee
Tariq Mahmood Khan
Imran Razzak
180
1
0
15 Nov 2024
Bi-Level Graph Structure Learning for Next POI Recommendation
IEEE Transactions on Knowledge and Data Engineering (TKDE), 2024
Liang Wang
Shu Wu
Qiang Liu
Yinlin Zhu
Xiang Tao
Hao Fei
Shu Wu
234
16
0
02 Nov 2024
Analyzing Multimodal Integration in the Variational Autoencoder from an Information-Theoretic Perspective
Carlotta Langer
Yasmin Kim Georgie
Ilja Porohovoj
Verena V. Hafner
Nihat Ay
DRL
244
2
0
01 Nov 2024
JEMA: A Joint Embedding Framework for Scalable Co-Learning with Multimodal Alignment
Joao Sousa
Roya Darabi
A. A. Sousa
Frank Brueckner
Luís Paulo Reis
Ana Reis
215
2
0
31 Oct 2024
SimpsonsVQA: Enhancing Inquiry-Based Learning with a Tailored Dataset
Ngoc Dung Huynh
Mohamed Reda Bouadjenek
Sunil Aryal
Imran Razzak
Hakim Hacid
233
0
0
30 Oct 2024
Multimodal Structure Preservation Learning
Chang Liu
Jieshi Chen
Lee H. Harrison
A. Dubrawski
175
0
0
29 Oct 2024
Deep Insights into Cognitive Decline: A Survey of Leveraging Non-Intrusive Modalities with Deep Learning Techniques
Applied Soft Computing (Appl. Soft Comput.), 2024
David Ortiz-Perez
Manuel Benavent-Lledo
José García Rodríguez
David Tomás
M. Flores Vizcaya-Moreno
227
3
0
24 Oct 2024
Enhancing Unimodal Latent Representations in Multimodal VAEs through Iterative Amortized Inference
Yuta Oshima
Masahiro Suzuki
Y. Matsuo
203
0
0
15 Oct 2024
Multimodal Physical Activity Forecasting in Free-Living Clinical Settings: Hunting Opportunities for Just-in-Time Interventions
Abdullah Mamun
Krista S. Leonard
Megan E. Petrov
Matthew P. Buman
Hassan Ghasemzadeh
113
2
0
12 Oct 2024
A social context-aware graph-based multimodal attentive learning framework for disaster content classification during emergencies: a benchmark dataset and method
Expert Systems with Applications (ESWA), 2024
Shahid Shafi Dar
Mohammad Zia Ur Rehman
Karan Bais
Mohammed Abdul Haseeb
Nagendra Kumara
213
21
0
11 Oct 2024
Flex-MoE: Modeling Arbitrary Modality Combination via the Flexible Mixture-of-Experts
Neural Information Processing Systems (NeurIPS), 2024
Sukwon Yun
Inyoung Choi
Jie Peng
Yangfan Wu
J. Bao
Qiyiwen Zhang
Jiayi Xin
Qi Long
Tianlong Chen
MoE
302
40
0
10 Oct 2024
STNet: Deep Audio-Visual Fusion Network for Robust Speaker Tracking
IEEE Transactions on Multimedia (IEEE TMM), 2024
Yidi Li
Hong Liu
Bing Yang
345
7
0
08 Oct 2024
Recent Advances of Multimodal Continual Learning: A Comprehensive Survey
Dianzhi Yu
Xinni Zhang
Yankai Chen
Aiwei Liu
Yifei Zhang
Philip S. Yu
Irwin King
VLM
CLL
354
30
0
07 Oct 2024
DAdEE: Unsupervised Domain Adaptation in Early Exit PLMs
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Divya J. Bajpai
M. Hanawal
202
2
0
06 Oct 2024
DocKD: Knowledge Distillation from LLMs for Open-World Document Understanding Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Sungnyun Kim
Haofu Liao
Srikar Appalaraju
Peng Tang
Zhuowen Tu
R. Satzoda
R. Manmatha
Vijay Mahadevan
Stefano Soatto
264
2
0
04 Oct 2024
MMP: Towards Robust Multi-Modal Learning with Masked Modality Projection
Niki Nezakati
Md Kaykobad Reza
Mashhour Solh
M. Salman Asif
395
5
0
03 Oct 2024
From Pixels to Tokens: Byte-Pair Encoding on Quantized Visual Modalities
International Conference on Learning Representations (ICLR), 2024
Wanpeng Zhang
Zilong Xie
Yicheng Feng
Yijiang Li
Xingrun Xing
Sipeng Zheng
Zongqing Lu
MLLM
353
10
0
03 Oct 2024
Anchors Aweigh! Sail for Optimal Unified Multi-Modal Representations
Minoh Jeong
Min Namgung
Luan Tuyen Chau
Yao-Yi Chiang
Alfred Hero
445
3
0
02 Oct 2024
Learning Multimodal Latent Generative Models with Energy-Based Prior
European Conference on Computer Vision (ECCV), 2024
Shiyu Yuan
Jiali Cui
Hanao Li
Tian Han
264
3
0
30 Sep 2024
Harnessing Shared Relations via Multimodal Mixup Contrastive Learning for Multimodal Classification
Raja Kumar
Raghav Singhal
Pranamya Kulkarni
Deval Mehta
Kshitij S. Jadhav
442
3
0
26 Sep 2024
PIFS-Rec: Process-In-Fabric-Switch for Large-Scale Recommendation System Inferences
Micro (MICRO), 2024
Pingyi Huo
Anusha Devulapally
Hasan Al Maruf
Minseo Park
Krishnakumar Nair
Meena Arunachalam
Gulsum Gudukbay Akbulut
M. Kandemir
Vijaykrishnan Narayanan
168
5
0
25 Sep 2024
Video-to-Audio Generation with Fine-grained Temporal Semantics
Yuchen Hu
Yu Gu
Chenxing Li
Rilin Chen
Dong Yu
VGen
DiffM
259
4
0
23 Sep 2024
Measuring Sound Symbolism in Audio-visual Models
Spoken Language Technology Workshop (SLT), 2024
Wei-Cheng Tseng
Yi-Jen Shih
David Harwath
Raymond Mooney
308
2
0
18 Sep 2024
Fusion in Context: A Multimodal Approach to Affective State Recognition
Youssef Mohamed
Séverin Lemaignan
Arzu Guneysu
Patric Jensfelt
Christian Smith
262
2
0
18 Sep 2024