1705.09406
Multimodal Machine Learning: A Survey and Taxonomy
26 May 2017
T. Baltrušaitis
Chaitanya Ahuja
Louis-Philippe Morency
Papers citing
"Multimodal Machine Learning: A Survey and Taxonomy"
50 / 941 papers shown
Multimodal Generalized Category Discovery
Yuchang Su
Renping Zhou
Siyu Huang
Xingjian Li
Tianyang Wang
Ziyue Wang
Min Xu
270
6
0
18 Sep 2024
PixelBytes: Catching Unified Representation for Multimodal Generation
Fabien Furfaro
126
1
0
16 Sep 2024
One missing piece in Vision and Language: A Survey on Comics Understanding
Emanuele Vivoli
Andrey Barsky
Mohamed Ali Souibgui
Artemis LLabres
Marco Bertini
Dimosthenis Karatzas
335
7
0
14 Sep 2024
Early Joint Learning of Emotion Information Makes MultiModal Model Understand You Better
Mengying Ge
Mingyang Li
Dongkai Tang
Pengbo Li
Kuo Liu
Shuhao Deng
Songbai Pu
Liu Liu
Yang Song
Tao Zhang
233
7
0
12 Sep 2024
Recent Trends of Multimodal Affective Computing: A Survey from NLP Perspective
Guimin Hu
Yi Xin
Weimin Lyu
Haojian Huang
Chang Sun
Zehan Zhu
Lin Gui
Ruichu Cai
Erik Cambria
Hasti Seifi
368
15
0
11 Sep 2024
What to align in multimodal contrastive learning?
International Conference on Learning Representations (ICLR), 2024
Benoit Dufumier
J. Castillo-Navarro
D. Tuia
Jean-Philippe Thiran
341
30
0
11 Sep 2024
PixelBytes: Catching Unified Embedding for Multimodal Generation
Fabien Furfaro
99
0
0
03 Sep 2024
Subgroup Analysis via Model-based Rule Forest
IEEE International Conference on Information Reuse and Integration (IRI), 2024
I-Ling Cheng
Chan Hsu
Chantung Ku
Pei-Ju Lee
Yihuang Kang
111
0
0
27 Aug 2024
X-Reflect: Cross-Reflection Prompting for Multimodal Recommendation
Hanjia Lyu
Ryan Rossi
Xiang Chen
Md Mehrab Tanjim
Stefano Petrangeli
Somdeb Sarkhel
Jiebo Luo
136
7
0
27 Aug 2024
Has Multimodal Learning Delivered Universal Intelligence in Healthcare? A Comprehensive Survey
Information Fusion (Inf. Fusion), 2024
Qika Lin
Yifan Zhu
Xin Mei
Ling Huang
Jingying Ma
Kai He
Zhen Peng
Xiaoshi Zhong
Mengling Feng
288
61
0
23 Aug 2024
MultiMed: Massively Multimodal and Multitask Medical Understanding
Shentong Mo
Paul Pu Liang
LM&MA
245
6
0
22 Aug 2024
Harnessing Multimodal Large Language Models for Multimodal Sequential Recommendation
AAAI Conference on Artificial Intelligence (AAAI), 2024
Yuyang Ye
Zhi Zheng
Yishan Shen
Tianshu Wang
Hengruo Zhang
Peijun Zhu
Runlong Yu
Kai Zhang
Hui Xiong
433
44
0
19 Aug 2024
A Survey on Integrated Sensing, Communication, and Computation
IEEE Communications Surveys and Tutorials (COMST), 2024
Dingzhu Wen
Yong Zhou
Xiaoyang Li
Yuanming Shi
Kaibin Huang
Khaled B. Letaief
236
122
0
15 Aug 2024
End-to-end Semantic-centric Video-based Multimodal Affective Computing
Ronghao Lin
Ying Zeng
Sijie Mai
Haifeng Hu
VGen
290
2
0
14 Aug 2024
EPAM-Net: An Efficient Pose-driven Attention-guided Multimodal Network for Video Action Recognition
Ahmed Abdelkawy
Asem A. Ali
3DPC
335
4
0
10 Aug 2024
A Systematic Review of Intermediate Fusion in Multimodal Deep Learning for Biomedical Applications
Image and Vision Computing (IVC), 2024
V. Guarrasi
Fatih Aksu
Camillo Maria Caruso
Francesco Di Feola
Aurora Rofena
Filippo Ruffini
Paolo Soda
OffRL
MedIm
AI4CE
188
51
0
02 Aug 2024
HyperMM: Robust Multimodal Learning with Varying-sized Inputs
Hava Chaptoukaev
Vincenzo Marcianó
Francesco Galati
Maria A. Zuluaga
199
1
0
30 Jul 2024
Appformer: A Novel Framework for Mobile App Usage Prediction Leveraging Progressive Multi-Modal Data Fusion and Feature Extraction
Expert Systems with Applications (ESWA), 2024
Chuike Sun
Junzhou Chen
Yue Zhao
Hao Han
Ruihai Jing
Guang Tan
Di Wu
224
4
0
28 Jul 2024
Automated Ensemble Multimodal Machine Learning for Healthcare
F. Imrie
Stefan Denner
Lucas S. Brunschwig
Klaus H. Maier-Hein
M. Schaar
217
11
1
25 Jul 2024
Chameleon: Images Are What You Need For Multimodal Learning Robust To Missing Modalities
Muhammad Irzam Liaqat
Shah Nawaz
Muhammad Zaigham Zaheer
M. S. Saeed
Hassan Sajjad
Tom De Schepper
Karthik Nandakumar
Muhammad Haris Khan
319
1
0
23 Jul 2024
Resource-Efficient Federated Multimodal Learning via Layer-wise and Progressive Training
Ye Lin Tun
Chu Myaet Thwal
Minh N. H. Nguyen
Choong Seon Hong
283
4
0
22 Jul 2024
Benchmark Granularity and Model Robustness for Image-Text Retrieval
Mariya Hendriksen
Shuo Zhang
R. Reinanda
Mohamed Yahya
Edgar Meij
Maarten de Rijke
305
0
0
21 Jul 2024
Text-Augmented Multimodal LLMs for Chemical Reaction Condition Recommendation
Yu Zhang
Ruijie Yu
Kaipeng Zeng
Ding Li
Feng Zhu
Yunbo Wang
Yaohui Jin
Yanyan Xu
176
1
0
21 Jul 2024
Recent Advances in Generative AI and Large Language Models: Current Status, Challenges, and Perspectives
D. Hagos
Rick Battle
Danda B. Rawat
LM&MA
OffRL
490
84
0
20 Jul 2024
Towards Interpretable Visuo-Tactile Predictive Models for Soft Robot Interactions
Enrico Donato
T. G. Thuruthel
Egidio Falotico
180
1
0
16 Jul 2024
IoT-LM: Large Multisensory Language Models for the Internet of Things
Shentong Mo
Russ Salakhutdinov
Louis-Philippe Morency
Paul Pu Liang
MLLM
187
20
0
13 Jul 2024
Diagnosing and Re-learning for Balanced Multimodal Learning
Yake Wei
Siwei Li
Ruoxuan Feng
Di Hu
215
34
0
12 Jul 2024
Specialized curricula for training vision-language models in retinal image analysis
Robbie Holland
Thomas R. P. Taylor
Christopher Holmes
Sophie Riedl
Julia Mai
...
U. Schmidt-Erfurth
Daniel Rueckert
S. Sivaprasad
A. Lotery
Fernando Navarro
VLM
LM&MA
100
0
0
11 Jul 2024
TIP: Tabular-Image Pre-training for Multimodal Classification with Incomplete Data
Siyi Du
Shaoming Zheng
Yinsong Wang
Wenjia Bai
D. O’Regan
Chen Qin
LMTD
260
20
0
10 Jul 2024
Multimodal Chain-of-Thought Reasoning via ChatGPT to Protect Children from Age-Inappropriate Apps
Chuanbo Hu
Bin Liu
Minglei Yin
Yilu Zhou
Xin Li
LRM
117
5
0
08 Jul 2024
Completed Feature Disentanglement Learning for Multimodal MRIs Analysis
Tianling Liu
Hongying Liu
Fanhua Shang
Lequan Yu
Tong Han
Liang Wan
381
3
0
06 Jul 2024
Multimodal Classification via Modal-Aware Interactive Enhancement
Qing-Yuan Jiang
Zhouyang Chi
Yang Yang
227
3
0
05 Jul 2024
Multi-modal Masked Siamese Network Improves Chest X-Ray Representation Learning
Saeed Shurrab
Alejandro Guerra-Manzanares
Farah E. Shamout
242
4
0
05 Jul 2024
Hard-Attention Gates with Gradient Routing for Endoscopic Image Computing
Giorgio Roffo
Carlo Biffi
Pietro Salvagnini
Andrea Cherubini
222
1
0
05 Jul 2024
Optimal thresholds and algorithms for a model of multi-modal learning in high dimensions
Christian Keup
Lenka Zdeborová
363
3
0
03 Jul 2024
Adaptive Modality Balanced Online Knowledge Distillation for Brain-Eye-Computer based Dim Object Detection
Zixing Li
Chao Yan
Zhen Lan
Xiaojia Xiang
Han Zhou
Jun Lai
Dengqing Tang
288
2
0
02 Jul 2024
Multimodal Data Integration for Precision Oncology: Challenges and Future Directions
Huajun Zhou
Fengtao Zhou
Chenyu Zhao
Yingxue Xu
Luyang Luo
Hao Chen
311
18
0
28 Jun 2024
A Survey on Mixture of Experts in Large Language Models
Weilin Cai
Juyong Jiang
Fan Wang
Jing Tang
Sunghun Kim
Jiayi Huang
MoE
477
70
0
26 Jun 2024
Enhancing Scientific Figure Captioning Through Cross-modal Learning
Mateo Alejandro Rojas
Rafael Carranza
193
0
0
24 Jun 2024
DevBench: A multimodal developmental benchmark for language learning
Neural Information Processing Systems (NeurIPS), 2024
A. W. M. Tan
Sunny Yu
Bria Long
Wanjing Anya Ma
Tonya Murray
Rebecca D. Silverman
Jason D. Yeatman
Michael C. Frank
256
11
0
14 Jun 2024
Zoom and Shift are All You Need
Jiahao Qin
192
3
0
13 Jun 2024
Large Language Models Meet Text-Centric Multimodal Sentiment Analysis: A Survey
Hao Yang
Yanyan Zhao
Yang Wu
Shilong Wang
Tian Zheng
Hongbo Zhang
Zongyang Ma
Wanxiang Che
Bing Qin
351
35
0
12 Jun 2024
Labeling Comic Mischief Content in Online Videos with a Multimodal Hierarchical-Cross-Attention Model
Elaheh Baharlouei
Mahsa Shafaei
Yigeng Zhang
Hugo Jair Escalante
Thamar Solorio
217
1
0
12 Jun 2024
Comparative Analysis of Personalized Voice Activity Detection Systems: Assessing Real-World Effectiveness
Satyam Kumar
Sai Srujana Buddi
U. Sarawgi
Vineet Garg
Shivesh Ranjan
Ognjen Rudovic
Ahmed Hussen Abdelaziz
Saurabh N. Adya
204
5
0
12 Jun 2024
Gentle-CLIP: Exploring Aligned Semantic In Low-Quality Multimodal Data With Soft Alignment
Zijia Song
Z. Zang
Yelin Wang
Guozheng Yang
Jiangbin Zheng
Kaicheng Yu
Wanyu Chen
Stan Z. Li
249
0
0
09 Jun 2024
Bayesian Structural Model Updating with Multimodal Variational Autoencoder
Computer Methods in Applied Mechanics and Engineering (CMAME), 2024
Tatsuya Itoi
Kazuho Amishiki
Sangwon Lee
T. Yaoyama
118
12
0
07 Jun 2024
Contextual fusion enhances robustness to image blurring
S. Joshi
Aiswarya Akumalla
S. Haney
Maxim Bazhenov
AAML
118
0
0
07 Jun 2024
AVFF: Audio-Visual Feature Fusion for Video Deepfake Detection
Trevine Oorloff
Surya Koppisetti
Nicolo Bonettini
Divyaraj Solanki
Ben Colman
Yaser Yacoob
Ali Shahriyari
Gaurav Bharaj
328
64
0
05 Jun 2024
Automatic Fused Multimodal Deep Learning for Plant Identification
Alfreds Lapkovskis
Natalia Nefedova
Ali Beikmohammadi
316
1
0
03 Jun 2024
GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer
Ding Jia
Jianyuan Guo
Kai Han
Han Wu
Chao Zhang
Chang Xu
Xinghao Chen
ViT
511
49
0
03 Jun 2024