2206.06488
Cited By
Multimodal Learning with Transformers: A Survey
13 June 2022
P. Xu
Xiatian Zhu
David A. Clifton
ViT
Papers citing
"Multimodal Learning with Transformers: A Survey"
50 / 268 papers shown
Title
Uncertainty-Weighted Image-Event Multimodal Fusion for Video Anomaly Detection
SungHeon Jeong
Jihong Park
Mohsen Imani
43
0
0
05 May 2025
Synergy-CLIP: Extending CLIP with Multi-modal Integration for Robust Representation Learning
Sangyeon Cho
Jangyeong Jeon
Mingi Kim
Junyeong Kim
CLIP
VLM
74
0
0
30 Apr 2025
A multi-scale vision transformer-based multimodal GeoAI model for mapping Arctic permafrost thaw
Wenwen Li
Chia-Yu Hsu
Sizhe Wang
Zhining Gu
Yili Yang
Brendan M. Rogers
A. Liljedahl
50
0
0
23 Apr 2025
DeepMLF: Multimodal language model with learnable tokens for deep fusion in sentiment analysis
Efthymios Georgiou
V. Katsouros
Yannis Avrithis
Alexandros Potamianos
24
0
0
15 Apr 2025
Audio and Multiscale Visual Cues Driven Cross-modal Transformer for Idling Vehicle Detection
Xiwen Li
Ross T. Whitaker
Tolga Tasdizen
25
0
0
15 Apr 2025
Zeus: Zero-shot LLM Instruction for Union Segmentation in Multimodal Medical Imaging
Siyuan Dai
Kai Ye
Guodong Liu
Haoteng Tang
Liang Zhan
MedIm
19
0
0
09 Apr 2025
Foundation Models for Environmental Science: A Survey of Emerging Frontiers
Runlong Yu
Shengyu Chen
Yiqun Xie
Huaxiu Yao
J. Willard
X. Jia
AI4CE
31
0
0
05 Apr 2025
ZFusion: An Effective Fuser of Camera and 4D Radar for 3D Object Perception in Autonomous Driving
Sheng Yang
Tong Zhan
Shichen Qiao
Jicheng Gong
Qing Yang
Jian Wang
Yanfeng Lu
3DPC
36
0
0
04 Apr 2025
FT-Transformer: Resilient and Reliable Transformer with End-to-End Fault Tolerant Attention
Huangliang Dai
Shixun Wu
Hairui Zhao
Jiajun Huang
Zizhe Jian
Yue Zhu
Haiyang Hu
Zizhong Chen
41
0
0
03 Apr 2025
Beyond Unimodal Boundaries: Generative Recommendation with Multimodal Semantics
Jing Zhu
Mingxuan Ju
Yozen Liu
Danai Koutra
Neil Shah
Tong Zhao
36
0
0
30 Mar 2025
Quantum Complex-Valued Self-Attention Model
Fu Chen
Qinglin Zhao
Li Feng
Longfei Tang
Yangbin Lin
Haitao Huang
MQ
47
0
0
24 Mar 2025
Continual Multimodal Contrastive Learning
Xiaohao Liu
Xiaobo Xia
See-Kiong Ng
Tat-Seng Chua
CLL
54
0
0
19 Mar 2025
Aligning Vision to Language: Text-Free Multimodal Knowledge Graph Construction for Enhanced LLMs Reasoning
Junming Liu
Siyuan Meng
Yanting Gao
Song Mao
Pinlong Cai
Guohang Yan
Yirong Chen
Zilin Bian
Botian Shi
Ding Wang
41
1
0
17 Mar 2025
DecAlign: Hierarchical Cross-Modal Alignment for Decoupled Multimodal Representation Learning
Chengxuan Qian
Shuo Xing
Shawn Li
Yue Zhao
Zhengzhong Tu
46
0
0
14 Mar 2025
Beam Selection in ISAC using Contextual Bandit with Multi-modal Transformer and Transfer Learning
Mohammad Farzanullah
Han Zhang
A. B. Sediq
Ali Afana
Melike Erol-Kantarci
41
0
0
13 Mar 2025
DynCIM: Dynamic Curriculum for Imbalanced Multimodal Learning
Chengxuan Qian
Kai Han
J. Wang
Zhenlong Yuan
Rui Qian
Chongwen Lyu
Jun Chen
39
1
0
09 Mar 2025
Robust Multi-View Learning via Representation Fusion of Sample-Level Attention and Alignment of Simulated Perturbation
Jie Xu
Na Zhao
Gang Niu
Masashi Sugiyama
Xiaofeng Zhu
72
0
0
06 Mar 2025
A Generalist Cross-Domain Molecular Learning Framework for Structure-Based Drug Discovery
Yiheng Zhu
Mingyang Li
Junlong Liu
Kun Fu
J. Wu
Q. Li
Mingze Yin
Jieping Ye
Jian Wu
Z. Wang
58
0
0
06 Mar 2025
Deep Causal Behavioral Policy Learning: Applications to Healthcare
Jonas Knecht
Anna Zink
Jonathan Kolstad
Maya Petersen
CML
75
0
0
05 Mar 2025
A Survey of Foundation Models for Environmental Science
Runlong Yu
Shengyu Chen
Yiqun Xie
X. Jia
AI4CE
48
1
0
05 Mar 2025
Attention Bootstrapping for Multi-Modal Test-Time Adaptation
Yusheng Zhao
Junyu Luo
Xiao Luo
Jinsheng Huang
Jingyang Yuan
Zhiping Xiao
M. Zhang
TTA
85
0
0
04 Mar 2025
Split Adaptation for Pre-trained Vision Transformers
Lixu Wang
Bingqi Shang
Y. Li
Payal Mohapatra
Wei Dong
Xiao-Xu Wang
Qi Zhu
ViT
40
0
0
01 Mar 2025
Multimodal Learning for Just-In-Time Software Defect Prediction in Autonomous Driving Systems
Faisal Mohammad
Duksan Ryu
54
0
0
28 Feb 2025
What are You Looking at? Modality Contribution in Multimodal Medical Deep Learning Methods
Christian Gapp
Elias Tappeiner
M. Welk
Karl Fritscher
Elke Ruth Gizewski
R. Schubert
37
0
0
28 Feb 2025
Integrating Biological and Machine Intelligence: Attention Mechanisms in Brain-Computer Interfaces
J. Wang
Weishan Ye
Jialin He
Li Zhang
G. Huang
Zhuliang Yu
Zhen Liang
70
0
0
26 Feb 2025
Simpler Fast Vision Transformers with a Jumbo CLS Token
A. Fuller
Yousef Yassin
Daniel G. Kyrollos
Evan Shelhamer
James R. Green
67
0
0
24 Feb 2025
GeoAggregator: An Efficient Transformer Model for Geo-Spatial Tabular Data
Rui Deng
Ziqi Li
Mingshu Wang
28
0
0
24 Feb 2025
A Multimodal PDE Foundation Model for Prediction and Scientific Text Descriptions
Elisa Negrini
Yuxuan Liu
Liu Yang
Stanley Osher
Hayden Schaeffer
AI4CE
79
0
0
09 Feb 2025
High-dimensional multimodal uncertainty estimation by manifold alignment: Application to 3D right ventricular strain computations
Maxime Di Folco
Gabriel Bernardino
Patrick Clarysse
Nicolas Duchateau
62
1
0
21 Jan 2025
Towards Visual Grounding: A Survey
Linhui Xiao
Xiaoshan Yang
X. Lan
Yaowei Wang
Changsheng Xu
ObjD
44
3
0
31 Dec 2024
Multimodal Fusion and Coherence Modeling for Video Topic Segmentation
Hai Yu
Chong Deng
Qinglin Zhang
Jiaqing Liu
Qian Chen
Wen Wang
50
0
0
31 Dec 2024
When SAM2 Meets Video Shadow and Mirror Detection
Leiping Jie
VLM
27
1
0
26 Dec 2024
Bag of Tricks for Multimodal AutoML with Image, Text, and Tabular Data
Zhiqiang Tang
Zihan Zhong
Tong He
Gerald Friedland
73
0
0
19 Dec 2024
Deep Learning-Based Noninvasive Screening of Type 2 Diabetes with Chest X-ray Images and Electronic Health Records
Sanjana Gundapaneni
Zhuo Zhi
Miguel R. D. Rodrigues
76
0
0
14 Dec 2024
Explainable and Interpretable Multimodal Large Language Models: A Comprehensive Survey
Yunkai Dang
Kaichen Huang
Jiahao Huo
Yibo Yan
S. Huang
...
Kun Wang
Yong Liu
Jing Shao
Hui Xiong
Xuming Hu
LRM
96
14
0
03 Dec 2024
Graph-to-SFILES: Control structure prediction from process topologies using generative artificial intelligence
Lukas Schulze Balhorn
Kevin Degens
Artur M. Schweidtmann
AI4CE
64
0
0
30 Nov 2024
Multimodal Integration of Longitudinal Noninvasive Diagnostics for Survival Prediction in Immunotherapy Using Deep Learning
Melda Yeghaian
Zuhir Bodalal
Daan van den Broek
John B A G Haanen
Regina G H Beets-Tan
Stefano Trebeschi
Marcel A J van Gerven
63
0
0
27 Nov 2024
FLEX-CLIP: Feature-Level GEneration Network Enhanced CLIP for X-shot Cross-modal Retrieval
Jingyou Xie
Jiayi Kuang
Zhenzhou Lin
Jiarui Ouyang
Zishuo Zhao
Ying Shen
VLM
CLIP
59
0
0
26 Nov 2024
A Survey of Recent Advances and Challenges in Deep Audio-Visual Correlation Learning
Luis Vilaca
Yi Yu
Paula Vinan
68
0
0
24 Nov 2024
Silver medal Solution for Image Matching Challenge 2024
Yian Wang
3DV
3DPC
29
0
0
04 Nov 2024
JEMA: A Joint Embedding Framework for Scalable Co-Learning with Multimodal Alignment
Joao Sousa
Roya Darabi
A. A. Sousa
Frank Brueckner
Luís Paulo Reis
Ana Reis
16
1
0
31 Oct 2024
Enhancing Action Recognition by Leveraging the Hierarchical Structure of Actions and Textual Context
Manuel Benavent-Lledo
David Mulero-Pérez
David Ortiz-Perez
José García Rodríguez
Antonis Argyros
24
0
0
28 Oct 2024
Deep Optimizer States: Towards Scalable Training of Transformer Models Using Interleaved Offloading
Avinash Maurya
Jie Ye
M. Rafique
Franck Cappello
Bogdan Nicolae
16
1
0
26 Oct 2024
Graph Linearization Methods for Reasoning on Graphs with Large Language Models
Christos Xypolopoulos
Guokan Shang
Xiao Fei
Giannis Nikolentzos
Hadi Abdine
Iakovos Evdaimon
Michail Chatzianastasis
Giorgos Stamou
Michalis Vazirgiannis
19
1
0
25 Oct 2024
Deep Insights into Cognitive Decline: A Survey of Leveraging Non-Intrusive Modalities with Deep Learning Techniques
David Ortiz-Perez
Manuel Benavent-Lledo
José García Rodríguez
David Tomás
M. Flores Vizcaya-Moreno
18
0
0
24 Oct 2024
FedBaF: Federated Learning Aggregation Biased by a Foundation Model
Jong-Ik Park
Srinivasa Pranav
J. M. F. Moura
Carlee Joe-Wong
AI4CE
68
2
0
24 Oct 2024
Multi-Modal Transformer and Reinforcement Learning-based Beam Management
Mohammad Ghassemi
Han Zhang
Ali Afana
A. B. Sediq
Melike Erol-Kantarci
OffRL
17
3
0
22 Oct 2024
Breaking Modality Gap in RGBT Tracking: Coupled Knowledge Distillation
Andong Lu
Jiacong Zhao
Chenglong Li
Yun Xiao
B. Luo
52
3
0
15 Oct 2024
Exploring Foundation Models in Remote Sensing Image Change Detection: A Comprehensive Survey
Zihan Yu
Tianxiao Li
Yuxin Zhu
Rongze Pan
30
0
0
10 Oct 2024
Recent Advances of Multimodal Continual Learning: A Comprehensive Survey
Dianzhi Yu
Xinni Zhang
Yankai Chen
Aiwei Liu
Yifei Zhang
Philip S. Yu
Irwin King
VLM
CLL
36
9
0
07 Oct 2024