2303.05952
Cited By
Understanding and Constructing Latent Modality Structures in Multi-modal Representation Learning
10 March 2023
Qian Jiang
Changyou Chen
Han Zhao
Liqun Chen
Q. Ping
S. D. Tran
Yi Xu
Belinda Zeng
Trishul M. Chilimbi
Papers citing
"Understanding and Constructing Latent Modality Structures in Multi-modal Representation Learning"
12 / 12 papers shown
Title
Synergy-CLIP: Extending CLIP with Multi-modal Integration for Robust Representation Learning
Sangyeon Cho
Jangyeong Jeon
Mingi Kim
Junyeong Kim
CLIP
VLM
74
0
0
30 Apr 2025
Generative Modeling of Class Probability for Multi-Modal Representation Learning
Jungkyoo Shin
Bumsoo Kim
Eunwoo Kim
48
1
0
21 Mar 2025
Neural-MCRL: Neural Multimodal Contrastive Representation Learning for EEG-based Visual Decoding
Yueyang Li
Zijian Kang
Shengyu Gong
Wenhao Dong
Weiming Zeng
Hongjie Yan
W. Siok
Nizhuan Wang
52
2
0
23 Dec 2024
Transforming Game Play: A Comparative Study of DCQN and DTQN Architectures in Reinforcement Learning
William A. Stigall
43
0
0
14 Oct 2024
Fusion in Context: A Multimodal Approach to Affective State Recognition
Youssef Mohamed
Séverin Lemaignan
Arzu Guneysu
Patric Jensfelt
Christian Smith
16
0
0
18 Sep 2024
Domain-Agnostic Mutual Prompting for Unsupervised Domain Adaptation
Zhekai Du
Xinyao Li
Fengling Li
Ke Lu
Lei Zhu
Jingjing Li
38
15
0
05 Mar 2024
CyCLIP: Cyclic Contrastive Language-Image Pretraining
Shashank Goel
Hritik Bansal
S. Bhatia
Ryan A. Rossi
Vishwa Vinay
Aditya Grover
CLIP
VLM
163
131
0
28 May 2022
VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding
Hu Xu
Gargi Ghosh
Po-Yao (Bernie) Huang
Dmytro Okhonko
Armen Aghajanyan
Florian Metze
Luke Zettlemoyer
Christoph Feichtenhofer
CLIP
VLM
245
554
0
28 Sep 2021
How Much Can CLIP Benefit Vision-and-Language Tasks?
Sheng Shen
Liunian Harold Li
Hao Tan
Mohit Bansal
Anna Rohrbach
Kai-Wei Chang
Z. Yao
Kurt Keutzer
CLIP
VLM
MLLM
182
342
0
13 Jul 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
293
2,875
0
11 Feb 2021
Unifying Vision-and-Language Tasks via Text Generation
Jaemin Cho
Jie Lei
Hao Tan
Mohit Bansal
MLLM
249
518
0
04 Feb 2021
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky
Jia Deng
Hao Su
J. Krause
S. Satheesh
...
A. Karpathy
A. Khosla
Michael S. Bernstein
Alexander C. Berg
Li Fei-Fei
VLM
ObjD
279
39,083
0
01 Sep 2014