Understanding and Constructing Latent Modality Structures in Multi-modal
Representation Learning

Understanding and Constructing Latent Modality Structures in Multi-modal Representation Learning

10 March 2023

Trishul M. Chilimbi

Papers citing "Understanding and Constructing Latent Modality Structures in Multi-modal Representation Learning"

12 / 12 papers shown

Title
Synergy-CLIP: Extending CLIP with Multi-modal Integration for Robust Representation Learning Sangyeon Cho Jangyeong Jeon Mingi Kim Junyeong Kim CLIP VLM 74 0 0 30 Apr 2025
Generative Modeling of Class Probability for Multi-Modal Representation Learning Jungkyoo Shin Bumsoo Kim Eunwoo Kim 48 1 0 21 Mar 2025
Neural-MCRL: Neural Multimodal Contrastive Representation Learning for EEG-based Visual Decoding Yueyang Li Zijian Kang Shengyu Gong Wenhao Dong Weiming Zeng Hongjie Yan W. Siok Nizhuan Wang 52 2 0 23 Dec 2024
Transforming Game Play: A Comparative Study of DCQN and DTQN Architectures in Reinforcement Learning William A. Stigall 43 0 0 14 Oct 2024
Fusion in Context: A Multimodal Approach to Affective State Recognition Youssef Mohamed Séverin Lemaignan Arzu Guneysu Patric Jensfelt Christian Smith 16 0 0 18 Sep 2024
Domain-Agnostic Mutual Prompting for Unsupervised Domain Adaptation Zhekai Du Xinyao Li Fengling Li Ke Lu Lei Zhu Jingjing Li 38 15 0 05 Mar 2024
CyCLIP: Cyclic Contrastive Language-Image Pretraining Shashank Goel Hritik Bansal S. Bhatia Ryan A. Rossi Vishwa Vinay Aditya Grover CLIP VLM 163 131 0 28 May 2022
VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding Hu Xu Gargi Ghosh Po-Yao (Bernie) Huang Dmytro Okhonko Armen Aghajanyan Florian Metze Luke Zettlemoyer Florian Metze Luke Zettlemoyer Christoph Feichtenhofer CLIP VLM 245 554 0 28 Sep 2021
How Much Can CLIP Benefit Vision-and-Language Tasks? Sheng Shen Liunian Harold Li Hao Tan Mohit Bansal Anna Rohrbach Kai-Wei Chang Z. Yao Kurt Keutzer CLIP VLM MLLM 182 342 0 13 Jul 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision Chao Jia Yinfei Yang Ye Xia Yi-Ting Chen Zarana Parekh Hieu H. Pham Quoc V. Le Yun-hsuan Sung Zhen Li Tom Duerig VLM CLIP 293 2,875 0 11 Feb 2021
Unifying Vision-and-Language Tasks via Text Generation Jaemin Cho Jie Lei Hao Tan Mohit Bansal MLLM 249 518 0 04 Feb 2021
ImageNet Large Scale Visual Recognition Challenge Olga Russakovsky Jia Deng Hao Su J. Krause S. Satheesh ... A. Karpathy A. Khosla Michael S. Bernstein Alexander C. Berg Li Fei-Fei VLM ObjD 279 39,083 0 01 Sep 2014