2407.12315
Cited By
ModalChorus: Visual Probing and Alignment of Multi-modal Embeddings via Modal Fusion Map
17 July 2024
Yilin Ye
Shishi Xiao
Xingchen Zeng
Wei Zeng
Papers citing "ModalChorus: Visual Probing and Alignment of Multi-modal Embeddings via Modal Fusion Map"
5 / 5 papers shown
Title
The Contemporary Art of Image Search: Iterative User Intent Expansion via Vision-Language Model
Yilin Ye
Qian Zhu
Shishi Xiao
Kang Zhang
Wei Zeng
28
3
0
04 Dec 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
244
4,186
0
30 Jan 2023
CLIP-PAE: Projection-Augmentation Embedding to Extract Relevant Features for a Disentangled, Interpretable, and Controllable Text-Guided Face Manipulation
Chenliang Zhou
Fangcheng Zhong
Cengiz Öztireli
CLIP
40
19
0
08 Oct 2022
VLP: A Survey on Vision-Language Pre-training
Feilong Chen
Duzhen Zhang
Minglun Han
Xiuyi Chen
Jing Shi
Shuang Xu
Bo Xu
VLM
79
208
0
18 Feb 2022
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
293
3,683
0
11 Feb 2021