2404.12139
Cited By
Omniview-Tuning: Boosting Viewpoint Invariance of Vision-Language Pre-training Models
18 April 2024
Shouwei Ruan
Yinpeng Dong
Hanqing Liu
Yao Huang
Hang Su
Xingxing Wei
VLM
Papers citing "Omniview-Tuning: Boosting Viewpoint Invariance of Vision-Language Pre-training Models"
6 / 6 papers shown
Title
Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models
Christian Schlarmann
Naman D. Singh
Francesco Croce
Matthias Hein
VLM
AAML
39
36
0
19 Feb 2024
EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters
Quan-Sen Sun
Jinsheng Wang
Qiying Yu
Yufeng Cui
Fan Zhang
Xiaosong Zhang
Xinlong Wang
VLM
CLIP
MLLM
83
38
0
06 Feb 2024
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
388
4,010
0
28 Jan 2022
ABO: Dataset and Benchmarks for Real-World 3D Object Understanding
Jasmine Collins
Shubham Goel
Kenan Deng
Achleshwar Luthra
Leon L. Xu
...
T. F. Y. Vicente
T. Dideriksen
H. Arora
M. Guillaumin
Jitendra Malik
146
216
0
12 Oct 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
253
4,735
0
24 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
293
3,683
0
11 Feb 2021