2307.06795
Leveraging Vision-Language Foundation Models for Fine-Grained Downstream Tasks
13 July 2023
Denis Coquenet
Clément Rambour
Emanuele Dalsasso
Nicolas Thome
MLLM
CLIP
VLM
Papers citing
"Leveraging Vision-Language Foundation Models for Fine-Grained Downstream Tasks"
3 / 3 papers shown
Title
ImageNet-21K Pretraining for the Masses
T. Ridnik
Emanuel Ben-Baruch
Asaf Noy
Lihi Zelnik-Manor
SSeg
VLM
CLIP
166
676
0
22 Apr 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
253
4,735
0
24 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
293
3,683
0
11 Feb 2021