2305.14897
Cited By
Text encoders bottleneck compositionality in contrastive vision-language models
24 May 2023
Amita Kamath
Jack Hessel
Kai-Wei Chang
CoGe
CLIP
VLM
Papers citing
"Text encoders bottleneck compositionality in contrastive vision-language models"
6 / 6 papers shown
Title
Multi-Modal Language Models as Text-to-Image Model Evaluators
Jiahui Chen
Candace Ross
Reyhane Askari Hemmat
Koustuv Sinha
Melissa Hall
M. Drozdzal
Adriana Romero-Soriano
EGVM
60
0
0
01 May 2025
VideoComp: Advancing Fine-Grained Compositional and Temporal Alignment in Video-Text Models
Dahun Kim
A. Piergiovanni
Ganesh Mallya
A. Angelova
CoGe
36
0
0
04 Apr 2025
Progressive Prompt Detailing for Improved Alignment in Text-to-Image Generative Models
Ketan Suhaas Saichandran
Xavier Thomas
Prakhar Kaushik
Deepti Ghadiyaram
DiffM
73
0
0
22 Mar 2025
Sensitivity of Generative VLMs to Semantically and Lexically Altered Prompts
Sri Harsha Dumpala
Aman Jaiswal
Chandramouli Shama Sastry
E. Milios
Sageev Oore
Hassan Sajjad
VLM
16
2
0
16 Oct 2024
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
303
11,730
0
04 Mar 2022
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
293
3,683
0
11 Feb 2021