Does CLIP Bind Concepts? Probing Compositionality in Large Image Models

Findings (Findings), 2022
20 December 2022
Martha Lewis
Nihal V. Nayak
Peilin Yu
Qinan Yu
Jack Merullo
Stephen H. Bach
Ellie Pavlick
VLM, OCL, CoGe

Papers citing "Does CLIP Bind Concepts? Probing Compositionality in Large Image Models"

50 / 64 papers shown
SpaceVLM: Sub-Space Modeling of Negation in Vision-Language Models
Sepehr Kazemi Ranjbar
Kumail Alhamoud
Marzyeh Ghassemi
VLM
52
0
0
15 Nov 2025
Towards Fine-Grained Interpretability: Counterfactual Explanations for Misclassification with Saliency Partition
Computer Vision and Pattern Recognition (CVPR), 2025
Lintong Zhang
Kang Yin
Seong-Whan Lee
FAtt
308
0
0
11 Nov 2025
HiMo-CLIP: Modeling Semantic Hierarchy and Monotonicity in Vision-Language Alignment
Ruijia Wu
Ping Chen
Fei Shen
Shaoan Zhao
Qiang Hui
...
Ting Lu
Zhaoxiang Liu
Fang Zhao
Kai Wang
Shiguo Lian
VLM
128
0
0
10 Nov 2025
Referring Expressions as a Lens into Spatial Language Grounding in Vision-Language Models
Akshar Tumu
Varad Shinde
Parisa Kordjamshidi
56
0
0
08 Nov 2025
On the Brittleness of CLIP Text Encoders
Allie Tran
Luca Rossetto
124
0
0
06 Nov 2025
DisCoCLIP: A Distributional Compositional Tensor Network Encoder for Vision-Language Understanding
K. Lo
Hala Hawashin
Mina Abbaszadeh
Tilen Limback-Stokin
Hadi Wazni
M. Sadrzadeh
CLIP, CoGe
149
0
0
25 Sep 2025
ORIC: Benchmarking Object Recognition under Contextual Incongruity in Large Vision-Language Models
Zhaoyang Li
Z. Ling
Yuchen Zhou
Litian Gong
Erdem Bıyık
H. Su
143
0
0
19 Sep 2025
Compositional Concept Generalization with Variational Quantum Circuits
Hala Hawashin
Mina Abbaszadeh
Nicholas Joseph
Beth Pearson
Martha Lewis
Mehrnoosh Sadrzadeh
CoGe
46
0
0
11 Sep 2025
Evaluating Compositional Generalisation in VLMs and Diffusion Models
Beth Pearson
Bilal Boulbarss
Michael Wray
Martha Lewis
DiffM, CoGe
79
1
0
28 Aug 2025
Explaining Similarity in Vision-Language Encoders with Weighted Banzhaf Interactions
Hubert Baniecki
Maximilian Muschalik
Fabian Fumagalli
Barbara Hammer
Eyke Hüllermeier
P. Biecek
FAtt
158
0
0
07 Aug 2025
Common Data Properties Limit Object-Attribute Binding in CLIP
Bijay Gurung
David T. Hoffmann
Thomas Brox
VLM
158
0
0
10 Jul 2025
Scaling can lead to compositional generalization
Florian Redhardt
Yassir Akram
Simon Schug
GNN, CoGe
183
0
0
09 Jul 2025
Visual symbolic mechanisms: Emergent symbol processing in vision language models
Rim Assouel
Declan Campbell
Taylor Webb
114
2
0
18 Jun 2025
Adding simple structure at inference improves Vision-Language Compositionality
Imanol Miranda
Ander Salaberria
Eneko Agirre
Gorka Azkune
CoGe, VLM
176
0
0
11 Jun 2025
GIQ: Benchmarking 3D Geometric Reasoning of Vision Foundation Models with Simulated and Real Polyhedra
Mateusz Michalkiewicz
Anekha Sokhal
Tadeusz Michalkiewicz
Piotr Pawlikowski
Mahsa Baktashmotlagh
Varun Jampani
Guha Balakrishnan
174
0
0
09 Jun 2025
LLMs Can Compensate for Deficiencies in Visual Representations
Sho Takishita
Jay Gala
Abdelrahman Mohamed
Kentaro Inui
Yova Kementchedjhieva
VLM
179
0
0
05 Jun 2025
Behavioural vs. Representational Systematicity in End-to-End Models: An Opinionated Survey
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Ivan Vegner
Sydelle de Souza
Valentin Forch
Martha Lewis
Leonidas A.A. Doumas
182
3
0
04 Jun 2025
Compositional Image-Text Matching and Retrieval by Grounding Entities
Madhukar Reddy Vongala
Saurabh Srivastava
Jana Kosecka
CLIP, CoGe, VLM
170
0
0
04 May 2025
VSC: Visual Search Compositional Text-to-Image Diffusion Model
Do Huu Dat
Nam Hyeonu
Po Yuan Mao
Tae-Hyun Oh
DiffM, CoGe
221
1
0
02 May 2025
Human-like compositional learning of visually-grounded concepts using synthetic environments
Zijun Lin
M Ganesh Kumar
Cheston Tan
OCL, CoGe
324
0
0
09 Apr 2025
Evaluating Compositional Scene Understanding in Multimodal Generative Models
Shuhao Fu
Andrew Jun Lee
Anna Wang
Ida Momennejad
Trevor Bihl
Hongjing Lu
Taylor Webb
CoGe, OCL
256
3
0
29 Mar 2025
Not Only Text: Exploring Compositionality of Visual Representations in Vision-Language Models
Computer Vision and Pattern Recognition (CVPR), 2025
Davide Berasi
Matteo Farina
Goran Frehse
Elisa Ricci
Nicola Strisciuglio
CoGe
229
2
0
21 Mar 2025
Dynamic Relation Inference via Verb Embeddings
Omri Suissa
Muhiim Ali
Ariana Azarbal
Hui Shen
Shekhar Pradhan
315
0
0
17 Mar 2025
On the Limitations of Vision-Language Models in Understanding Image Transforms
Ahmad Mustafa Anis
Hasnain Ali
Saquib Sarfraz
VLM
521
6
0
12 Mar 2025
Is CLIP ideal? No. Can we fix it? Yes!
Raphi Kang
Yue Song
Georgia Gkioxari
Pietro Perona
VLM
272
4
0
10 Mar 2025
Bayesian Fields: Task-driven Open-Set Semantic Gaussian Splatting
Dominic Maggio
Luca Carlone
766
1
0
07 Mar 2025
Temporal Representation Alignment: Successor Features Enable Emergent Compositionality in Robot Instruction Following
Vivek Myers
Bill Chunyuan Zheng
Anca Dragan
Kuan Fang
Sergey Levine
444
5
0
08 Feb 2025
Learning to Reason Iteratively and Parallelly for Complex Visual Reasoning Scenarios
Neural Information Processing Systems (NeurIPS), 2024
Shantanu Jaiswal
Debaditya Roy
Basura Fernando
Cheston Tan
ReLM, LRM
307
4
0
20 Nov 2024
ResiDual Transformer Alignment with Spectral Decomposition
Lorenzo Basile
Valentino Maiorca
Luca Bortolussi
Emanuele Rodolà
Francesco Locatello
461
3
0
31 Oct 2024
Understanding the Limits of Vision Language Models Through the Lens of the Binding Problem
Neural Information Processing Systems (NeurIPS), 2024
Declan Campbell
Sunayana Rane
Tyler Giallanza
Nicolò De Sabbata
Kia Ghods
...
Alexander Ku
Steven M. Frankland
Thomas Griffiths
Jonathan D. Cohen
Taylor W. Webb
345
49
0
31 Oct 2024
A Complexity-Based Theory of Compositionality
Eric Elmoznino
Thomas Jiralerspong
Yoshua Bengio
Guillaume Lajoie
CoGe
610
17
0
18 Oct 2024
Swing-by Dynamics in Concept Learning and Compositional Generalization
International Conference on Learning Representations (ICLR), 2024
Yongyi Yang
Core Francisco Park
Ekdeep Singh Lubana
Maya Okawa
Wei Hu
Hidenori Tanaka
CoGe, DiffM
230
0
0
10 Oct 2024
Do Pre-trained Vision-Language Models Encode Object States?
Kaleb Newman
Shijie Wang
Yuan Zang
David Heffren
Chen Sun
CoGe
184
5
0
16 Sep 2024
Finetuning CLIP to Reason about Pairwise Differences
Dylan Sam
Devin Willmott
João Dias Semedo
J. Zico Kolter
VLM
269
8
0
15 Sep 2024
What happens to diffusion model likelihood when your model is conditional?
Mattias Cross
Anton Ragni
DiffM
209
0
0
10 Sep 2024
No Detail Left Behind: Revisiting Self-Retrieval for Fine-Grained Image Captioning
Manu Gaur
Darshan Singh
Makarand Tapaswi
807
2
0
04 Sep 2024
Relational Composition in Neural Networks: A Survey and Call to Action
Martin Wattenberg
Fernanda Viégas
CoGe
146
15
0
19 Jul 2024
Towards Compositionality in Concept Learning
Adam Stein
Aaditya Naik
Yinjun Wu
Mayur Naik
Eric Wong
CoGe
266
8
0
26 Jun 2024
Improving Interpretability and Robustness for the Detection of AI-Generated Images
T. Gaintseva
Laida Kushnareva
German Magai
Irina Piontkovskaya
Sergey I. Nikolenko
Ziquan Liu
S. Barannikov
Gregory Slabaugh
163
1
0
21 Jun 2024
VELOCITI: Benchmarking Video-Language Compositional Reasoning with Strict Entailment
Darshana Saravanan
Darshan Singh
Varun Gupta
Zeeshan Khan
Vineet Gandhi
Makarand Tapaswi
CoGe
109
2
0
16 Jun 2024
ImageNet3D: Towards General-Purpose Object-Level 3D Understanding
Wufei Ma
Guanning Zeng
Guofeng Zhang
Qihao Liu
Letian Zhang
Adam Kortylewski
Yaoyao Liu
Alan Yuille
VLM, 3DV
178
15
0
13 Jun 2024
When does compositional structure yield compositional generalization? A kernel theory
Samuel Lippl
Kim Stachenfeld
NAI, CoGe
495
13
0
26 May 2024
From Frege to chatGPT: Compositionality in language, cognition, and deep neural networks
Jacob Russin
Sam Whitman McGrath
Danielle J. Williams
AI4CE
413
6
0
24 May 2024
Investigating the Semantic Robustness of CLIP-based Zero-Shot Anomaly Segmentation
Kevin Stangl
Marius Arvinte
Weilin Xu
Cory Cornelius
VLM, UQCV
197
2
0
13 May 2024
A Philosophical Introduction to Language Models - Part II: The Way Forward
Raphael Milliere
Cameron Buckner
LRM
206
24
0
06 May 2024
Improving Concept Alignment in Vision-Language Concept Bottleneck Models
Nithish Muthuchamy Selvaraj
Xiaobao Guo
Bingquan Shen
A. Kong
Alex C. Kot
VLM
246
0
0
03 May 2024
Pre-trained Vision-Language Models Learn Discoverable Visual Concepts
Yuan Zang
Tian Yun
Hao Tan
Trung Bui
Chen Sun
VLM, CoGe
258
14
0
19 Apr 2024
Probing the 3D Awareness of Visual Foundation Models
Mohamed El Banani
Amit Raj
Kevis-Kokitsi Maninis
Abhishek Kar
Yuanzhen Li
Michael Rubinstein
Deqing Sun
Leonidas Guibas
Justin Johnson
Varun Jampani
235
121
0
12 Apr 2024
Language Plays a Pivotal Role in the Object-Attribute Compositional Generalization of CLIP
Reza Abbasi
Mohammad Samiei
M. Rohban
M. Baghshah
VLM, CoGe
157
0
0
27 Mar 2024
If CLIP Could Talk: Understanding Vision-Language Model Representations Through Their Preferred Concept Descriptions
Reza Esfandiarpoor
Cristina Menghini
Stephen H. Bach
CoGe, VLM
262
15
0
25 Mar 2024